Guava Hash 裡面有幾個重要的類別:Hashing, HashFunction, Hasher, HashCode, Funnel, PrimitiveSink
HashFunction
HashFunction 是 stateless function,可將任意的 data 轉換為 fixed number of bits,相同的 input 會得到相同的 output 結果,不同的 input 得到不同的唯一的結果
可產生 Hasher 物件,也可以根據條件回傳 HashCode 結果
@Test
public void hashFunction() {
HashFunction hf = Hashing.md5();
HashCode hc = hf.newHasher()
.putLong(10)
.putString("abc", Charsets.UTF_8)
.hash();
System.out.println("hashFunction="+hc.toString());
}
Hasher
HashFunction 可以使用 stateful Hasher,Hasher 提供 fluent syntax,可將資料填充到 Hash 裡面,並得到 hash value。Hasher 可接受 primitive input, byte array, slices of byte arrays, character sequences, 任意物件產生的 Funnel。
資料透過 putXXX() method 放到 Hasher 的資料源中,有 putByte, putBytes, putShort, putInt, putLong, putFloat, putdouble, putBoolean, putChar, putEnencodedChars, putString, putObject
Funnel
如果有一個類別
class Person {
final int id;
final String firstName;
final String lastName;
final int birthYear;
}
其 Funnel 會是
Funnel<Person> personFunnel = new Funnel<Person>() {
@Override
public void funnel(Person person, PrimitiveSink into) {
into
.putInt(person.id)
.putString(person.firstName, Charsets.UTF_8)
.putString(person.lastName, Charsets.UTF_8)
.putInt(birthYear);
}
};
組合起來的code
@Test
public void objectToHashCode() {
// 需要hash的对象
Person person = new Person(27, "John", "Wink", 1990);
Funnel<Person> personFunnel = new Funnel<Person>() {
@Override
public void funnel(Person person, PrimitiveSink into) {
into
.putInt(person.id)
.putString(person.firstName, Charsets.UTF_8)
.putString(person.lastName, Charsets.UTF_8)
.putInt(person.birthYear);
}
};
// md5算法
HashFunction hf = Hashing.md5();
HashCode hc = hf.newHasher()
.putString("abc", Charsets.UTF_8)
.putObject(person, personFunnel)
.hash();
System.out.println("objectToHashCode="+hc.toString());
}
class Person {
int id;
String firstName;
String lastName;
int birthYear;
public Person(int id, String firstName, String lastName, int birthYear) {
this.id = id;
this.firstName = firstName;
this.lastName = lastName;
this.birthYear = birthYear;
}
}
HashCode
當 Hasher 給了所有 input 資料,他的 hash() method 會產生 HashCode 結果
HashCode 支援 euqality testing 例如 asInt(), asLong, asBytes()
另外有 writeBytesTo(array, offset, maxLength) 可將 hash 的 maxLength bytes 寫入 array
BloomFilter
可偵測物件是否 "definitely" 不在 filter 裡面,或是有可能已經被加入 Bloom filter
對於 hash 來說,就是偵測是否 hash value 有碰撞,不同的 input 得到相同的結果
BloomFilter<Person> friends = BloomFilter.create(personFunnel, 500, 0.01);
for (Person friend : friendsList) {
friends.put(friend);
}
// much later
if (friends.mightContain(someone)) {
// the probability that someone reached this place if they aren't a friend is
// 1% we might, for example, start asynchronously loading things for someone
// while we do a more expensive exact check
}
Hashing
Hashing (Guava: Google Core Libraries for Java HEAD-jre-SNAPSHOT API)
裡面有 static method,可取得某些演算法的 HashFunction
adler32()
crc32(), crc32c()
farmHashFingerprint64()
fingerprint2011()
goodFastHash(int minimumBits)
hmacMd5
hmacSha1
hmacSha256
hmacSha512
md5()
murmur3_128(), murmur3_32()
sha1(), sha256(), sha384(), sha512()
sipHash24
@Test
public void hashing() {
// 測試 Hasing 演算法
String input = "hello, world";
// MD5
System.out.println("md5="+Hashing.md5().hashBytes(input.getBytes()).toString());
// sha256
System.out.println("sha256="+Hashing.sha256().hashBytes(input.getBytes()).toString());
// sha512
System.out.println("sha512="+Hashing.sha512().hashBytes(input.getBytes()).toString());
// crc32
System.out.println("crc32="+Hashing.crc32().hashBytes(input.getBytes()).toString());
// MD5
System.out.println("MD5="+Hashing.md5().hashUnencodedChars(input).toString());
}
沒有留言:
張貼留言