I am trying to write a Java method that replicates scikit-learn's FeatureHasher from Python.
https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.FeatureHasher.html
Below is the Python code:
>>> from sklearn.feature_extraction import FeatureHasher
>>> h = FeatureHasher(n_features=10)
>>> D = [{'dog': 1, 'cat':2, 'elephant':4},{'dog': 2, 'run': 5}]
>>> f = h.transform(D)
>>> f.toarray()
array([[ 0.,  0., -4., -1.,  0.,  0.,  0.,  0.,  0.,  2.],
       [ 0.,  0.,  0., -2., -5.,  0.,  0.,  0.,  0.,  0.]])
I am using the Guava library (guava:29.0-jre) to mimic the transformation above. However, after hashing with MurmurHash3, the Java code returns a byte array. My requirement is to create a sparse matrix like the one the Python code produces.
Here is the Java code:
byte[] bytes = Hashing.murmur3_128(16384).hashString("com.xyz.ad.demo", UTF_8).asBytes();
How do I generate a sparse matrix using the Guava library?
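For reference, here is a minimal sketch of the transformation being replicated, written in plain Java so it runs without dependencies. It rests on assumptions about FeatureHasher that are not stated above: that it hashes the UTF-8 bytes of each feature name with 32-bit MurmurHash3 and seed 0, uses the absolute value of the hash modulo n_features as the column index, and negates the value when the hash is negative. The names FeatureHasherSketch, murmur3_32, mixK, hashRow and toDense are hypothetical. With Guava, Hashing.murmur3_32(0).hashString(name, UTF_8).asInt() may be usable in place of the hand-written hash, but that equivalence is not verified here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class FeatureHasherSketch {

    // Per-block scrambling step of MurmurHash3 (x86, 32-bit variant).
    static int mixK(int k) {
        k *= 0xcc9e2d51;
        k = Integer.rotateLeft(k, 15);
        k *= 0x1b873593;
        return k;
    }

    // MurmurHash3 x86 32-bit over raw bytes; the result is a signed 32-bit int.
    static int murmur3_32(byte[] data, int seed) {
        int h = seed;
        int tailStart = data.length & ~3; // offset just past the last full 4-byte block
        for (int i = 0; i < tailStart; i += 4) {
            // Read each block as a little-endian int.
            int k = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            h ^= mixK(k);
            h = Integer.rotateLeft(h, 13) * 5 + 0xe6546b64;
        }
        // Fold the 0 to 3 trailing bytes, again little-endian.
        int k = 0;
        for (int i = data.length - 1; i >= tailStart; i--) {
            k = (k << 8) | (data[i] & 0xff);
        }
        if (data.length > tailStart) {
            h ^= mixK(k);
        }
        // Final avalanche.
        h ^= data.length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Hashes one sample (feature name -> value) into a sparse row (column index -> value).
    // Assumption: seed 0, index = |hash| mod nFeatures, sign taken from the hash.
    static Map<Integer, Double> hashRow(Map<String, Double> sample, int nFeatures) {
        Map<Integer, Double> row = new TreeMap<>();
        for (Map.Entry<String, Double> e : sample.entrySet()) {
            int h = murmur3_32(e.getKey().getBytes(StandardCharsets.UTF_8), 0);
            // Widen to long so that Integer.MIN_VALUE has a well-defined absolute value.
            int index = (int) (Math.abs((long) h) % nFeatures);
            double signed = (h >= 0) ? e.getValue() : -e.getValue();
            row.merge(index, signed, Double::sum); // colliding features are summed
        }
        return row;
    }

    // Expands a sparse row into a dense array, only for comparison with f.toarray().
    static double[] toDense(Map<Integer, Double> row, int nFeatures) {
        double[] dense = new double[nFeatures];
        row.forEach((i, v) -> dense[i] = v);
        return dense;
    }

    public static void main(String[] args) {
        Map<String, Double> first = new LinkedHashMap<>();
        first.put("dog", 1.0);
        first.put("cat", 2.0);
        first.put("elephant", 4.0);
        Map<String, Double> second = new LinkedHashMap<>();
        second.put("dog", 2.0);
        second.put("run", 5.0);

        // Each row can be compared by eye with the Python output above.
        System.out.println(Arrays.toString(toDense(hashRow(first, 10), 10)));
        System.out.println(Arrays.toString(toDense(hashRow(second, 10), 10)));
    }
}
```

Each row is kept as a sorted map from column index to value, which is one simple way to hold sparse data; a list of such rows forms the matrix. Note that murmur3_128 is a different hash function from the 32-bit variant, so its bytes would not be expected to yield the same column indices as the Python output.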