I have a trained, customised fastText model (fastText is a word embedding library developed by Facebook). I got the expected result with a plain function, but now I want to rewrite it as a custom transformer so I can add it to my sklearn pipeline, which only accepts transformers.
The function takes a name and returns its vector:
def name2vector(name=None):
    vec = [np.array(model.get_word_vector(w)) for w in name.lower().split(' ')]
    name_vec = np.sum(vec, axis=0)  # if "name" is multiple words, sum the vectors
    return name_vec
returned value:
array([-0.01087821,  0.01030535, -0.01402427,  0.0310982 ,  0.08786983,
       -0.00404521, -0.03286128, -0.00842709,  0.03934859, -0.02717219,
        0.01151722, -0.03253938, -0.02435859,  0.03330994, -0.03696496], dtype=float32)
I want the transformer to do the same thing as the function.
From reading the tutorial I know I can use BaseEstimator and TransformerMixin to rewrite it as a transformer, but I'm still stuck on this. Any suggestions would be great, thanks.
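To show the direction I think this should take, here is a minimal sketch (the class name `Name2VecTransformer` is my own, and `StubModel` is just a stand-in so the example runs without my real fastText model, which exposes `get_word_vector`). I'm not sure this is the idiomatic way to do it:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class StubModel:
    """Stand-in for the trained fastText model: returns a fixed-size
    vector per word so the example runs without the real model."""
    def __init__(self, dim=15):
        self.dim = dim

    def get_word_vector(self, word):
        # Deterministic within one process: seed a generator from the word.
        rng = np.random.default_rng(abs(hash(word)) % (2**32))
        return rng.standard_normal(self.dim).astype(np.float32)


class Name2VecTransformer(BaseEstimator, TransformerMixin):
    """Wraps the name2vector logic so it can sit in an sklearn Pipeline.

    Expects X to be an iterable of name strings; transform() returns a
    2-D array with one summed word vector per name.
    """
    def __init__(self, model):
        self.model = model

    def fit(self, X, y=None):
        # Nothing to learn: the embedding model is already trained.
        return self

    def transform(self, X):
        return np.vstack([self._name2vector(name) for name in X])

    def _name2vector(self, name):
        vecs = [self.model.get_word_vector(w) for w in name.lower().split(' ')]
        return np.sum(vecs, axis=0)  # sum vectors for multi-word names
```

I imagine plugging it in like `Pipeline([("name2vec", Name2VecTransformer(model)), ("clf", some_classifier)])` (with `model` being my trained fastText model) — is that the right approach?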