I have a word list like ['like', 'Python']
and I want to load the pre-trained GloVe word vectors for just these words. The GloVe file is very large, though. Is there a fast way to do it?
What I tried
I iterate over every line of the file, check whether its word is in my list, and if so collect that line in a list, which I then convert to a DataFrame. This works, but it is slow.
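import pandas as pd

def readWordEmbeddingVector(Wrd):
    # Collect the rows whose first token (the word) is in Wrd
    words = []
    with open('glove.twitter.27B/glove.twitter.27B.200d.txt', 'r', encoding='utf-8') as f:
        for line in f:
            vector = line.split()
            if vector and vector[0] in Wrd:
                words.append(vector)
                # Drop the word once found (note: this mutates the caller's list)
                Wrd.remove(vector[0])
    # Index by word and convert the remaining columns to floats
    words_vector = pd.DataFrame(words).set_index(0).astype('float')
    return words_vector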
I also tried the following, but it loads the whole file instead of only the vectors I need:
gensim.models.keyedvectors.KeyedVectors.load_word2vec_format('word2vec.twitter.27B.200d.txt')
What I want
A method like gensim.models.keyedvectors.KeyedVectors.load_word2vec_format, but one that lets me pass a list of words to load.
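
To make the request concrete, here is a rough sketch of the kind of interface I have in mind. The name load_selected_vectors and its parameters are hypothetical (this is not a gensim function), and it assumes the usual GloVe text layout of one "word v1 v2 ... vn" entry per line, separated by single spaces. It only uses a set for lookups, parses the numbers on a hit, and stops once every word has been found, so it may still be slow when a requested word is near the end of the file or missing.

```python
from typing import Dict, Iterable, List


def load_selected_vectors(path: str, wanted: Iterable[str],
                          encoding: str = "utf-8") -> Dict[str, List[float]]:
    """Read only the vectors for `wanted` from a GloVe-style text file.

    Hypothetical helper, not part of gensim. Assumes one
    "word v1 v2 ... vn" entry per line, separated by single spaces.
    """
    remaining = set(wanted)  # set membership tests are O(1) on average
    vectors: Dict[str, List[float]] = {}
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            if not remaining:  # every requested word was found, stop reading
                break
            # Split off only the first token; parse the numbers only on a hit
            word, sep, rest = line.rstrip("\n").partition(" ")
            if sep and word in remaining:
                vectors[word] = [float(x) for x in rest.split()]
                remaining.discard(word)
    return vectors
```

Called as load_selected_vectors('glove.twitter.27B/glove.twitter.27B.200d.txt', ['like', 'Python']), this would return a dict mapping each found word to its list of floats; pd.DataFrame.from_dict(vectors, orient='index') should give a DataFrame indexed by word, like the one my function above returns.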