
I'm trying to encode word vectors using GloVe and I get the error stated above. The data consists of two text columns, and the goal is to determine sentence similarity. Can you please help me solve this error?


import numpy as np

# Build a dict mapping each word to its 300-d GloVe vector
embeddings_index = {}
f = open(r'C:\Users\15084\Downloads\glove.840B.300d\glove.840B.300d.txt', errors='ignore', encoding='utf-8')
for line in f:
    values = line.split()
    word = values[0]                                 # first token is the word
    coefs = np.asarray(values[1:], dtype='float32')  # remaining tokens are the vector
    embeddings_index[word] = coefs
f.close()

print('Found %s word vectors.' % len(embeddings_index))
3 Answers


Use this code to load your embedding index:

import pickle

with open('glove_vectors', 'rb') as f:
    model = pickle.load(f)           # dict mapping each word to its GloVe vector
    glove_words = set(model.keys())  # vocabulary covered by the embedding

Here, your embedding index is the model itself.
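For example, once model and glove_words are loaded, a minimal sketch of using them for the sentence-similarity task could look like the following. This assumes the 'glove_vectors' pickle maps each word to a 300-d NumPy array; sentence_vector and cosine_similarity are hypothetical helpers, not part of the answer above:

import numpy as np

def sentence_vector(sentence, model, glove_words, dim=300):
    # Average the vectors of the words that exist in the GloVe vocabulary
    vectors = [model[w] for w in sentence.lower().split() if w in glove_words]
    if not vectors:
        return np.zeros(dim)  # fall back to a zero vector if no word is covered
    return np.mean(vectors, axis=0)

def cosine_similarity(a, b):
    # Cosine similarity between two sentence vectors
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

# Usage: compare a pair of sentences from the two text columns
sim = cosine_similarity(sentence_vector("a quick test", model, glove_words),
                        sentence_vector("a fast test", model, glove_words))
print(sim)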


Try the following code; it will resolve the above issue:

def process_glove_line(line, dim):
    word = None
    embedding = None

    try:
        splitLine = line.split()
        # glove.840B.300d contains multi-token "words", so treat everything
        # except the last `dim` fields as the word
        word = " ".join(splitLine[:len(splitLine) - dim])
        embedding = np.array([float(val) for val in splitLine[-dim:]])
    except Exception:
        print(line)  # report lines that still fail to parse

    return word, embedding

def load_glove_model(glove_filepath, dim):
    with open(glove_filepath, encoding="utf8") as f:
        content = f.readlines()
        model = {}
        for line in content:
            word, embedding = process_glove_line(line, dim)
            if embedding is not None:
                model[word] = embedding
        return model

embeddings_index = load_glove_model("glove.840B.300d.txt", 300)
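To see why this resolves the error, here is a quick sanity check on a made-up 5-dimensional line whose token contains a space. The values are illustrative only, not real GloVe entries; a plain values[0] / values[1:] split would fail on lines like this:

import numpy as np

sample_line = ". . 0.1 0.2 0.3 0.4 0.5"
word, embedding = process_glove_line(sample_line, dim=5)
print(word)              # ". ."
print(embedding.shape)   # (5,)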

I think this will help you:

f = open(r'C:\Users\15084\Downloads\glove.840B.300d\glove.840B.300d.txt', 'r', errors='ignore', encoding='utf-8')
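Note that the 'r' mode argument has to come before the keyword arguments. A minimal sketch of the full loading loop using this corrected call, assuming the underlying error is a float-conversion failure on lines whose first token contains spaces (such lines are simply skipped here), could be:

import numpy as np

embeddings_index = {}
f = open(r'C:\Users\15084\Downloads\glove.840B.300d\glove.840B.300d.txt', 'r',
         errors='ignore', encoding='utf-8')
for line in f:
    values = line.split()
    try:
        # Lines whose word contains spaces fail float conversion and are skipped
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[values[0]] = coefs
    except ValueError:
        continue
f.close()

print('Found %s word vectors.' % len(embeddings_index))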