
I am trying to use Elasticsearch client libraries such as pyelasticsearch and elasticsearch, but I cannot find a method that takes a DataFrame and loads its data into Elasticsearch.

I am trying this code:

for i, df in enumerate(csvfile):
    print(i)
    records = df.where(pd.notnull(df), None).T.to_dict()
    list_records = [records[it] for it in records]
    print(list_records)
    try:
        es.bulk_index("call_demo_data1", "tweet", list_records)
    except:
        print("error!, skiping some tweets sorry")
        pass

where csvfile is the DataFrame that holds all of my data, but I am getting the following error:

'str' object has no attribute 'where'

I have applied the recommendation from the comments (see the sketch below).
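Roughly, the fix was to drop the loop: iterating a DataFrame with enumerate() yields the column labels as plain strings, which is why calling .where() failed. A minimal sketch, assuming csvfile is built with pandas.read_csv (the file name here is a placeholder):

import pandas as pd

csvfile = pd.read_csv("tweets.csv")  # placeholder file name; csvfile is my DataFrame

# Convert the whole frame at once instead of iterating over it;
# where(pd.notnull(...), None) turns NaN into None so the records serialize cleanly
records = csvfile.where(pd.notnull(csvfile), None).T.to_dict()
list_records = list(records.values())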

Now that that problem is solved, I am getting the following error while bulk loading.

I am using the above approach to load data into Elasticsearch. I was facing a problem earlier, so here is the link to the question I posted before.

Here is the code I am using now:

records = csvfile.T.to_dict()
list_records = [records[it] for it in records]
#print(list_records)
es.bulk_index("call_demo_data1", "tweet", list_records)

The error I am getting is:

too many values to unpack (expected 2) 

This error occurs during bulk indexing. csvfile in the code above is a DataFrame. I am using the pyelasticsearch library.

This is the error traceback.
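For reference, here is a sketch of the same bulk load using the official elasticsearch client's helpers.bulk instead of pyelasticsearch's bulk_index. The index and doc type names come from my code above; the connection settings and everything else are assumptions, not a tested fix:

from elasticsearch import Elasticsearch, helpers
import pandas as pd

es = Elasticsearch()  # assumes a cluster reachable at localhost:9200

csvfile = pd.read_csv("tweets.csv")  # placeholder; csvfile is the DataFrame from above

# Replace NaN with None so each record serializes cleanly to JSON
records = csvfile.where(pd.notnull(csvfile), None).T.to_dict()

# helpers.bulk expects an iterable of action dicts
actions = [
    {
        "_index": "call_demo_data1",
        "_type": "tweet",
        "_source": doc,
    }
    for doc in records.values()
]

helpers.bulk(es, actions)  # returns (number of successes, list of errors)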

  • I don't see anywhere where you actually make a DataFrame...? The error comes long before trying to push data to Elasticsearch – roganjosh Aug 15 '17 at 19:49
  • @roganjosh `records=df.where(pd.notnull(df), None).T.to_dict()` this line contains `where` – PriyalChaudhari Aug 15 '17 at 19:50
  • I have created a DataFrame named csvfile in my code above, and that DataFrame contains records; when I do `csvfile.head()` it prints data – PriyalChaudhari Aug 15 '17 at 19:52
  • No it doesn't. Unless you explicitly create the DataFrame, calling something `df` will not automatically make it one. `df` in your case is just a string being read from a CSV file. – roganjosh Aug 15 '17 at 19:52
  • OK, so if you have a DataFrame already, why iterate through it like this? – roganjosh Aug 15 '17 at 19:53
  • OK, so what do I need to do in this case? I am stuck with this approach. – PriyalChaudhari Aug 15 '17 at 19:54
  • I am new to Python; I followed this approach from a solution on GitHub: `https://gist.github.com/clemsos/8668698` – PriyalChaudhari Aug 15 '17 at 19:55
  • It would seem that `enumerate()` destroys your DataFrame in some way and you end up back with strings. You cannot call the `where` method anymore. – roganjosh Aug 15 '17 at 19:55
  • I'm guessing here; get rid of the for loop. `records = csvfile.dropna().T.to_dict()` – roganjosh Aug 15 '17 at 19:58
  • I tried your recommendation like this: `records= csvfile.T.to_dict() list_records=[records[it] for it in records] #print(list_records) es.bulk_index("call_demo_data1","tweet",list_records) print ("done in %.3fs"%(time()-t0))` and I am getting the following error in `es.bulk_index`: `too many values to unpack (expected 2)` – PriyalChaudhari Aug 15 '17 at 20:12
  • @roganjosh please review the above error – PriyalChaudhari Aug 15 '17 at 20:13
  • You need to edit all this into the question and provide the full traceback. Perhaps some progress has been made here. There's no way any of us can make sense of that comment. – roganjosh Aug 15 '17 at 20:14
  • That's not the traceback, that's the error. Please include the whole traceback. – roganjosh Aug 15 '17 at 21:07

0 Answers