Hope you are having a nice day!

I am trying to import data from a Python script into an Elasticsearch index. The index must receive data every 10 minutes and accumulate it each time the script finishes a run, without losing the earlier data, so that I can graph all of it in Kibana.

I'm using eland to read data from Elasticsearch, process it with pandas, and send the result back to Elasticsearch with pandas_to_eland (described in the eland documentation). The problem is that the index is not retaining the new incoming data.

Here is what I'm doing:

import time

import eland as ed

# `es` is assumed to be an already-configured elasticsearch.Elasticsearch client

while True:

    # Get data from Elasticsearch
    eland_data = ed.DataFrame(es, "index_name")  # (Elasticsearch client, index)

    # Convert to pandas
    pandas_data = ed.eland_to_pandas(eland_data)

    # =====================================================================
    # Some processing with pandas_data that yields 1 DataFrame row of data
    # =====================================================================

    ed_df = ed.pandas_to_eland(pandas_data,  # Processed data
                               es,  # Elasticsearch client
                               "new_index_data",  # Name of the new index
                               es_if_exists="append",  # Append to the index if it exists
                               es_refresh=True)  # Refresh the index after writing

    time.sleep(600)  # Wait 10 min until the next batch of data

Are there other methods to accumulate data in an Elasticsearch index?

1 Answer

Elasticsearch has a feature called pivot transforms that can continuously aggregate the data of an index, grouped by a unique key. A transform writes its results into a new index and runs entirely within the Elasticsearch cluster. You can create transforms through the API or set them up in the Kibana UI.
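To make that a bit more concrete, here is a minimal sketch of what the body of a continuous pivot transform could look like, built as a plain Python dict. The field names (`host`, `value`, `@timestamp`) and the helper `build_pivot_transform` are hypothetical, since I don't know your mapping; only the index names come from your question. Treat it as a starting point under those assumptions rather than a finished configuration.

```python
import json


def build_pivot_transform(source_index, dest_index, key_field, value_field, time_field):
    """Build the request body for a continuous pivot transform.

    All field names passed in are assumptions about the source mapping.
    """
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        # How often the transform checks the source index for changes.
        "frequency": "10m",
        # "sync" is what makes the transform continuous instead of one-off.
        "sync": {"time": {"field": time_field, "delay": "60s"}},
        "pivot": {
            # One output document per distinct value of key_field.
            "group_by": {key_field: {"terms": {"field": key_field}}},
            # Average of value_field within each group.
            "aggregations": {f"{value_field}_avg": {"avg": {"field": value_field}}},
        },
    }


if __name__ == "__main__":
    body = build_pivot_transform(
        "index_name", "new_index_data", "host", "value", "@timestamp"
    )
    print(json.dumps(body, indent=2))
```

The resulting JSON is what you would send with `PUT _transform/<transform_id>`, followed by `POST _transform/<transform_id>/_start` to start it.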

Is that what you're after?
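If you would rather keep the eland loop, one thing worth checking (this is a guess at the cause, not something I have verified against your data): by default `pandas_to_eland` uses the DataFrame index as the document `_id`. If the processed result is always a single row with the same index label (for example `0`), each run would overwrite the previous document instead of adding a new one. A minimal sketch of a workaround, assuming that is what is happening, is to give every row a run-specific index before appending. The helpers `unique_row_id` and `append_rows` are made up for illustration.

```python
from datetime import datetime, timezone


def unique_row_id(run_time, row_number):
    """Return an ID that differs between runs (at one-second resolution)."""
    return f"{run_time.strftime('%Y%m%dT%H%M%S')}-{row_number}"


def append_rows(es, index_name, pandas_data, run_time=None):
    """Append pandas_data to index_name using run-specific document IDs.

    `es` is assumed to be an elasticsearch.Elasticsearch client and
    `pandas_data` a pandas DataFrame.
    """
    import eland as ed  # imported here so the ID helper works without eland

    run_time = run_time or datetime.now(timezone.utc)
    pandas_data = pandas_data.copy()
    # Replace the index so the derived _id values do not repeat across runs.
    pandas_data.index = [
        unique_row_id(run_time, i) for i in range(len(pandas_data))
    ]
    return ed.pandas_to_eland(
        pandas_data,
        es,
        index_name,
        es_if_exists="append",
        es_refresh=True,
    )
```

Inside your loop you would call `append_rows(es, "new_index_data", pandas_data)` instead of calling `pandas_to_eland` directly. Recent eland versions also accept `use_pandas_index_for_es_ids=False`, which lets Elasticsearch generate the IDs; check that your installed version supports it before relying on it.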

xeraa