I am trying to bulk index documents from a generator using the parallel_bulk helper from elasticsearch.helpers in Python, but the method doesn't seem to do anything. If I use the regular bulk helper, ingestion into Elasticsearch works fine. I looked the problem up and came across this solution: https://discuss.elastic.co/t/helpers-parallel-bulk-in-python-not-working/39498 (it expects the generator to be consumed). I tried it, but it still doesn't work. No error is raised and the iterator is not consumed. This is my code:
    @staticmethod
    def fetch_rows(cursor):
        # Yield rows one at a time until the cursor is exhausted.
        frame = cursor.fetchone()
        while frame is not None:
            yield frame
            frame = cursor.fetchone()

    @staticmethod
    def __generate_field(body):
        """
        Takes an iterable of JSON bodies and yields them one by one
        :param body: iterable of JSON bodies to add to the generator
        :return: item iterator
        """
        for item in body:
            yield item

    def json_for_bulk_body_sql_list(self, body, index_name: str, name_of_docs: str):
        """
        :param body: iterable of documents that will be turned into a generator
        :param index_name: name of the index based on location
        :param name_of_docs: name of the doc type that you want in the index
        :return: generator of structured JSON actions for bulking
        """
        # if not isinstance(body, list):
        #     raise TypeError('Body must be a list')
        if not isinstance(index_name, str):
            raise TypeError('index must be a string')
        structured_json_body = ({
            '_op_type': 'index',
            '_index': index_name,    # index name Twitter
            '_type': name_of_docs,   # type is tweet
            '_id': doc['tweet_id'],  # id of the tweet
            '_source': doc
        } for doc in self.__generate_field(body))
        return structured_json_body

    # Inside another method of the same class:
        json_results = (dict(zip(column_names, row)) for row in self.fetch_rows(cursor))
        actions = self.json_for_bulk_body_sql_list(json_results, index_name=index_, name_of_docs=doc_name)
        for success, info in self.bulk_es_parallel(actions=actions):
            if not success:
                print('Doc failed: '.upper(), info)
            else:
                ingested += 1

I am doing exactly what the example in the linked solution does, but nothing is ingested into Elasticsearch. I can't figure out why, even after stepping through it in the debugger.
Thank you so much!
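For reference, this is how I understand the "generator has to be consumed" point from the linked thread. It is only a minimal sketch that does not touch Elasticsearch: fake_parallel_bulk is a hypothetical stand-in I wrote for illustration, assuming the real helper behaves like a lazy generator that yields (success, info) tuples.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

sent: List[Dict] = []  # records what the stand-in has "indexed" so far


def fake_parallel_bulk(actions: Iterable[Dict]) -> Iterator[Tuple[bool, Dict]]:
    """Hypothetical stand-in for a lazy bulk helper (not the real API)."""
    for action in actions:
        sent.append(action)  # the "indexing" only happens while iterating
        yield True, {"index": {"_id": action["_id"]}}


actions = ({"_id": i, "_source": {"n": i}} for i in range(3))

results = fake_parallel_bulk(actions)  # calling it does no work yet
sent_before = len(sent)

# Consuming the returned generator is what drives the work.
ingested = sum(1 for success, _ in results if success)
sent_after = len(sent)
```

If that assumption is right, a wrapper such as my bulk_es_parallel would also have to return (or yield from) the helper's generator for the for loop above to consume anything.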