I have a MySQL table with over 100 million rows. It is a production table from which a lot of read requests are served. I need to fetch, let's say, a million rows from this table, process the rows in a Node.js script, and then store the data in Elasticsearch.
I have done this a lot with MongoDB without facing any issues: I start a read stream, pause it after reading every 1,000 documents, and once the downstream processing for that batch of 1,000 is complete, I resume the stream, repeating until all the rows are processed. I have not hit any performance problems with MongoDB, since the find query returns a cursor that fetches results in batches, so no matter how big my input is, it does not cause any issues.
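To make that concrete, here is a minimal sketch of what I do today with MongoDB (assuming the official `mongodb` Node.js driver; `processBatch()` is a hypothetical helper that bulk-indexes an array of documents into Elasticsearch):

```js
// Sketch of the pause/resume pattern I currently use with MongoDB.
// Assumes the official `mongodb` driver; processBatch() is a hypothetical
// helper that bulk-indexes a batch of documents into Elasticsearch.
const { MongoClient } = require('mongodb');

async function exportCollection() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const stream = client.db('mydb').collection('big_collection').find({}).stream();

  let batch = [];
  stream.on('data', (doc) => {
    batch.push(doc);
    if (batch.length >= 1000) {
      stream.pause();                    // stop reading from the cursor
      processBatch(batch).then(() => {   // push this batch downstream
        batch = [];
        stream.resume();                 // continue with the next chunk
      });
    }
  });

  stream.on('end', async () => {
    if (batch.length > 0) await processBatch(batch);  // flush the tail
    await client.close();
  });
}
```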
Now, I don't know how streaming queries in MySQL work under the hood. Will my approach work for MySQL, or will I have to execute the query again and again, i.e. a first SELECT that fetches the rows where id < 1000, the next one fetching the rows where id is between 1000 and 2000, and so on (roughly like the sketch at the end of this post)? Has anybody worked on a similar problem before? I found a similar question on Stack Overflow, but it had no answer.
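For reference, the "query again and again" alternative I am trying to avoid would look roughly like this: keyset pagination on the primary key (a sketch assuming an auto-increment `id` column, the `mysql2` promise API, and the same hypothetical `processBatch()` helper as above):

```js
// Sketch of the batched alternative: keyset pagination on the primary key.
// Assumes an auto-increment `id` column and the `mysql2` promise API;
// processBatch() is the same hypothetical Elasticsearch bulk-index helper.
const mysql = require('mysql2/promise');

async function exportTable() {
  const conn = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    database: 'mydb',
  });

  let lastId = 0;
  while (true) {
    // Fetch the next chunk after the last id we have seen.
    const [rows] = await conn.execute(
      'SELECT * FROM big_table WHERE id > ? ORDER BY id LIMIT 1000',
      [lastId]
    );
    if (rows.length === 0) break;        // no more rows to fetch
    await processBatch(rows);            // push this batch downstream
    lastId = rows[rows.length - 1].id;   // remember where we stopped
  }

  await conn.end();
}
```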