Premise: I have already seen this similar question, but it doesn't quite cover what I would like to understand.
Problem
As part of a project to build a scalable and reliable search solution, I am exploring different ways to bulk load data into OpenSearch.
I have researched the OpenSearch Bulk API and AWS Kinesis Firehose as possible options, but I am still unsure which approach would be the best fit.
As I understand it:
- The OpenSearch Bulk API allows indexing multiple documents in a single API call, which can reduce per-request overhead and improve performance when dealing with large volumes of data.
- AWS Kinesis Firehose is fully managed and can handle large volumes of data, buffer and compress it, and then deliver it reliably to an OpenSearch index.
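To make the comparison concrete, here is a rough sketch of how I picture the client side of each option. It only builds the payloads and sends nothing. The function names, the index name `my-index`, and the sample documents are made up for illustration; the 500-record cap reflects my understanding of the per-call limit on Firehose `PutRecordBatch`.

```python
import json


def build_bulk_body(index_name, docs):
    """Serialize (doc_id, doc) pairs into the newline-delimited JSON body for POST /_bulk."""
    lines = []
    for doc_id, doc in docs:
        # Action line: names the target index and document ID.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


def to_firehose_batches(docs, max_records=500):
    """Group documents into lists shaped like the Records argument of PutRecordBatch."""
    # Each record carries one JSON document as bytes.
    records = [{"Data": json.dumps(doc).encode("utf-8")} for _, doc in docs]
    # Split into chunks of at most max_records (assumed per-call limit).
    return [records[i:i + max_records] for i in range(0, len(records), max_records)]


# Hypothetical sample data, only to show the shape of the output.
sample_docs = [("1", {"title": "first"}), ("2", {"title": "second"})]
bulk_body = build_bulk_body("my-index", sample_docs)
firehose_batches = to_firehose_batches(sample_docs)
```

With the Bulk API, I would POST `bulk_body` to `/_bulk` myself (with `Content-Type: application/x-ndjson`) and handle batching, retries, and partial failures in my own code. With Firehose, I would pass each batch as `Records` to `put_record_batch` and leave buffering and delivery to the service.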
Question
Given these observations, what are the specific advantages and trade-offs of using the OpenSearch Bulk API versus AWS Kinesis Firehose with an OpenSearch index as the destination?
Are there any performance and scalability considerations that should be taken into account when choosing between these options? Let's exclude costs from the equation.
Would appreciate any insights or recommendations from the community. Thanks in advance.