I have a case where I receive a large CSV from an external source as a ReadStream, and I need to consume that ReadStream in two places:
- Upload to S3
- Read the CSV to find the latest date in it and write that date to the db
My solution works for small files (around 10 kB), but for large files (several megabytes) it does not: the upload never starts, and neither does the CSV read.
The upload is done first, and only then is the CSV read (and the date extracted).
I am attempting to "clone" the ReadStream
like this:
const clonedStream1 = responseStream.pipe(new PassThrough());
const clonedStream2 = responseStream.pipe(new PassThrough());
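The two clones are then consumed one after the other, roughly like this (uploadToS3 and findLatestDate are just placeholders for my own functions):

```js
// The multipart upload is started and awaited first...
await uploadToS3(clonedStream1);

// ...and only after it finishes is the second clone read to find the date
const latestDate = await findLatestDate(clonedStream2);
```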
I have also tested the cloneable-readable package, but without success.
Why does this not work for large files, and why does it get stuck? I am most likely missing some vital piece of information about how these streams work.
I have verified that the upload and the CSV read each work independently for large files.
For the S3 upload I am using a multipart upload, and for reading the CSV I am using the csv-parse library to extract the date.
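To make it concrete, this is roughly what the whole flow looks like (simplified: the bucket, key and date column names are placeholders, error handling is left out, and the multipart upload is sketched here with the Upload helper from @aws-sdk/lib-storage; my real upload code may differ):

```js
const { PassThrough } = require('node:stream');
const { S3Client } = require('@aws-sdk/client-s3');
const { Upload } = require('@aws-sdk/lib-storage');
const { parse } = require('csv-parse');

async function handleCsv(responseStream) {
  // "Clone" the incoming stream so it can be consumed in two places
  const clonedStream1 = responseStream.pipe(new PassThrough());
  const clonedStream2 = responseStream.pipe(new PassThrough());

  // 1. Multipart upload to S3, awaited before the CSV is read
  const upload = new Upload({
    client: new S3Client({}),
    params: { Bucket: 'my-bucket', Key: 'data.csv', Body: clonedStream1 },
  });
  await upload.done();

  // 2. Read the second clone with csv-parse and find the latest date
  let latestDate = null;
  const parser = clonedStream2.pipe(parse({ columns: true }));
  for await (const record of parser) {
    const date = new Date(record.date); // 'date' column name is a placeholder
    if (!latestDate || date > latestDate) {
      latestDate = date;
    }
  }

  // latestDate would then be written to the db (omitted here)
  return latestDate;
}
```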
This is implemented in Node.js.
Any ideas?