I have a streaming pipeline hooked up to pub/sub that publishes filenames of GCS files. From there I want to read each file and parse out the events on each line (the events are what I ultimately want to process).
Can I use TextIO? Can you use it in a streaming pipeline when the filename is defined during execution (as opposed to using TextIO as a source and the fileName(s) are known at construction). If not I'm thinking of doing something like the following:
Get the topic from pub/sub ParDo to read each file and get the lines Process the lines of the file...
Could I use the FileBasedReader or something similar in this case to read the files? The files aren't too big so I wouldn't need to parallelize the reading of a single file, but I would need to read a lot of files.