
I need to read a large text file (40M+ lines), perform some operations on that list of lines, and write the output to a new file.

Example: call a web service and use the response to compute a union or intersection with this list, repeating the process a few hundred times.

What is the best way to implement this in a functional style in Scala, using cats or a Scala streaming library, without running into OOM issues?

  1. Read data in chunks
  2. Do operations with current list (union or intersection)
  3. Write to a new file
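
For illustration, the three steps above might be sketched with fs2 roughly as follows. This is only a sketch under assumptions: it targets fs2 3.x with cats-effect 3 (module `fs2-io`), it covers only the intersection case, and `IntersectLines`, `fetchKeys`, `input.txt`, and `output.txt` are hypothetical placeholders, not part of any real service or library.

```scala
// Assumes fs2 3.x ("co.fs2" %% "fs2-io") and cats-effect 3 are on the classpath.
import cats.effect.{IO, IOApp}
import fs2.text
import fs2.io.file.{Files, Path}

object IntersectLines extends IOApp.Simple {

  // Hypothetical stand-in for the web service call.
  // The response is assumed to be small enough to hold in memory as a Set.
  def fetchKeys: IO[Set[String]] = IO.pure(Set("a", "c"))

  // Step 1: read the input lazily in chunks and split it into lines.
  // Step 2: keep only the lines that also appear in `keys` (intersection).
  // Step 3: write the surviving lines to `out`, separated by newlines.
  // Only the current chunk and `keys` are held in memory at any time.
  def intersect(in: Path, out: Path, keys: Set[String]): IO[Unit] =
    Files[IO]
      .readAll(in)
      .through(text.utf8.decode)
      .through(text.lines)
      .filter(keys.contains)
      .intersperse("\n")
      .through(text.utf8.encode)
      .through(Files[IO].writeAll(out))
      .compile
      .drain

  val run: IO[Unit] =
    fetchKeys.flatMap(keys => intersect(Path("input.txt"), Path("output.txt"), keys))
}
```

A union could presumably follow the same shape: pass every line of the file through unchanged and append the response elements that were not already present. Repeating the process would then mean feeding each output file in as the next input.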
vkt
  • https://github.com/functional-streams-for-scala/fs2-cats. Could help with #1 and #3. Readme example is somewhat close to what you are asking. How big is the web service query? I assume it also needs to be streamed? – RStrad Apr 13 '18 at 00:48
  • Thank you, I did understand `scalaz-stream`. Are there any resources or examples for cats-based streams? For the web service, the query can be handled in memory. – vkt Apr 13 '18 at 01:30
  • Sorry about that. I meant to paste this https://functional-streams-for-scala.github.io/fs2/. The 0.10 version of fs2 is all cats based. The example on that page reads in a file and writes it to disk using streams. – RStrad Apr 13 '18 at 12:55

0 Answers