I have an input folder containing thousands of files, each with thousands of JSON records.
I created an Akka actor system to read these files and process them.
I'm reading the JSON from each file using:
JSONParser parser = new JSONParser();
try (FileReader reader = new FileReader(file)) {  // close the reader when done
    JSONArray jsonArray = (JSONArray) parser.parse(reader);
    for (Object element : jsonArray) {
        JSONObject currentJsonObject = (JSONObject) element;
        // send the JSON object to another actor for further processing
    }
}
The initial design spawned a new "file reader" actor, running the code above, for each file in the folder.
This worked fine while there were only a few files in the folder.
When the number of files is large, however, the system crashes with an OutOfMemoryError. It seems all these "file reader" actors try to read their files at the same time and load them into memory at once.
What would be a good approach to reading these JSON files?
Akka Streams?
Only one "file reader" actor that reads them one by one?
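For context on the trade-off I'm asking about: the common fix in either design is bounding how many files are read concurrently, rather than going fully sequential. A minimal sketch of that idea in plain Java (no Akka), where `processAll` and the placeholder file names are hypothetical and the real per-file work would be the parsing code above:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedFileProcessor {
    // At most this many files are open/parsed at the same time,
    // so memory use stays bounded regardless of folder size.
    static final int MAX_CONCURRENT_FILES = 4;

    // Runs every file through a fixed-size pool; returns how many completed.
    static int processAll(List<String> files) throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(MAX_CONCURRENT_FILES);
        for (String file : files) {
            pool.submit(() -> {
                // In the real system this would run the JSON-parsing code
                // from the question; counting keeps the sketch self-contained.
                processed.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> files = List.of("a.json", "b.json");  // placeholder names
        System.out.println("processed " + processAll(files) + " files");
    }
}
```

If I understand correctly, Akka Streams gives the same bound declaratively (e.g. `mapAsync(parallelism, ...)` over a `Source` of file paths), and a pure-actor design gets it from a fixed-size router or work-pulling instead of one actor per file.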