I'm wondering whether it is possible to leverage scala-native for performing large in-memory jobs.
For instance, imagine a Spark job that needs 150GB of RAM. You'd have to run it as 5 x 30GB executors in a Spark cluster, since JVM garbage collectors can't keep up with heaps much bigger than that.
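To make the sizing concrete, it's something like this when building the session (just a sketch; the app name is hypothetical, and on YARN you'd typically pass the same settings to spark-submit instead):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: 5 executors x 30GB each ~= 150GB of aggregate heap.
// Capping each executor around 30GB is the usual rule of thumb to
// stay in GC-friendly territory.
val spark = SparkSession.builder()
  .appName("large-string-job")               // hypothetical name
  .config("spark.executor.instances", "5")
  .config("spark.executor.memory", "30g")
  .getOrCreate()
```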
Imagine that 99% of the data being processed is Strings held in collections.
Do you think that scala-native would help here? I mean, as an alternative to Spark?
How does scala-native represent String? Does it carry the same per-object overhead that the JVM has, where String is a full-blown class?
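For reference, on the JVM side you can see exactly where that overhead comes from with the JOL (Java Object Layout) tool. A minimal sketch, assuming jol-core is on the classpath (class name is mine):

```scala
// build.sbt (assumption): libraryDependencies += "org.openjdk.jol" % "jol-core" % "0.17"
import org.openjdk.jol.info.{ClassLayout, GraphLayout}

object StringFootprint {
  def main(args: Array[String]): Unit = {
    // Field/header layout of java.lang.String itself:
    // object header + hash field + reference to the backing array.
    println(ClassLayout.parseClass(classOf[String]).toPrintable)

    // Total retained footprint of one small string, including the
    // backing array object: often several times the raw character data.
    println(GraphLayout.parseInstance("hello").totalSize())
  }
}
```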
Does scala-native's GC have heap limits analogous to the classic ~30GB on the JVM (the compressed-oops threshold)? Would I also end up with a limit like 30GB?
Or is using scala-native for in-memory data processing generally a bad idea? My guess is that scala-offheap is a better way to go.
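For what it's worth, my reading of the scala-offheap README suggests usage roughly like the sketch below (hedged: the annotation-macro API may have changed since I looked, and note that @data classes hold primitives, so a 99%-String workload would still need manual encoding into off-heap bytes):

```scala
import scala.offheap._

// Off-heap records: no object headers, no GC pressure, fixed layout.
@data class Measurement(id: Long, value: Double)

object OffheapSketch {
  def main(args: Array[String]): Unit = {
    // malloc-backed allocator: memory lives outside the JVM heap and
    // is never scanned or moved by the garbage collector.
    implicit val alloc = malloc
    val m = Measurement(42L, 3.14)
    println(m.value)
  }
}
```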