I am using Spark to perform some operations on my data. I need to use an auxiliary dictionary (a static lookup table) to support these operations.
streamData = sc.textFile("path/to/stream")
staticData = sc.textFile("path/to/static/file")  # renamed from `dict`, which shadows the Python builtin
# some logic like:
# if streamData["field"] exists in staticData:
#     do something
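For context, this is the kind of membership check I have in mind, sketched in plain Python with made-up sample data (in the real job the records come from the stream RDD, the keys from the static file, and as I understand it the lookup could be wrapped with `sc.broadcast` so each executor gets one copy):

```python
# Hypothetical sample data standing in for the RDD contents.
stream_records = [{"field": "a"}, {"field": "x"}, {"field": "b"}]

# Loaded once from the static file; a set gives O(1) membership tests.
static_keys = {"a", "b", "c"}

# The per-record check I want to run for each batch: keep only
# the records whose "field" value appears in the static lookup.
matched = [r for r in stream_records if r["field"] in static_keys]
print(matched)
```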
My question is: does the auxiliary dictionary stay in memory the whole time, or does it have to be loaded and unloaded for every batch Spark processes?
Thanks