I understand how to run an MRJob job and how to receive its output programmatically. This is clearly explained here. However, I'm struggling to understand how to pass a list of dictionaries, or any other variable, from another file into an MRJob job. Instead of using an input file such as "words.txt", I would like to pass the words as a variable (of type list).
To be more specific, assume I have the following list:
mylist = [
    {"name": "Kayer", "Job": "Programmer"},
    {"name": "Angela", "Job": "Designer"},
    {"name": "Eve", "Job": "Programmer"},
    {"name": "Robert", "Job": "Programmer"},
]
I want to run an MRJob job that takes this list and returns, for example, the number of people whose job is "Programmer". How would I go about it?
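To make the expected result concrete, here is a plain-Python sketch of the computation I have in mind. It does not use MRJob at all; the helper name `count_jobs` is made up for illustration and only mimics what a mapper (emit the job title) followed by a reducer (sum the occurrences) would produce.

```python
from collections import Counter

# Same data as above, kept in memory (never written to a file).
mylist = [
    {"name": "Kayer", "Job": "Programmer"},
    {"name": "Angela", "Job": "Designer"},
    {"name": "Eve", "Job": "Programmer"},
    {"name": "Robert", "Job": "Programmer"},
]


def count_jobs(records):
    """Hypothetical stand-in for the job: count records per "Job" value."""
    # Roughly: map each record to its job title, then reduce by counting.
    return Counter(record["Job"] for record in records)


job_counts = count_jobs(mylist)
print(job_counts["Programmer"])  # prints 3
```

What I am after is the same result, but computed by an MRJob job that receives `mylist` directly.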
Lastly, because of the design of the system, I may not store the list in a text file or any other file, not even temporarily.
As I currently understand it, I cannot run the following code in the same class and/or file as the job I'm trying to run:
mr_job = MRWordCounter(args=['-r', 'emr'])
with mr_job.make_runner() as runner:
    runner.run()
    for key, value in mr_job.parse_output(runner.cat_output()):
        ...  # do something with the parsed output
Therefore, I put this code in a separate file, but then I could not figure out how to send the data over to my MRJob job. I don't even know whether there is a way to pass data to MRJob like this in the first place.