I have a text file containing a list of words, one word per line. I have also imported nested JSON data into a pandas DataFrame.
The JSON data looks something like this:
[
{
"year":"2019",
"category":"chemistry",
"laureates":[
{
"id":"976",
"motivation":"\"for the development of lithium-ion batteries\"",
"share":"3"
},
{
"id":"977",
"motivation":"\"for the development of lithium-ion batteries\"",
"share":"3"
}
]
},
{
"year":"2019",
"category":"economics",
"laureates":[
{
"id":"982",
"firstname":"Abhijit",
"surname":"Banerjee",
"motivation":"\"for their experimental approach to alleviating global poverty\"",
"share":"3"
},
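For reference, here is a rough sketch of how data shaped like this could be flattened so that each laureate becomes one row. It assumes every record has a `laureates` list (records without one would need to be handled first), and the `data` literal is just a shortened copy of the sample above, not the real file:

```python
import pandas as pd

# Shortened copy of the sample above; assumed to reflect the full structure.
data = [
    {"year": "2019", "category": "chemistry", "laureates": [
        {"id": "976", "share": "3",
         "motivation": "\"for the development of lithium-ion batteries\""},
        {"id": "977", "share": "3",
         "motivation": "\"for the development of lithium-ion batteries\""},
    ]},
    {"year": "2019", "category": "economics", "laureates": [
        {"id": "982", "firstname": "Abhijit", "surname": "Banerjee", "share": "3",
         "motivation": "\"for their experimental approach to alleviating global poverty\""},
    ]},
]

# One row per laureate; year and category are copied down from the parent record.
# Fields missing for a laureate (e.g. firstname) come out as NaN.
flat = pd.json_normalize(data, record_path="laureates", meta=["year", "category"])
```

After this, `flat` has one `motivation` and one `category` value per row, which makes per-category counting more direct than working with the nested lists.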
I need to use the words from the text file to compute word frequencies in the JSON data for each category (for example, chemistry). I then need to plot several of these frequencies (the 1st, 10th, 20th, 30th, 40th, and 50th most frequent words) with Matplotlib, for each category.
I'm not sure what the best way to approach this is.
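A minimal sketch of one way the task could be read, not a definitive solution. It assumes the frequencies are meant to be counted in the `motivation` text, that only words from the word list count, and that splitting on runs of letters is an acceptable tokenization. The function names here are made up for illustration:

```python
import re
from collections import Counter, defaultdict

RANKS = (1, 10, 20, 30, 40, 50)  # the ranks named in the task


def count_words_by_category(rows, vocabulary):
    """Count occurrences of vocabulary words in motivations, per category.

    rows: iterable of (category, motivation) pairs.
    vocabulary: iterable of words, e.g. the lines of the word-list file.
    """
    vocab = {w.strip().lower() for w in vocabulary if w.strip()}
    counts = defaultdict(Counter)
    for category, motivation in rows:
        if not isinstance(motivation, str):
            continue  # skip missing motivations (e.g. NaN)
        # Assumed tokenization: lowercase runs of letters only.
        for token in re.findall(r"[a-z]+", motivation.lower()):
            if token in vocab:
                counts[category][token] += 1
    return counts


def frequencies_at_ranks(counter, ranks=RANKS):
    """Return (rank, word, count) for each requested rank that exists."""
    ordered = counter.most_common()  # most frequent first
    return [(r, *ordered[r - 1]) for r in ranks if r <= len(ordered)]


def plot_rank_frequencies(counts, ranks=RANKS):
    """Draw one line per category: frequency of the word at each rank."""
    import matplotlib.pyplot as plt  # imported here so counting works without it

    fig, ax = plt.subplots()
    for category, counter in sorted(counts.items()):
        picked = frequencies_at_ranks(counter, ranks)
        ax.plot([r for r, _, _ in picked], [c for _, _, c in picked],
                marker="o", label=category)
    ax.set_xlabel("Rank of word within category")
    ax.set_ylabel("Occurrences in motivations")
    ax.legend()
    return fig
```

With a DataFrame `df` that has one `category` and one `motivation` value per row, the pairs could be built with `zip(df["category"], df["motivation"])`, and the vocabulary could be the open word-list file itself (for example `open("words.txt")`, a placeholder filename). Categories with fewer distinct matching words than a requested rank simply get fewer points on the plot.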