0

I've been trying to use the javascript google-trends-api from npmjs.com to automate retrieval of search trend data. I'm successful in getting the raw JSON output from my queries, but I can't seem to successfully parse them into a form -- preferably CSV or tab-delimited text -- that I can import into other software (Stata) for analysis. I've tried json2csv to parse the results, but this returns nothing resembling a spreadsheet-style file.

Here's an example of some representative query results (abridged for length):

"{\"default\":{\"geoMapData\":[{\"geoCode\":\"798\",\"geoName\":\"Glendive MT\",\"value\":[100],\"formattedValue\":[\"100\"],\"maxValueIndex\":0,\"hasData\":[true]},{\"geoCode\":\"552\",\"geoName\":\"Presque Isle ME\",\"value\":[49],\"formattedValue\":[\"49\"],\"maxValueIndex\":0,\"hasData\":[true]},{\"geoCode\":\"710\",\"geoName\":\"Hattiesburg-Laurel MS\",\"value\":[11],\"formattedValue\":[\"11\"],\"maxValueIndex\":0,\"hasData\":[true]},{\"geoCode\":\"766\",\"geoName\":\"Helena MT\",\"value\":[0],\"formattedValue\":[\"\"],\"maxValueIndex\":0,\"hasData\":[false]}]}}"

Upon running json2csv -i "results.json" and saving the output using fs.outputJson, I get errors unless I strip the outermost quotes and all backslashes. But even after doing that, this is what I get back:

"default.geoMapData" "[{""geoCode"":""798"",""geoName"":""Glendive MT"",""value"":[100],""formattedValue"":[""100""],""maxValueIndex"":0,""hasData"":[true]},{""geoCode"":""552"",""geoName"":""Presque Isle ME"",""value"":[49],""formattedValue"":[""49""],""maxValueIndex"":0,""hasData"":[true]},{""geoCode"":""710"",""geoName"":""Hattiesburg-Laurel MS"",""value"":[11],""formattedValue"":[""11""],""maxValueIndex"":0,""hasData"":[true]},{""geoCode"":""766"",""geoName"":""Helena MT"",""value"":[0],""formattedValue"":[""""],""maxValueIndex"":0,""hasData"":[false]}]"

I'd appreciate any guidance in how to translate my query output into a sensible CSV (or the like). I should add that I'm a complete beginner at this stuff, and apologize if the answer is obvious!

Eric

user3486184
  • 2,147
  • 3
  • 26
  • 28

1 Answers1

0

To convert JSON data to CSV, you can use jq command line utility.

Provided your sample data, you can do:

echo "{\"default\":{\"geoMapData\":[{\"geoCode\":\"798\",\"geoName\":\"Glendive MT\",\"value\":[100],\"formattedValue\":[\"100\"],\"maxValueIndex\":0,\"hasData\":[true]},{\"geoCode\":\"552\",\"geoName\":\"Presque Isle ME\",\"value\":[49],\"formattedValue\":[\"49\"],\"maxValueIndex\":0,\"hasData\":[true]},{\"geoCode\":\"710\",\"geoName\":\"Hattiesburg-Laurel MS\",\"value\":[11],\"formattedValue\":[\"11\"],\"maxValueIndex\":0,\"hasData\":[true]},{\"geoCode\":\"766\",\"geoName\":\"Helena MT\",\"value\":[0],\"formattedValue\":[\"\"],\"maxValueIndex\":0,\"hasData\":[false]}]}}" | \
jq -r '.default.geoMapData | map([.geoCode,.geoName,(.value[]|tostring),.formattedValue[],(.maxValueIndex|tostring),(.hasData[]|tostring)] | join(",")) | join("\n")'
798,Glendive MT,100,100,0,true
552,Presque Isle ME,49,49,0,true
710,Hattiesburg-Laurel MS,11,11,0,true
766,Helena MT,0,,0,false
oliv
  • 12,690
  • 25
  • 45
  • Thanks oliv -- this works for me! Since I'm interested in converting files, I changed the code to: `cat results.json | jq -r '.default.geoMapData | map([.geoCode,.geoName,(.value[]|tostring),.formattedValue[],(.maxValueIndex|tostring),(.hasData[]|tostring)] | join(",")) | join("\n")' > parsed.csv` But perhaps there's a more efficient way? – Eric Knowles Oct 03 '18 at 12:57
  • You probably mean something like this: `jq '...' result.json > parsed.csv`. BTW don't forget to look at [this](https://stackoverflow.com/help/someone-answers). – oliv Oct 03 '18 at 13:00
  • At first this worked perfectly ... But now I'm getting an error. (I have no idea what changed, but something did!) Now, when I use `jq '.default.geoMapData | map([.geoCode,.geoName,(.value[]|tostring),.formattedValue[],(.maxValueIndex|tostring),(.hasData[]|tostring)] | join(",")) | join("\n")' result.json > parsed.csv`, I get the error `jq: error (at result.json:0): Cannot index string with string "default"`. When I use the `echo` version you first suggested, it all goes well. But when I read from the file, I get that error! The echo text and the contents of result.json are identical. – Eric Knowles Oct 03 '18 at 17:35
  • The problem is the backslash quoted string that is interpreted by `echo` but not by `cat`. To solve this try `sed 's/^"//;s/\\//g;s/"$//' result.json | jq '...' > parse.csv` – oliv Oct 03 '18 at 19:00
  • Obviously this would be better to know how the `result.json` is created and solve it there. – oliv Oct 03 '18 at 19:01
  • It turns out I improperly exported the data in the first place: I saved them using `fs.outputJson` despite the fact that they were _already_ in JSON form. The double-quotes were interpreted as literals, hence the backslashes. Now that I'm saving the data using `fs.outputFile`, the `jq` step works fine. Thanks again! – Eric Knowles Oct 03 '18 at 19:32