I have pricing data for a large number of stocks (around 1.1 million lines).
I'm having trouble parsing all of this data in memory, so I'd like to split it by stock symbol into individual files and import only the data I need.
From:
stockprices.json
To:
AAPL.json
ACN.json
...
stockprices.json currently has this structure:
[{
    "date": "2016-03-22 00:00:00",
    "symbol": "ACN",
    "open": "121.029999",
    "close": "121.470001",
    "low": "120.720001",
    "high": "122.910004",
    "volume": "711400.0"
},
{
    "date": "2016-03-23 00:00:00",
    "symbol": "AAPL",
    "open": "121.470001",
    "close": "119.379997",
    "low": "119.099998",
    "high": "121.470001",
    "volume": "444200.0"
},
{
    "date": "2016-03-24 00:00:00",
    "symbol": "AAPL",
    "open": "118.889999",
    "close": "119.410004",
    "low": "117.639999",
    "high": "119.440002",
    "volume": "534100.0"
},
...{}....]
I believe jq is the right tool for the job, but I'm having trouble understanding it.
How would I take the data above and use jq to split it by the symbol field?
For example, I'd like to end up with:
AAPL.json:
[{
    "date": "2016-03-23 00:00:00",
    "symbol": "AAPL",
    "open": "121.470001",
    "close": "119.379997",
    "low": "119.099998",
    "high": "121.470001",
    "volume": "444200.0"
},
{
    "date": "2016-03-24 00:00:00",
    "symbol": "AAPL",
    "open": "118.889999",
    "close": "119.410004",
    "low": "117.639999",
    "high": "119.440002",
    "volume": "534100.0"
}]
and ACN.json:
[{
    "date": "2016-03-22 00:00:00",
    "symbol": "ACN",
    "open": "121.029999",
    "close": "121.470001",
    "low": "120.720001",
    "high": "122.910004",
    "volume": "711400.0"
},
{
    "date": "2016-03-22 00:00:00",
    "symbol": "ACN",
    "open": "121.029999",
    "close": "121.470001",
    "low": "120.720001",
    "high": "122.910004",
    "volume": "711400.0"
}]
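To make the intended transformation concrete, here is a minimal sketch of the grouping in Python rather than jq. It is only an illustration of the result I'm after, not the jq solution I'm asking for. The function names `split_by_symbol` and `write_symbol_files` and the output directory are hypothetical, and the sketch assumes the whole of stockprices.json can be loaded once for the one-time split.

```python
import json
from collections import defaultdict
from pathlib import Path


def split_by_symbol(records):
    """Group price records by their "symbol" field, keeping the original order."""
    groups = defaultdict(list)
    for record in records:
        groups[record["symbol"]].append(record)
    return dict(groups)


def write_symbol_files(source_path, out_dir):
    """Read the combined JSON array and write one <SYMBOL>.json array per symbol.

    Assumption: the source file fits in memory for this one-time split.
    """
    records = json.loads(Path(source_path).read_text())
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for symbol, rows in split_by_symbol(records).items():
        (out_dir / f"{symbol}.json").write_text(json.dumps(rows, indent=4))
```

Calling `write_symbol_files("stockprices.json", "by_symbol")` (where `by_symbol` is a made-up directory name) would produce files such as `by_symbol/AAPL.json` and `by_symbol/ACN.json`, each holding a JSON array of that symbol's records.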