I'm trying to work out the most efficient way to loop through some deeply nested data, find the average of the values, and return a new array of hashes with the data grouped by date.
The raw data looks like this:
[
  {
    client_id: 2,
    date: "2015-11-14",
    txbps: {
      "22"=>{
        "43"=>17870.153846153848,
        "44"=>15117.866666666667
      }
    }
  },
  {
    client_id: 1,
    date: "2015-11-14",
    txbps: {
      "22"=>{
        "43"=>38113.846153846156,
        "44"=>33032.0
      }
    }
  },
  {
    client_id: 4,
    date: "2015-11-14",
    txbps: {
      "22"=>{
        "43"=>299960.0,
        "44"=>334182.4
      }
    }
  },
]
I have about 10,000,000 of these to loop through so I'm a little worried about performance.
The end result needs to look like this, where each avg is the average of the txbps values for that date:
[
  {
    date: "2015-11-14",
    avg: 178730.153846153848
  },
  {
    date: "2015-11-15",
    avg: 123987.192873978987
  },
  {
    date: "2015-11-16",
    avg: 126335.982123876283
  }
]
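To make "average of the txbps" concrete for a single record, here is a minimal sketch. It assumes the records use symbol keys as shown in the raw data above, and that the average is taken over every leaf value regardless of the two levels of string keys; `leaf_values` is a hypothetical helper name, not something from my real code.

```ruby
# Hypothetical sample record, shaped like the raw data above (assumes symbol keys).
record = {
  client_id: 2,
  date: "2015-11-14",
  txbps: { "22" => { "43" => 17870.153846153848, "44" => 15117.866666666667 } }
}

# Collect every leaf value under txbps, ignoring both levels of keys.
def leaf_values(txbps)
  txbps.values.flat_map(&:values)
end

leaves = leaf_values(record[:txbps])

# Mean of the leaf values for this one record (Array#sum needs Ruby 2.4+).
record_avg = leaves.sum / leaves.size
```

If the intended figure is instead an average of per-client averages, the grouping step would have to change accordingly.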
I've tried this to start:
results.map { |val| val["txbps"].values.map { |a| a.values.sum } }
But that's giving me this:
[[5211174.189281798, 25998.222222222223], [435932.442835184, 56051.555555555555], [5718452.806735582, 321299.55555555556]]
And I just can't figure out how to get it done. I can't find any good references online either.
I also tried to group by the date first:
res.map { |date, values|
  values.map { |client|
    client["txbps"].map { |tx, a|
      { date: date, client_id: client["client_id"], tx: (a.values.inject(:+) / a.size).to_i }
    }
  }
}.flatten
[
  {
    :date=>"2015-11-14",
    :client_id=>"2",
    :tx=>306539
  },
  {
    :date=>"2015-11-14",
    :client_id=>"2",
    :tx=>25998
  },
  {
    :date=>"2015-11-14",
    :client_id=>"2",
    :tx=>25643
  },
  {
    :date=>"2015-11-14",
    :client_id=>"2",
    :tx=>56051
  },
  {
    :date=>"2015-11-14",
    :client_id=>"1",
    :tx=>336379
  },
  {
    :date=>"2015-11-14",
    :client_id=>"1",
    :tx=>321299
  }
]
If possible, how can I do this in a single pass?
---- EDIT ----
Got a little bit further:
res.map { |a,b|
  {
    date: a[:date],
    val: a["txbps"].values.map { |k,v| k.values.sum / k.size }.first
  }
}.group_by { |el| el[:date] }.map { |date,list|
  {
    key: date,
    val: list.map { |elem| elem[:val] }.reduce(:+) / list.size
  }
}
But that's pretty unwieldy. Is there a faster, simpler way?
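For reference, this is roughly the single-pass shape I have in mind: keep a running sum and count per date while walking the records once, then divide at the end. It is only a sketch under assumptions: `daily_averages` is a hypothetical name, the records are assumed to use symbol keys as in the raw data, the sample values are made up, and the average is taken over all leaf values for a date rather than over per-client averages.

```ruby
# Hypothetical helper: one pass over the records, keeping a running
# sum and count of txbps leaf values for each date.
def daily_averages(records)
  totals = Hash.new { |hash, date| hash[date] = { sum: 0.0, count: 0 } }

  records.each do |record|
    bucket = totals[record[:date]]
    record[:txbps].each_value do |inner|
      inner.each_value do |value|
        bucket[:sum] += value
        bucket[:count] += 1
      end
    end
  end

  # Turn the accumulated totals into the { date:, avg: } shape.
  totals.map { |date, t| { date: date, avg: t[:sum] / t[:count] } }
end

# Made-up sample data, much smaller than the real set.
sample = [
  { client_id: 2, date: "2015-11-14", txbps: { "22" => { "43" => 10.0, "44" => 20.0 } } },
  { client_id: 1, date: "2015-11-14", txbps: { "22" => { "43" => 30.0 } } },
  { client_id: 4, date: "2015-11-15", txbps: { "22" => { "43" => 5.0, "44" => 15.0 } } }
]

result = daily_averages(sample)
p result
```

With the sample above, the averages come out as 20.0 for 2015-11-14 and 10.0 for 2015-11-15. This avoids building the intermediate arrays that map and group_by create, which might matter at 10,000,000 records, though I haven't benchmarked it. A date whose records contain no txbps values at all would yield NaN here, so that case would need a guard.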