I have a large matrix (30000 x 500). Each column holds hourly prices for the next 3 years and each column is a different scenario, i.e. I have 500 price scenarios where every cell in a row shares the same timestamp.
I need to aggregate this over the time axis: for daily aggregation the result should be an (nrdays x 500) matrix, for monthly aggregation an (nrmonths x 500) matrix, and the correct dates must be kept as well.
In MATLAB I created an index with a unique number for each day or each month and then looped over the columns using this:
accumarray(idx,price(:,i),[numel(unique(idx)) 1], @mean)
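To spell out what that call computes for a single column, here is a rough plain-C# sketch of a per-group mean. It assumes a zero-based `idx` array (MATLAB's subscripts are one-based) and plain `double[]` data rather than `ILArray<double>`; `GroupMean` and `Accumulate` are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;

public static class GroupMean
{
    // Mean of values per group, where idx[i] is the zero-based group of values[i].
    // (MATLAB's accumarray uses one-based subscripts; zero-based is assumed here.)
    public static double[] Accumulate(int[] idx, double[] values)
    {
        int groups = idx.Max() + 1;
        var sums = new double[groups];
        var counts = new int[groups];
        for (int i = 0; i < values.Length; i++)
        {
            sums[idx[i]] += values[i];
            counts[idx[i]]++;
        }
        // Groups without any member yield 0, like accumarray's default fill value.
        return sums.Select((s, g) => counts[g] == 0 ? 0.0 : s / counts[g]).ToArray();
    }
}
```

Calling this once per column reproduces the MATLAB loop; the index array only has to be built once.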
What is the best way to do this in C#?
Below is what I have so far:
public class matrixwihtdates
{
    public DateTime dats;
    public ILArray<double> nums;
}
public class endres
{
    public string year;
    public string month;
    public string day;
    public ILArray<double> nums;
}
public static List<endres> aggrmatrix(ILArray<double> origmatrix, DateTime std, DateTime edd)
{
    var aggrmatr = new List<matrixwihtdates>();
    for (int i = 0; i < origmatrix.Length; i++)
    {
        aggrmatr.Add(new matrixwihtdates
        {
            dats = std.AddHours(i),
            nums = origmatrix[i, "full"],
        });
    }
    return aggrmatr.GroupBy(a => new { yr = a.dats.Year, mt = a.dats.Month })
        .Select(g => new endres {
            year = g.Key.yr.ToString(),
            month = g.Key.mt.ToString(),
            nums = ILMath.mean(g.Select(a => a.nums).ToArray(),1) }).ToList();
}
The key problem is that I don't know how to average over each of the columns within the LINQ syntax so that a (1 x 500) vector is returned. Or should I not use LINQ at all? The last statement above (the ILMath.mean call inside Select) doesn't work.
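To make the goal concrete, here is a minimal sketch of the LINQ shape being asked for. It deliberately uses plain `double[]` rows instead of `ILArray<double>`, so it says nothing about how ILNumerics should be called; `MonthlyMean`, `LinqAggregation` and `ByMonth` are hypothetical names, and the timestamps are assumed to be consecutive hours counted from `start`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class MonthlyMean
{
    public int Year;
    public int Month;
    public double[] Means;   // one entry per scenario column
}

public static class LinqAggregation
{
    // rows[i] holds all scenario values for hour i, counted from start.
    public static List<MonthlyMean> ByMonth(IReadOnlyList<double[]> rows, DateTime start)
    {
        int cols = rows.Count == 0 ? 0 : rows[0].Length;
        return rows
            .Select((values, i) => new { Stamp = start.AddHours(i), Values = values })
            .GroupBy(r => new { r.Stamp.Year, r.Stamp.Month })
            .Select(g => new MonthlyMean
            {
                Year = g.Key.Year,
                Month = g.Key.Month,
                // Average each column over the rows that fall into this month.
                Means = Enumerable.Range(0, cols)
                                  .Select(c => g.Average(r => r.Values[c]))
                                  .ToArray()
            })
            .ToList();
    }
}
```

This walks each group once per column, which is simple but probably not the fastest option for 500 columns; a single pass that accumulates column sums would avoid the repeated enumeration.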
UPDATE:
I have added a more imperative version without LINQ. It seems to work, but it is still a bit clumsy.
public static List<ILArray<double>> aggrmatrixImp(ILArray<double> origmatrix, DateTime std)
{
    List<ILArray<double>> aggrmatr = new List<ILArray<double>>();
    ILArray<double> tempmatrix;
    int startindicator = 0;
    int endindicator = 0;
    int month = std.Month;
    for (int i = 0; i < origmatrix.Length; i++)
    {
        // A change of month at row i closes the block [startindicator, i - 1].
        if (std.AddHours(i).Month != month)
        {
            endindicator = i - 1;
            tempmatrix = origmatrix[ILMath.r(startindicator, endindicator), ILMath.r(0, ILMath.end)];
            aggrmatr.Add(ILMath.mean(tempmatrix, 1));
            startindicator = i;
            month = std.AddHours(i).Month;
        }
    }
    // The loop never closes the final (possibly partial) month, so add it here.
    tempmatrix = origmatrix[ILMath.r(startindicator, ILMath.end), ILMath.r(0, ILMath.end)];
    aggrmatr.Add(ILMath.mean(tempmatrix, 1));
    return aggrmatr;
}
I would still like to make the LINQ version work.
Update 2
I took Haymo's advice into account, and here is another version that is twice as fast.
public static ILArray<double> aggrmatrixImp2(ILArray<double> origmatrix, DateTime firstdateinfile, DateTime std, DateTime edd)
{
    int nrmonths = ((edd.Year - std.Year) * 12) + edd.Month - std.Month;
    ILArray<double> aggrmatr = ILMath.zeros(nrmonths,500);
    // Row of the first hour of std, counted from the first date in the file.
    int startindicator = std.Date.Subtract(firstdateinfile.Date).Duration().Days*24;
    int endindicator = 0;
    // First day of the month following std.
    DateTime tempdate = std.AddMonths(1);
    tempdate = new DateTime(tempdate.Year, tempdate.Month, 1);
    for (int i = 0; i < nrmonths; i++)
    {
        // Measured from firstdateinfile (not std) so it lines up with startindicator.
        endindicator = tempdate.Date.Subtract(firstdateinfile.Date).Duration().Days * 24-1;
        aggrmatr[i, ILMath.full] = ILMath.mean(origmatrix[ILMath.r(startindicator, endindicator), ILMath.full], 1);
        tempdate = tempdate.AddMonths(1);
        startindicator = endindicator+1;
    }
    return aggrmatr;
}
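The index arithmetic in that loop can be checked separately from ILNumerics. The following is a small sketch that only computes the inclusive row range of each calendar month, assuming row 0 corresponds to the timestamp `first`, rows are consecutive hours, and there are no daylight-saving gaps; `MonthRanges` and `Compute` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class MonthRanges
{
    // Returns inclusive (First, Last) row indices for each calendar month,
    // assuming row 0 is the hour `first` and rows are consecutive hours.
    public static List<(int First, int Last)> Compute(DateTime first, int rowCount)
    {
        var ranges = new List<(int First, int Last)>();
        int startRow = 0;
        while (startRow < rowCount)
        {
            DateTime stamp = first.AddHours(startRow);
            DateTime nextMonth = new DateTime(stamp.Year, stamp.Month, 1).AddMonths(1);
            // Whole hours (rounded up, in integer tick arithmetic) from the
            // file start to the first row that belongs to the next month.
            long ticks = (nextMonth - first).Ticks;
            int nextStart = (int)((ticks + TimeSpan.TicksPerHour - 1) / TimeSpan.TicksPerHour);
            // The last month may be cut short by the end of the data.
            int lastRow = Math.Min(nextStart, rowCount) - 1;
            ranges.Add((startRow, lastRow));
            startRow = lastRow + 1;
        }
        return ranges;
    }
}
```

Each (First, Last) pair could then be used as the row range of one block to average, so the date logic runs once per month rather than once per hour.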
I do not have a working LINQ version, but I doubt it would be faster.