There are a couple of approaches to this. Either you can partition the data and then sum on the partitions, or you can roll the whole thing into a single method.
Since partitioning is based on the gaps between the Number values, it won't work on unordered lists. Building the partition list on the fly relies on each item being compared with its immediate predecessor, so make sure you sort the list on the partition field before you start.
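As a minimal sketch of that sorting step, assuming the items are (n, v) tuples where n stands in for Number (the variable names and the shuffled sample values are illustrative, not from the question), LINQ's OrderBy puts the list in partition-field order:

```csharp
using System;
using System.Linq;

// Illustrative unordered input; n stands in for the Number field.
var unordered = new (int n, int v)[] { (6, 9), (1, 10), (10, 3), (2, 12), (5, 5), (11, 1), (9, 4) };

// OrderBy is a stable sort, so items with equal n keep their relative order.
var sorted = unordered.OrderBy(x => x.n).ToArray();

Console.WriteLine(string.Join(", ", sorted.Select(x => x.n)));
// prints: 1, 2, 5, 6, 9, 10, 11
```

If the data comes from a database query, it is usually cheaper to order it there instead.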
Partitioning
Once the list is ordered (or if it was pre-ordered) you can partition. I use this kind of extension method fairly often for breaking up ordered sequences into useful blocks, like when I need to grab sequences of entries from a log file.
    using System;
    using System.Collections.Generic;

    public static partial class Ext
    {
        // Splits an ordered sequence into arrays; a new partition starts whenever
        // partitioner(prev, next) returns false for two adjacent items.
        public static IEnumerable<T[]> PartitionStream<T>(this IEnumerable<T> source, Func<T, T, bool> partitioner)
        {
            var partition = new List<T>();
            T prev = default;
            foreach (var next in source)
            {
                // next does not belong with prev: emit the finished partition
                if (partition.Count > 0 && !partitioner(prev, next))
                {
                    yield return partition.ToArray();
                    partition.Clear();
                }
                partition.Add(prev = next);
            }
            // emit the final partition, if any
            if (partition.Count > 0)
                yield return partition.ToArray();
        }
    }
The partitioner parameter compares two adjacent items and returns true if they belong in the same partition. The extension method collects the members of the current partition and returns them as an array as soon as it sees an item that belongs to the next partition.
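To make the predicate's role concrete, here is a small sketch that applies a gap-based predicate to each adjacent pair and lists where a new partition would start. The name sameBlock and the Zip-based pairing are only for illustration; the extension method itself does not work this way internally.

```csharp
using System;
using System.Linq;

var source = new (int n, int v)[] { (1, 10), (2, 12), (5, 5), (6, 9), (9, 4), (10, 3), (11, 1) };
var maxDifference = 2;

// Hypothetical predicate: adjacent items share a partition when their n values
// are at most maxDifference apart.
Func<(int n, int v), (int n, int v), bool> sameBlock = (l, r) => (r.n - l.n) <= maxDifference;

// Pair each item with its successor and keep the pairs where the predicate fails;
// the right-hand item of each such pair is the first item of a new partition.
var breaks = source
    .Zip(source.Skip(1), (l, r) => (left: l, right: r))
    .Where(p => !sameBlock(p.left, p.right))
    .Select(p => p.right.n)
    .ToArray();

Console.WriteLine(string.Join(", ", breaks));
// prints: 5, 9
```

With this data the predicate fails between 2 and 5 and between 6 and 9, so the partitions are 1-2, 5-6 and 9-11.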
From there you can just do simple summing on the partition arrays:
    var source = new (int n, int v)[] { (1,10),(2,12),(5,5),(6,9),(9,4),(10,3),(11,1) };
    var maxDifference = 2;
    var aggregate =
        from part in source.PartitionStream((l, r) => (r.n - l.n) <= maxDifference)
        let low = part.Min(g => g.n)
        let high = part.Max(g => g.n)
        select new { Ranges = $"{low}-{high}", Total = part.Sum(g => g.v) };
This gives the same output as your example.
Stream Aggregation
The second option is both simpler and more efficient, since it allocates very little memory. The downside - if you can call it that - is that it's a lot less generic.
Rather than partitioning and then aggregating over the partitions, this walks through the list once and aggregates as it goes, emitting a result each time it crosses a partition boundary:
    IEnumerable<(string Ranges, int Total)> GroupSum(IEnumerable<(int n, int v)> source, int maxDistance)
    {
        bool started = false;   // false until the first item has been read
        int low = 0;
        int high = 0;
        int total = 0;
        foreach (var (n, v) in source)
        {
            // check partition boundary
            if (!started || n < low || (n - high) > maxDistance)
            {
                // emit the finished partition before starting a new one
                if (started)
                    yield return ($"{low}-{high}", total);
                started = true;
                low = high = n;
                total = v;
            }
            else
            {
                high = n;
                total += v;
            }
        }
        // emit the last partition, even if its total is zero or negative
        if (started)
            yield return ($"{low}-{high}", total);
    }
(Using ValueTuple so I don't have to declare types.)
Output is the same here, but with a lot less going on in the background to slow it down. No allocated arrays, etc.
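For a quick check, here is a usage sketch that runs the streaming approach over the sample data. GroupSum is restated in condensed form as a local function so the snippet compiles on its own; it assumes sorted input, and names such as data and results are just for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = new (int n, int v)[] { (1, 10), (2, 12), (5, 5), (6, 9), (9, 4), (10, 3), (11, 1) };

// ToList materialises the lazy iterator so the results can be printed and inspected.
var results = GroupSum(data, 2).ToList();
foreach (var (range, sum) in results)
    Console.WriteLine($"{range}: {sum}");
// prints:
// 1-2: 22
// 5-6: 14
// 9-11: 8

// Condensed restatement of the streaming aggregation, assuming sorted input.
static IEnumerable<(string Ranges, int Total)> GroupSum(IEnumerable<(int n, int v)> source, int maxDistance)
{
    bool started = false;
    int low = 0, high = 0, total = 0;
    foreach (var (n, v) in source)
    {
        if (started && (n - high) <= maxDistance)
        {
            // still inside the current partition
            high = n;
            total += v;
            continue;
        }
        // boundary crossed: emit the finished partition, then start a new one
        if (started)
            yield return ($"{low}-{high}", total);
        started = true;
        low = high = n;
        total = v;
    }
    if (started)
        yield return ($"{low}-{high}", total);
}
```

Because GroupSum is an iterator, nothing is computed until the results are enumerated, and only the running low, high and total are held in memory at any time.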