I'm using the following snippet to total all debts per customer account number, which should be unique. However, I get duplicate account numbers even though I chose the account number as the grouping key. I'm baffled.

var records      = myList;
var groupedList  = records?
                      .GroupBy(x => x.AccountNumber.Where(char.IsDigit).ToArray())
                      .Select(e => new {
                             AccountNumber = new string(e.Key), 
                             TotalDebt = e.Sum(x => x.TotalDebt)
                            }
                       );

  • It seems that `ToArray()` creates a new instance every time, and each instance is treated as a separate key in `GroupBy` – Pavel Anikhouski Apr 20 '20 at 14:25
  • @PavelAnikhouski can you tell me how I can do it without `ToArray()`? – Soner from The Ottoman Empire Apr 20 '20 at 14:29
  • you can use `AccountNumber` property itself, which is `string`, I suppose – Pavel Anikhouski Apr 20 '20 at 14:30
  • @PavelAnikhouski actually, since it comes from a web service, it sometimes has leading or trailing non-digit characters. – Soner from The Ottoman Empire Apr 20 '20 at 14:31
  • Does this answer your question? [Linq - group by using the elements inside an array property](https://stackoverflow.com/questions/31724512/linq-group-by-using-the-elements-inside-an-array-property) – Drag and Drop Apr 20 '20 at 14:40
  • As an explanation: when you group by an array, only the references are compared. We need a real comparison of the contents. Either specify a comparer, or convert the char array back to a string: `.GroupBy(x => new string(x.AccountNumber.Where(char.IsDigit).ToArray()))` – Drag and Drop Apr 20 '20 at 14:42
  • @PavelAnikhouski if you write that up as an answer, together with how I can combine only the digit characters into a string, I will accept it, sir. By the way, regarding _which is being treated as a separate key in GroupBy_, could you tell me the reason? – Soner from The Ottoman Empire Apr 20 '20 at 14:45
  • @snr Look at the following code: https://dotnetfiddle.net/syzxdJ. Both arrays hold the same values, but they are different instances, so they are not equal because the references are compared. To really compare the contents of the arrays we need something like `SequenceEqual` – Drag and Drop Apr 20 '20 at 14:46
  • @DragandDrop could you explain the detailed, under-the-hood reason, please? I.e. reference type comparison. – Soner from The Ottoman Empire Apr 20 '20 at 14:48
  • @snr, for each record, `x.AccountNumber.Where(char.IsDigit).ToArray()` filters the AccountNumber and creates a **new** char array. By default, reference types are compared by reference (this should be in the language spec, in the section on types). The `GroupBy` then takes all those different arrays and compares them with each other – Drag and Drop Apr 20 '20 at 14:55
  • https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/statements-expressions-operators/equality-comparisons – Drag and Drop Apr 20 '20 at 15:00
  • @DragandDrop You would need an `IEqualityComparer` to go with `SequenceEqual`. – RoadRunner Apr 20 '20 at 15:01
  • @RoadRunner, yes. We need one. That's why I chose that dupe target. It shows the implementation with `SequenceEqual`. – Drag and Drop Apr 20 '20 at 15:02
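
The point made in the comments above can be reproduced with a small sketch. It is illustrative only: the sample account numbers and the names `CharArrayComparer` and `GroupingDemo` are made up for this example and do not come from the question. It contrasts grouping by a fresh `char[]` key, by a string key, and by a `char[]` key with an `IEqualityComparer<char[]>` built on `SequenceEqual`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: treats two char arrays as equal when their contents match.
public sealed class CharArrayComparer : IEqualityComparer<char[]>
{
    public bool Equals(char[] a, char[] b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.SequenceEqual(b);
    }

    public int GetHashCode(char[] key)
    {
        // Arrays with equal contents must produce equal hash codes.
        unchecked
        {
            int hash = 17;
            foreach (char c in key) hash = hash * 31 + c;
            return hash;
        }
    }
}

public static class GroupingDemo
{
    // Made-up inputs: the first two reduce to the same digits, "123".
    public static readonly string[] Samples = { "A123", "123 ", "456" };

    public static char[] Digits(string s) => s.Where(char.IsDigit).ToArray();

    // Each key is a new array instance, so no two keys are ever equal.
    public static int CountWithArrayKey() =>
        Samples.GroupBy(s => Digits(s)).Count();

    // Strings are compared by value, so equal digit sequences share a group.
    public static int CountWithStringKey() =>
        Samples.GroupBy(s => new string(Digits(s))).Count();

    // Array keys again, but compared by content through the custom comparer.
    public static int CountWithComparer() =>
        Samples.GroupBy(s => Digits(s), new CharArrayComparer()).Count();

    public static void Main()
    {
        Console.WriteLine(Digits("123").Equals(Digits("123"))); // False
        Console.WriteLine(CountWithArrayKey());                 // 3
        Console.WriteLine(CountWithStringKey());                // 2
        Console.WriteLine(CountWithComparer());                 // 2
    }
}
```

Converting the key to a string is usually the shorter fix; a comparer is mainly worth it when the key has to stay an array.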

1 Answer

Apparently you expect your AccountNumbers to contain characters that are not digits, and you don't want to consider them. You want AccountNumbers that contain only digits.

To avoid stripping the non-digits more than once per AccountNumber, my advice would be to use an extra Select for this.

After removing the non-digit characters, consider converting the AccountNumber to an int; grouping on an integer key can speed up your processing considerably. If you don't want to convert it to int, remove the Parse part.

Another piece of advice: never let your LINQ statements return null. The result of a LINQ statement (as long as it returns IEnumerable<...>) represents a sequence of similar items. If there are no items in the sequence, return an empty sequence. This has the advantage that callers don't have to check whether the result is null.

List<MyRecord> myList = ...
var records = myList ?? Enumerable.Empty<MyRecord>();
var groupedList = records.Select(record => new
{
    // Remove the non-digits, then parse to Int32 (use Int64.Parse for longer numbers).
    // Parse throws if no digits remain or if the value is out of range.
    AccountNumber = Int32.Parse(
        new string(record.AccountNumber.Where(c => Char.IsDigit(c)).ToArray())),

    Debt = record.TotalDebt,  // are you sure this property is also called TotalDebt?
})

// Now the GroupBy is easy and efficient:
.GroupBy(record => record.AccountNumber,
(accountNumber, recordsWithThisAccountNumber) => new
{
    AccountNumber = accountNumber,
    // Sum needs a selector, because the elements are anonymous objects
    TotalDebt = recordsWithThisAccountNumber.Sum(r => r.Debt)
});

Now you can be certain that groupedList is a non-null sequence. If myList was null, then groupedList is an empty sequence.
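
As a variant of the code above, here is a minimal runnable sketch that keeps the cleaned account number as a string instead of parsing it, so leading zeros survive and no parse can fail. The shape of `MyRecord` (a string `AccountNumber` and a decimal `TotalDebt`), the helper name `DebtTotals.Group`, and the sample data are assumptions made for illustration; the question does not show them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed record shape; the real type is not shown in the question.
public sealed class MyRecord
{
    public string AccountNumber { get; set; }
    public decimal TotalDebt { get; set; }
}

public static class DebtTotals
{
    // Returns (digits-only account number, summed debt) pairs.
    // A null input yields an empty sequence, never null.
    public static IEnumerable<KeyValuePair<string, decimal>> Group(IEnumerable<MyRecord> input)
    {
        var records = input ?? Enumerable.Empty<MyRecord>();

        return records
            // Strip the non-digits once per record.
            .Select(r => new
            {
                Account = new string(r.AccountNumber.Where(char.IsDigit).ToArray()),
                Debt = r.TotalDebt,
            })
            // String keys compare by value, so equal digit sequences share a group.
            .GroupBy(r => r.Account,
                     (account, group) => new KeyValuePair<string, decimal>(
                         account, group.Sum(r => r.Debt)));
    }

    public static void Main()
    {
        var sample = new[]
        {
            new MyRecord { AccountNumber = "A-001", TotalDebt = 10m },
            new MyRecord { AccountNumber = "001 ",  TotalDebt = 5m },
            new MyRecord { AccountNumber = "002",   TotalDebt = 1m },
        };

        foreach (var pair in Group(sample))
            Console.WriteLine(pair.Key + ": " + pair.Value);

        // Null input is handled without a null check at the call site.
        Console.WriteLine(Group(null).Any()); // False
    }
}
```

Whether to keep a string key or parse to an integer depends on the data: parsing gives a cheaper key comparison, but it drops leading zeros and throws on empty or oversized input.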

Harald Coppoolse