My answer contains a program that I believe is better than the one proposed in Weeble's answer, but first I would like to demonstrate how the Join method works and talk about the problems I see in your approach.
As the documentation (https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.join?view=netframework-4.8) says, the Join method "correlates the elements of two sequences based on matching keys."
If the keys don't match, the elements are not included in the result. For example, remove your Equals and GetHashCode methods and try this code:
var first = new List<Foo> { new Foo(1, 1) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
//This is your original code that returns no results
var result = second.Join(first, s => s, f => f, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => s, f => f, (f, s) => new { f, s }).ToList();
//This is my code. Both calls of the Join method return one element in the resulting collection;
//that element contains two instances of Foo(1, 1): f and s
result = second.Join(first, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
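The first two calls return nothing because a class without Equals and GetHashCode overrides is compared by reference, so two separately created instances are never equal, even when their properties hold the same values. A minimal sketch of that behavior, where PlainFoo is a hypothetical stand-in for Foo with the overrides removed:

```csharp
using System;

// Hypothetical stand-in for Foo with no Equals/GetHashCode overrides.
public class PlainFoo
{
    public int Id { get; }
    public int? NullableId { get; }

    public PlainFoo(int id, int? nullableId)
    {
        Id = id;
        NullableId = nullableId;
    }
}

public static class ReferenceEqualityDemo
{
    public static void Main()
    {
        var a = new PlainFoo(1, 1);
        var b = new PlainFoo(1, 1);

        // Default Object.Equals on a class compares references, not property values.
        Console.WriteLine(a.Equals(b)); // False: two different instances
        Console.WriteLine(a.Equals(a)); // True: the same instance
    }
}
```

This is why a key selector such as s => s only matches when both collections hold the very same object.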
But if you use your original input data, which contains null, with my code:
var first = new List<Foo> { new Foo(1, null) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
var result = second.Join(first, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
the result variable will be empty in both cases, since the key { 1, null } doesn't match any of the other keys, i.e. { 1, 1 }, { 1, 2 }, { 1, 3 }.
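The reason is that the compiler generates value-based Equals and GetHashCode for anonymous types, so two composite keys are equal only when every property is equal; null is just another value here, not a wildcard. A small sketch of that comparison (the class and method names KeyEqualityDemo and KeysMatch are made up for illustration):

```csharp
using System;

public static class KeyEqualityDemo
{
    // Builds two composite keys as anonymous types and compares them
    // the same way a key-based lookup would: property by property.
    public static bool KeysMatch(int id1, int? nullableId1, int id2, int? nullableId2)
    {
        var left = new { Id = id1, NullableId = nullableId1 };
        var right = new { Id = id2, NullableId = nullableId2 };
        return left.Equals(right);
    }

    public static void Main()
    {
        Console.WriteLine(KeysMatch(1, 1, 1, 1));       // True
        Console.WriteLine(KeysMatch(1, null, 1, 1));    // False
        Console.WriteLine(KeysMatch(1, null, 1, null)); // True
    }
}
```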
Now, returning to your question. I would suggest you reconsider your entire approach in cases like this, and here is why. Let us imagine that your implementation of the Equals and GetHashCode methods worked as you expected and you never needed to post your question. Then, as I see it, your solution has the following consequences:
To understand how your code calculates its output, the user of your code has to have access to the source of the Foo type and spend time reviewing your implementation of the Equals and GetHashCode methods (or reading documentation).
With such an implementation of the Equals and GetHashCode methods, you are trying to change the expected behavior of the Join method. The user may expect that the first element of the first collection, Foo(1, null), will not be considered equal to the first element of the second collection, Foo(1, 1).
Let us imagine that you have multiple classes to join, each written by a different person, and each class has its own logic in the Equals and GetHashCode methods. To figure out how your joining actually works with each type, the user, instead of looking into a single joining method once, would need to check the source code of all those classes and work out how each type handles its own comparison, facing variations of things like this with magic numbers (taken from your code):
public override int GetHashCode()
{
    var hashCode = 806340729;
    hashCode = hashCode * -1521134295 + Id.GetHashCode();
    return hashCode;
}
It may not seem like a big problem, but imagine you are new to the project, you have a lot of classes with logic like this and limited time to complete your task, e.g. you have an urgent change request, huge sets of input data, and no unit tests.
If someone inherits from your class Foo and puts an instance of Foo1 into a collection along with Foo instances:
public class Foo1 : Foo
{
    public Foo1(int id, int? nullableId) : base (id, nullableId)
    {
        Id = id;
        NullableId = nullableId;
    }

    public override bool Equals(object obj)
    {
        // Unconditional cast: throws InvalidCastException when obj is a plain Foo
        var otherFoo1 = (Foo1)obj;
        return Id == otherFoo1.Id;
    }

    public override int GetHashCode()
    {
        var hashCode = 806340729;
        hashCode = hashCode * -1521134295 + Id.GetHashCode();
        return hashCode;
    }
}
var first = new List<Foo> { new Foo1(1, 1) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3)};
var result = second.Join(first, s => s, f => f, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => s, f => f, (f, s) => new { f, s }).ToList();
then you get a run-time exception in the Equals method of the Foo1 type:
System.InvalidCastException, Message=Unable to cast object of type 'ConsoleApp1.Foo' to type 'ConsoleApp1.Foo1'.
With the same input data, my code works fine in this situation:
var result = second.Join(first, s => s.Id, f => f.Id, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => s.Id, f => f.Id, (f, s) => new { f, s }).ToList();
With your implementation of the Equals and GetHashCode methods, when someone modifies the joining code like this:
var result = second.Join(first, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
then your logic in the Equals and GetHashCode methods will be ignored and you will get a different result.
In my opinion, this approach (overriding the Equals and GetHashCode methods) may be a source of multiple bugs. I think it is better when the code that performs the join can be understood without any extra information: the logic is concentrated in one method, and the implementation is clear, predictable, maintainable, and simple to understand.
Please also note that with your input data:
var first = new List<Foo> { new Foo(1, null) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
the code in Weeble's answer generates the following output:
Foo(1, 1)
Foo(1, 2)
Foo(1, 3)
while, as far as I understand, you asked for an implementation that produces output like this for that input:
Foo(1, null), Foo(1, 1)
Foo(1, null), Foo(1, 2)
Foo(1, null), Foo(1, 3)
Please consider updating your solution with my code, since it produces a result in the format you asked for, is easier to understand, and has other advantages, as you can see:
using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp40
{
    public class Foo
    {
        public int Id { get; set; }
        public int? NullableId { get; set; }

        public Foo(int id, int? nullableId)
        {
            Id = id;
            NullableId = nullableId;
        }

        public override string ToString() => $"Foo({Id}, {NullableId?.ToString() ?? "null"})";
    }

    class Program
    {
        static void Main(string[] args)
        {
            var first = new List<Foo> { new Foo(1, null), new Foo(1, 5), new Foo(2, 3), new Foo(6, 2) };
            var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3), new Foo(2, null) };

            // Join on Id only, then drop pairs whose NullableId values are both set and differ
            var result = second.Join(first, s => s.Id, f => f.Id, (f, s) => new { f, s })
                .Where(o => !((o.f.NullableId != null && o.s.NullableId != null) &&
                              (o.f.NullableId != o.s.NullableId)));

            foreach (var o in result)
            {
                Console.WriteLine(o.f + ", " + o.s);
            }
            Console.ReadLine();
        }
    }
}
Output:
Foo(1, 1), Foo(1, null)
Foo(1, 2), Foo(1, null)
Foo(1, 3), Foo(1, null)
Foo(2, null), Foo(2, 3)
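The Where condition in the program keeps a pair unless both NullableId values are set and differ. By De Morgan's law this is the same as "either value is null, or both are equal", which could be given a name to make the intent easier to read. A possible sketch, where NullableMatch and IsCompatible are names I made up and not part of the program above:

```csharp
using System;

public static class NullableMatch
{
    // True when either value is null (treated as "matches anything")
    // or when both values are equal.
    public static bool IsCompatible(int? a, int? b)
    {
        return a == null || b == null || a == b;
    }

    public static void Main()
    {
        Console.WriteLine(IsCompatible(null, 1)); // True
        Console.WriteLine(IsCompatible(1, null)); // True
        Console.WriteLine(IsCompatible(2, 2));    // True
        Console.WriteLine(IsCompatible(5, 1));    // False
    }
}
```

With such a helper, the filter would read .Where(o => NullableMatch.IsCompatible(o.f.NullableId, o.s.NullableId)).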