I am working on a method that takes two DataTables and a list of primary key column names and returns the matching rows. I do not have any other information about the tables.
I have searched the site for a solution to this problem and have found some answers, but none of them is fast enough.
Based on results from Stack Overflow I now have this:
    var matches =
        (from rowA in tableA.AsEnumerable()
         from rowB in tableB.AsEnumerable()
         where primaryKeyColumnNames.All(column => rowA[column].ToString() == rowB[column].ToString())
         select new { rowA, rowB });
The problem is that this is really slow: it takes 4 minutes for two tables of 8,000 rows each. Before I came to Stack Overflow I was iterating through the rows and columns myself, and that took 2 minutes, so this is actually slower than what I had. 2 to 4 minutes doesn't seem so bad until I hit a table with 350,000 rows, where it takes days. I need to find a better solution.
Can anyone think of a way to make this faster?
Edit: Per a suggestion from tinstaafl, this is now my code:
    var matches = tableA.Rows.Cast<DataRow>().Select(rowA => new
        {
            rowA,
            rowB = tableB.Rows.Find(rowA.ItemArray.Where((x, y) =>
                primaryKeyColumnNames.Contains(tableA.Columns[y].ColumnName,
                    StringComparer.InvariantCultureIgnoreCase)).ToArray())
        })
        .Where(x => x.rowB != null);
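For anyone who wants to reproduce this, here is a minimal self-contained sketch of how I understand the Find approach. As far as I know, Rows.Find only works when tableB.PrimaryKey is set, and the key values have to be passed in the same order as the key columns. The class name KeyLookupSketch and the method FindMatches are made up for illustration, and the sketch assumes the key columns are unique and non-null in tableB and have the same data types in both tables.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class KeyLookupSketch
{
    // Hypothetical helper: pairs each row of tableA with the tableB row
    // that has the same key values, using tableB's primary key lookup.
    public static List<(DataRow RowA, DataRow RowB)> FindMatches(
        DataTable tableA, DataTable tableB, IList<string> primaryKeyColumnNames)
    {
        // Rows.Find throws MissingPrimaryKeyException unless PrimaryKey is set.
        // Assumption: the key columns are unique and non-null in tableB,
        // otherwise this assignment itself fails.
        tableB.PrimaryKey = primaryKeyColumnNames
            .Select(name => tableB.Columns[name])
            .ToArray();

        var matches = new List<(DataRow RowA, DataRow RowB)>();
        foreach (DataRow rowA in tableA.Rows)
        {
            // Build the key in the order of primaryKeyColumnNames, which is
            // also the order of tableB.PrimaryKey set above.
            object[] key = primaryKeyColumnNames
                .Select(name => rowA[name])
                .ToArray();

            DataRow rowB = tableB.Rows.Find(key);
            if (rowB != null)
            {
                matches.Add((rowA, rowB));
            }
        }
        return matches;
    }
}
```

Reading the key values by column name, rather than filtering ItemArray by position, keeps the key order independent of the column order in tableA.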