
I know how to do this in SQL, but in C# I cannot figure out how to compare two DataTables.

Let's say:

1st datatable:

Name  |  Balance | Description
Smith |   1200   | Smith owes 600
Jordan|   4000   | Hi Jordan
Brooks|   5000   | I like my cat
Navaro|   6000   | description here
Gates |   9010   | omg

2nd datatable:

Name  |  Balance | Description
Smith |   1600   | Smith owes 600
Jordan|   4200   | I'M JORDAN
Clay  |   9000   | Test description
Brooks|   5000   | I like my cat

I want to dump the results of the comparison into a simple HTML table.

So the result should look like this:
(image: expected comparison result)

Basically what I need is:

  1. Show columns that are different and show data

  2. Do not show record if all columns are identical

  3. Show records that only exist in first datatable (just names)

  4. Show records that only exist in second datatable (just names)

In SQL, you could do something like a merge, then a pivot.

But in C#, as far as I can tell, I can use Except or Intersect, but each returns a single DataTable. Are there any formatting options for the Except/Intersect functions?

I'm looking for advice on the best way to achieve this. There are about 100 columns in each DataTable, and all records should be matched by name.

user194076

1 Answer

Here is the code you need to have in your .cs file...

(I created the two empty classes below only to avoid writing Dictionary<object, Dictionary<string, Tuple<object, object>>> throughout the code; you can replace them with that type if you prefer.)

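// Needs: using System; using System.Collections.Generic; using System.Data; using System.Linq;

// Maps each record's key (Name) to that record's column differences.
protected class Differences : Dictionary<object, RowDifferences>
{
}

// Maps a column name to its (table1 value, table2 value) pair.
protected class RowDifferences : Dictionary<string, Tuple<object, object>>
{
}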

protected Differences GetDifferences(DataTable table1,
                                     DataTable table2,
                                     out IEnumerable<object> onlyIn1,
                                     out IEnumerable<object> onlyIn2)
{
    var arr1 = new DataRow[table1.Rows.Count];
    var arr2 = new DataRow[table2.Rows.Count];

    table1.Rows.CopyTo(arr1, 0);
    table2.Rows.CopyTo(arr2, 0);

    // The first column (Name) is the key. Cells come back boxed as object, so
    // compare them with Equals(); == and != would only compare references.
    onlyIn1 = arr1.Where(x1 => arr2.All(x2 => !Equals(x1[0], x2[0]))).Select(dr => dr[0]);
    onlyIn2 = arr2.Where(x1 => arr1.All(x2 => !Equals(x1[0], x2[0]))).Select(dr => dr[0]);

    var differences = new Differences();

    foreach (var x1 in arr1)
    {
        foreach (var x2 in arr2)
        {
            if (Equals(x1[0], x2[0]))
            {
                var rowDifferences = new RowDifferences();

                // Column 0 is the key, so start comparing at column 1.
                for (var i = 1; i < table1.Columns.Count; i++)
                {
                    if (!Equals(x1[i], x2[i]))
                    {
                        rowDifferences.Add(table1.Columns[i].ColumnName,
                                           new Tuple<object, object>(x1[i], x2[i]));
                    }
                }

                // Skip records whose columns are all identical.
                if (rowDifferences.Count > 0)
                {
                    differences.Add(x1[0], rowDifferences);
                }
            }
        }
    }

    return differences;
}

protected void GenerateTables(out DataTable table1, out DataTable table2)
{
    table1 = new DataTable();
    table2 = new DataTable();

    table1.Columns.Add("Name");
    table1.Columns.Add("Balance");
    table1.Columns.Add("Description");

    table2.Columns.Add("Name");
    table2.Columns.Add("Balance");
    table2.Columns.Add("Description");

    table1.Rows.Add("Smith", 1200, "Smith owes 600");
    table1.Rows.Add("Jordan", 4000, "Hi Jordan");
    table1.Rows.Add("Brooks", 5000, "I like my cat");
    table1.Rows.Add("Navaro", 6000, "description here");
    table1.Rows.Add("Gates", 9010, "omg");

    table2.Rows.Add("Smith", 1600, "Smith owes 600");
    table2.Rows.Add("Jordan", 4200, "I'M JORDAN");
    table2.Rows.Add("Clay", 9000, "Test description");
    table2.Rows.Add("Brooks", 5000, "I like my cat");
}

And here is an example of how to build the table out of it in the .aspx file:

<%
    DataTable table1, table2;
    GenerateTables(out table1, out table2);

    IEnumerable<object> onlyIn1, onlyIn2;
    var differences = GetDifferences(table1, table2, out onlyIn1, out onlyIn2);
%>

<table>
    <thead>
        <tr>
            <th>Name</th> 
            <th>RecordName</th> 
            <th>1st Datatable</th> 
            <th>2nd Datatable</th> 
        </tr>
    </thead>
    <tbody>
        <%
            foreach (var difference in differences)
            {
        %>
        <tr>
            <td><%=difference.Key%></td>
        </tr>
        <%
                foreach (var rowDifferences in difference.Value)
                {
        %>
        <tr>
            <td></td>
            <td><%=rowDifferences.Key%></td>
            <td><%=rowDifferences.Value.Item1%></td>
            <td><%=rowDifferences.Value.Item2%></td>
        </tr>
        <%
                }
            }
        %>
        <tr>
            <td>Only 1st datatable</td>
        </tr>
        <%
            foreach (var name in onlyIn1)
            {
        %>
        <tr>
            <td><%=name%></td>
        </tr>
        <%
            }
        %>
        <tr>
            <td>Only 2nd datatable</td>
        </tr>
        <%
            foreach (var name in onlyIn2)
            {
        %>
        <tr>
            <td><%=name%></td>
        </tr>
        <%
            }
        %>
    </tbody>
</table>
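
If you would rather build the markup in code (for example outside a WebForms page), a minimal sketch along the same lines could look like the following. DiffHtml, Render, AppendNames and Encode are names made up for this illustration, and the plain Dictionary type stands in for the Differences/RowDifferences classes. Values are passed through WebUtility.HtmlEncode, which the inline <%= %> expressions above do not do.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class DiffHtml
{
    // Hypothetical helper: renders the comparison result as an HTML table string.
    public static string Render(
        Dictionary<object, Dictionary<string, Tuple<object, object>>> differences,
        IEnumerable<object> onlyIn1,
        IEnumerable<object> onlyIn2)
    {
        var sb = new StringBuilder();
        sb.AppendLine("<table>");
        sb.AppendLine("<tr><th>Name</th><th>RecordName</th>"
                    + "<th>1st Datatable</th><th>2nd Datatable</th></tr>");

        foreach (var difference in differences)
        {
            // One heading row per record, then one row per differing column.
            sb.AppendLine("<tr><td colspan=\"4\">" + Encode(difference.Key) + "</td></tr>");
            foreach (var column in difference.Value)
            {
                sb.AppendLine("<tr><td></td><td>" + Encode(column.Key) + "</td><td>"
                            + Encode(column.Value.Item1) + "</td><td>"
                            + Encode(column.Value.Item2) + "</td></tr>");
            }
        }

        AppendNames(sb, "Only 1st datatable", onlyIn1);
        AppendNames(sb, "Only 2nd datatable", onlyIn2);
        sb.AppendLine("</table>");
        return sb.ToString();
    }

    private static void AppendNames(StringBuilder sb, string heading, IEnumerable<object> names)
    {
        sb.AppendLine("<tr><td colspan=\"4\">" + Encode(heading) + "</td></tr>");
        foreach (var name in names)
        {
            sb.AppendLine("<tr><td colspan=\"4\">" + Encode(name) + "</td></tr>");
        }
    }

    // HTML-encodes a cell value so characters such as < and & cannot break the markup.
    private static string Encode(object value)
    {
        return WebUtility.HtmlEncode(Convert.ToString(value));
    }
}
```

The returned string can then be written to the response or assigned to a Literal control, whichever fits your page.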

Styling the table as you wish shouldn't be hard from here on.

The main thing left for you is to replace GenerateTables with your own querying logic, or even to inline that logic into GetDifferences.

The comparison algorithm can certainly be improved. It is currently O(m * n * k) in the worst case, where m and n are the numbers of rows in table1 and table2 respectively and k is the number of columns. I can already think of ways to improve it considerably, but I'll leave those to you. This should be enough to get you started.
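
For reference, here is a minimal sketch of one possible improvement, not necessarily the best one. It indexes table2 by its first column in a dictionary, so each row of table1 is matched with a single lookup and the cost drops to roughly O((m + n) * k). TableDiff and Compare are hypothetical names, the plain Dictionary type stands in for the Differences/RowDifferences classes, and the sketch assumes the key column holds unique values in each table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class TableDiff
{
    // Hypothetical helper: same idea as GetDifferences, but with a dictionary lookup
    // instead of scanning table2 for every row of table1.
    public static Dictionary<object, Dictionary<string, Tuple<object, object>>> Compare(
        DataTable table1, DataTable table2,
        out List<object> onlyIn1, out List<object> onlyIn2)
    {
        // Index table2 by its first column (assumed unique; duplicates would throw here).
        var byKey = table2.Rows.Cast<DataRow>().ToDictionary(r => r[0]);

        var differences = new Dictionary<object, Dictionary<string, Tuple<object, object>>>();
        onlyIn1 = new List<object>();

        foreach (DataRow row1 in table1.Rows)
        {
            DataRow row2;
            if (!byKey.TryGetValue(row1[0], out row2))
            {
                onlyIn1.Add(row1[0]);
                continue;
            }

            // Matched rows are removed, so whatever remains exists only in table2.
            byKey.Remove(row1[0]);

            var rowDiff = new Dictionary<string, Tuple<object, object>>();
            for (var i = 1; i < table1.Columns.Count; i++)
            {
                // Value comparison; == on object would compare references.
                if (!object.Equals(row1[i], row2[i]))
                {
                    rowDiff.Add(table1.Columns[i].ColumnName, Tuple.Create(row1[i], row2[i]));
                }
            }

            // Records with no differing columns are left out of the result.
            if (rowDiff.Count > 0)
            {
                differences.Add(row1[0], rowDiff);
            }
        }

        onlyIn2 = byKey.Keys.ToList();
        return differences;
    }
}
```

Note that the order of onlyIn2 follows the dictionary's enumeration order, so sort it if the order matters in your output.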

Just note that this algorithm assumes that both tables have the same columns in the same order, with the key (Name) in the first column.
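
If you cannot guarantee that, one option is to match columns by name instead of by position. The following is only a sketch under that assumption; ColumnMatcher, SharedColumns and DiffRows are hypothetical names, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ColumnMatcher
{
    // Hypothetical helper: names of the columns present in both tables, in table1 order.
    public static List<string> SharedColumns(DataTable table1, DataTable table2)
    {
        return table1.Columns.Cast<DataColumn>()
                     .Select(c => c.ColumnName)
                     .Where(name => table2.Columns.Contains(name))
                     .ToList();
    }

    // Compares two rows column by column, looking cells up by name rather than index.
    public static Dictionary<string, Tuple<object, object>> DiffRows(
        DataRow row1, DataRow row2, IEnumerable<string> columns)
    {
        var diff = new Dictionary<string, Tuple<object, object>>();
        foreach (var name in columns)
        {
            // Value comparison; a column typed differently in each table
            // (for example int vs. string) will be reported as different.
            if (!object.Equals(row1[name], row2[name]))
            {
                diff.Add(name, Tuple.Create(row1[name], row2[name]));
            }
        }
        return diff;
    }
}
```

Columns that exist in only one of the tables are simply ignored by this sketch; you may want to report them separately.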

Let me know if there's anything unclear about the solution, and good luck!

SimpleVar