I've been working on a project for a while that parses a list of entries from a CSV file and uses that data to update a database.

For each entry I create a new user instance that I put in a collection. Now I want to iterate that collection and compare the user entry to the user from the database (if it exists). My question is, how can I compare that user (entry) object to the user (db) object, while returning a list with differences?

For example, take the following classes generated from the database:

public class User
{
    public int ID { get; set; }
    public string EmployeeNumber { get; set; }
    public string UserName { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Nullable<int> OfficeID { get; set; }

    public virtual Office Office { get; set; }
}

public class Office
{
    public int ID { get; set; }
    public string Code { get; set; }

    public virtual ICollection<User> Users { get; set; }
}

To save some queries to the database, I only fill the properties that I can retrieve from the CSV file, so the IDs (for example) are not available for the equality check.

Is there a way to compare these objects without defining a rule for each property, and to get back a list of the properties that were modified? I know this question seems similar to some earlier posts. I've read a lot of them, but as I'm rather inexperienced at programming, I'd appreciate some advice.

From what I've read so far, should I be combining 'comparing properties generically' with 'ignoring properties using data annotations' and 'returning a list of CompareResults'?

Leegaert
  • To better understand your problem, if you don't have the ID in your CSV file, what are you using as a unique identifier for your users? i.e. when do you decide to not create a new entry? – Yannick Blondeau Nov 25 '13 at 15:19
  • Oh, I'm sorry, I've left out some properties to flatten the example a little bit. I'll edit the example. The unique identifier is an employee number which is of course necessary to decide whether or not to create a new user in the database. – Leegaert Nov 25 '13 at 18:12
  • Have you seen [Compare two objects and find the differences](http://stackoverflow.com/questions/4951233/compare-two-objects-and-find-the-differences)? – Tim S. Nov 25 '13 at 18:53
  • Yes, I have but there I wondered how to handle reference types like Office? Should I implement the IEquatable interface for those types so that I can use the equals method there as well? – Leegaert Nov 25 '13 at 19:19

1 Answer

There are several ways you can approach this:
Approach #1 is to create separate DTO-style classes for the contents of the CSV files. Though this involves creating new classes with a lot of similar fields, it decouples the CSV file format from your database and lets you change either one later without affecting the other. To implement the comparison, you could create a Comparer class. As long as the classes are almost identical, the comparer can get all the properties from the DTO class and perform the comparison dynamically (e.g. by creating and evaluating a lambda expression that contains a BinaryExpression of type Equal).
Approach #2 avoids the DTOs, but uses attributes to mark the properties that are part of the comparison. You'd need to create a custom attribute that you assign to the properties in question. In the comparer, you inspect all the properties of the class and keep only the ones that are marked with the attribute. To compare the properties themselves, you can use the same approach as in #1. The downside of this approach is that it couples the comparison logic tightly to the data classes. If you needed to implement several different comparisons, you'd clutter the data classes with attributes.
Of course, #1 takes more effort than #2. I understand that this is not what you are looking for, but a separate, strongly-typed comparer class is also an option worth considering.
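To make approach #2 more concrete, here is a minimal sketch of the marker attribute and the property filter. The names IncludeInComparisonAttribute and MarkedProperties are made up for this illustration (they are not framework types), and the User class is a trimmed-down copy of the one in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute: properties carrying it take part in the comparison.
[AttributeUsage(AttributeTargets.Property)]
public sealed class IncludeInComparisonAttribute : Attribute { }

// Trimmed-down version of the User class from the question.
public class User
{
    public int ID { get; set; } // not marked, so it is never compared

    [IncludeInComparison] public string EmployeeNumber { get; set; }
    [IncludeInComparison] public string FirstName { get; set; }
    [IncludeInComparison] public string LastName { get; set; }
}

public static class MarkedProperties
{
    // Returns only the public properties of T that carry [IncludeInComparison].
    public static IEnumerable<PropertyInfo> For<T>()
    {
        return typeof(T).GetProperties()
            .Where(p => Attribute.IsDefined(p, typeof(IncludeInComparisonAttribute)));
    }
}
```

MarkedProperties.For<User>() would then yield EmployeeNumber, FirstName and LastName, and skip ID. The comparer only has to loop over that result instead of over all properties.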
Some more details on a dynamic comparison algorithm: it uses reflection to get the properties that need to be compared (depending on the approach, these are either the properties of the DTO or the marked ones of the data class). Once you have the properties (in the case of DTOs, the properties should have the same name and data type), you can create a LambdaExpression, then compile and evaluate it dynamically. The following lines show an excerpt of a code sample:

using System;
using System.Linq.Expressions;
using System.Reflection;

// Returns true if every public property of the DTO equals the same-named property of the data object.
public static bool AreEqual<TDTO, TDATA>(TDTO dto, TDATA data)
{
    foreach (var prop in typeof(TDTO).GetProperties())
    {
        // Look up the counterpart on the data class by name.
        var dataProp = typeof(TDATA).GetProperty(prop.Name);
        if (dataProp == null)
            throw new InvalidOperationException(string.Format("Property {0} is missing in data class.", prop.Name));
        // Build (dto, data) => dto.Prop == data.Prop, compile it and invoke it.
        var compExpr = GetComparisonExpression(prop, dataProp);
        var del = compExpr.Compile();
        if (!(bool)del.DynamicInvoke(dto, data))
            return false;
    }
    return true;
}

// Builds the lambda (dto, data) => dto.Prop == data.Prop for one pair of properties.
private static LambdaExpression GetComparisonExpression(PropertyInfo dtoProp, PropertyInfo dataProp)
{
    var dtoParam = Expression.Parameter(dtoProp.DeclaringType, "dto");
    var dataParam = Expression.Parameter(dataProp.DeclaringType, "data");
    return Expression.Lambda(
        Expression.MakeBinary(ExpressionType.Equal,
            Expression.MakeMemberAccess(
                dtoParam, dtoProp),
            Expression.MakeMemberAccess(
                dataParam, dataProp)), dtoParam, dataParam);
}

For the full sample, see this link. Please note that this dynamic approach is just a simple implementation that leaves room for improvement (e.g. there is no check that the data types of the two properties match). It also only checks for equality and does not collect the properties that are not equal, but that should be easy to add.
While the dynamic approach is easy to implement, the risk of runtime errors is higher than with a strongly-typed approach.
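Collecting the differences instead of stopping at the first mismatch could look roughly like the sketch below. Note that it swaps the compiled expression for plain PropertyInfo.GetValue plus object.Equals, which is simpler to follow but slower. PropertyDifference, DifferenceFinder and UserDto are hypothetical names introduced for this illustration, and User is a trimmed-down copy of the class in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical result type: one entry per property whose values differ.
public sealed class PropertyDifference
{
    public string PropertyName { get; set; }
    public object DtoValue { get; set; }
    public object DataValue { get; set; }
}

// Hypothetical DTO holding only what the CSV file provides.
public class UserDto
{
    public string EmployeeNumber { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Trimmed-down version of the User class from the question.
public class User
{
    public int ID { get; set; }
    public string EmployeeNumber { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class DifferenceFinder
{
    // Compares every public property of the DTO with the same-named
    // property of the data object and returns the ones that differ.
    public static List<PropertyDifference> GetDifferences<TDto, TData>(TDto dto, TData data)
    {
        var differences = new List<PropertyDifference>();
        foreach (PropertyInfo prop in typeof(TDto).GetProperties())
        {
            PropertyInfo dataProp = typeof(TData).GetProperty(prop.Name);
            if (dataProp == null)
                throw new InvalidOperationException(
                    string.Format("Property {0} is missing in data class.", prop.Name));

            object dtoValue = prop.GetValue(dto, null);
            object dataValue = dataProp.GetValue(data, null);

            // The static object.Equals copes with nulls and then calls the
            // value's own Equals implementation.
            if (!object.Equals(dtoValue, dataValue))
            {
                differences.Add(new PropertyDifference
                {
                    PropertyName = prop.Name,
                    DtoValue = dtoValue,
                    DataValue = dataValue
                });
            }
        }
        return differences;
    }
}
```

Calling DifferenceFinder.GetDifferences(dto, user) returns an empty list when nothing changed. Because the values are compared with Equals, a reference-type property such as Office would only compare by value if that type overrides Equals (for example alongside IEquatable<Office>); otherwise two separately created instances always count as different.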

Markus
  • Thanks for your input, Markus. I started out with separate DTO classes, so I'll take another look at that. Could you elaborate a little more on the Comparer class? You kind of lost me there. – Leegaert Nov 25 '13 at 19:40
  • Thanks for the example. I must admit that the GetComparisonExpression part is above my head. I'll play a little with it, see if I can wrap my head around it and I'll post an update in a while. – Leegaert Nov 26 '13 at 09:20
  • It looks more complex than it is; it assembles an expression that compares the properties of the DTO and the data class. The lambda expression receives a DTO and a data class as parameters and contains a dynamic expression similar to (dto, dataCls) => dto.PropName == dataCls.PropName where PropName is provided by the PropertyInfo. As the property name is different for each property that is compared, it needs to be dynamic. After the lambda expression is compiled at runtime, it is invoked with the DTO and data class as parameter. – Markus Nov 26 '13 at 15:12