5

A discussion has come up at work:

We've got a class that has an IList. Fact is an abstract base class, and there are several concrete subclasses (PopulationFact, GdpFact, etc).

Originally we'd query for a given fact in this fashion, i.e. by type:

.Facts.FirstOrDefault(x => x.Year == 2011 && x is GdpFact)

Now, however, the question's been raised whether we should introduce a FactType enum instead to do

.Facts.FirstOrDefault(x => x.Year == 2011 && x.FactType == FactType.Gdp)

The suggestion has been raised because it is supposedly faster. I'll admit that I've not written any tests to try and discern the difference in performance, but I have two questions:

1) Is 'querying on type' like this inherently bad?
2) Isn't adding a FactType enum just superfluous anyway, given that facts are strongly typed?

UPDATE To clarify, this is LINQ to objects and GdpFact:Fact.

UPDATE 2 We've measured using current typical data (4 facts) and the results are in:

lookup on enum: 0.29660000000000003 milliseconds lookup on type: 0.24530000000000002 milliseconds

So type lookup is faster in this context! I will choose my accepted answer carefully.

Øyvind
  • 1,600
  • 1
  • 14
  • 33
  • 3
    Is it linq to objects or linq to SQL? Or, more generally, are you calling Enumerable.FirstOrDefault or Queryable.FirstOrDefault? Also, note that the two approaches are not equivalent. If you have fact types derived from GdpFact, `x is GdpFact` will be true for instances of these types, but `x.FactType == FactType.Gdp` will likely be false. – phoog Nov 28 '12 at 16:09
  • Well, what type is `Facts`, `IList` where `GdpFact:Fact`? This seems logical but you don't say. – Jodrell Nov 28 '12 at 16:09
  • 4
    `.Facts.OfType().FirstOrDefault(x => x.Year == 2011)` is an option as well. – Austin Salonen Nov 28 '12 at 16:09
  • Whatever anybody asserts it should be backed up by a test – Jodrell Nov 28 '12 at 16:12
  • @phoog this is LINQ to objects. And all subclasses of Fact are sealed so we're not worried about subclasses of subclasses, but I get your point. – Øyvind Nov 28 '12 at 16:14
  • @Jodrell, it's GdpFact:Fact like you assumed. – Øyvind Nov 28 '12 at 16:14
  • Adding to phoog's comment, here is how you would compare types by excluding derived types: `x.GetType() == typeof(GdpFact)`. – Olivier Jacot-Descombes Nov 28 '12 at 16:19
  • I don't see how LINQ has anything to do with this. If you are looking for performance data of the `is` operator compared to an `enum` test, you might want to take a look at this possible duplicate: http://stackoverflow.com/questions/686412/c-sharp-is-operator-performance – Good Night Nerd Pride Nov 28 '12 at 16:26
  • How many facts will be in the list? if it's only a small number then the overhead of the type check will be fairly insignificant, especially compared to the maintenance overhead of adding the FactType enum and keeping it up to date. – Trevor Pilley Nov 28 '12 at 16:26
  • 2
    Have you considered the `.OfType()` method? Not sure how that's implemented and therefore performance implication but it allows for nice declarative code http://msdn.microsoft.com/en-us/library/bb360913.aspx – Tim Croydon Nov 28 '12 at 16:26
  • @Øyvind in that case, I'm inclined to agree with the answers: go with the simpler solution (type checking, not enum). If performance is problematic, use a profiler and go from there. I suspect you'll find performance bottlenecks elsewhere before you find them here. Also, since you don't have to worry about sub-subclasses, you could try Olivier Jacot-Descombes`s suggestion, which is probably faster than `is` because it doesn't have to look at the type's inheritance chain. – phoog Nov 28 '12 at 16:30
  • @TimCroydon - thanks for the .OfType() tip :-) – Øyvind Nov 28 '12 at 16:50
  • @phoog I agree. KISS and all that. – Øyvind Nov 28 '12 at 16:50
  • I conclude that the type appraoch is both faster and simpler, http://stackoverflow.comhttp://stackoverflow.com/a/13611052/659190 – Jodrell Nov 28 '12 at 17:58
  • @jodrell that is a brilliant answer. Thank you! – Øyvind Nov 28 '12 at 18:09

5 Answers5

3

All type of preformance related questions are strictly dependent on concrete application context, so may be answer(s) provided here will be partially right/wrong for your concrete case.

Having this in mind:

Checking enum value should be reasonably faster then checking for type, as in first case, you just check for equality 2 integers (enum values).

But that introduce one more field in the object, that has to be tracked to have correct value (unit test), which you do not need in second case, as CLR cares about correct type initialization.

I think it's better that you profile your ideas against relevant amount of data, that your app usually operates over, and will come out with correct idea for you.

Tigran
  • 61,654
  • 8
  • 86
  • 123
  • thanks. Good answer. Context matters, no doubt. In our case there are only a handful of Facts to deal with at a time (no more than 10-20). – Øyvind Nov 28 '12 at 16:17
  • at this point, I think, there would be no relevant(notable) difference between these 2 cases. By the way, measure. – Tigran Nov 28 '12 at 16:20
3

I've done a test, my results for 1000000 iterations are approximately

ByCast 166ms
ByType 84ms
ByEnum 98ms

So the enum is in fact superfluous and slower but, not by much. This should not be too suprising, the type system is fundamental to the .Net Framework.

Test code transcribed below, apologies for errata

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    private enum TypeOfFact
    {
        Gdp,
        Other
    }

    private abstract class Fact
    {
        public virtual int Year { get; set; }
        public abstract TypeOfFact FactType { get; }
    }

    private class GdpFact : Fact
    {
        public override TypeOfFact FactType
        {
            get { return TypeOfFact.Gdp; }
        }
    }

    private class OtherFact : Fact
    {
        public override TypeOfFact FactType
        {
            get { return TypeOfFact.Other; }
        }
    }

    static void Main()
    {
        Ilist<Fact> facts = new List<Fact>
            {
                new GdpFact { Year = 2010 },
                new OtherFact { Year = 2010 },
                new GdpFact { Year = 2009 },
                new OtherFact { Year = 2009 },
                new GdpFact { Year = 2011 },
                new OtherFact { Year = 2011 },
            };

        const int interations = 1000000;

        var funcs = new List<Func<IList<Fact>, Fact>>
            {
                ByList,
                ByType,
                ByEnum
            };

        // Warmup
        foreach (var func in funcs)
        {
           Measure(5, func, facts);
        }

        // Results
        foreach (var result in funcs.Select(f => new
            {
                Description = f.Method.Name,
                Ms = Measure(iterations, f, facts)
            }))
        {
            Console.WriteLine(
                "{0} time = {1}ms",
                result.Description,
                result.Ms);
        }
    }

    private static long Measure(
        int iterations,
        Func<IList<Fact>, Fact> func,
        IList<Fact> facts)
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        for (var i = 0; i < iterations; i++)
        {
            func.Invoke(facts);
        }

        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    private static Fact ByType(IList<Fact> facts)
    {
        return facts.FirstOrDefault(f =>
            f.Year == 2011 && f is GdpFact);
    }

    private static Fact ByEnum(IList<Fact> facts)
    {
        return facts.FirstOrDefault(f =>
            f.Year == 2011 && f.FactType == TypeOfFact.Gdp);
    }

    private static Fact ByCast(IList<Fact> facts)
    {
        return facts.OfType<GdpFact>()
            .FirstOrDefault(f => f.Year == 2011);
    }
}

This question seems relevant.

Community
  • 1
  • 1
Jodrell
  • 34,946
  • 5
  • 87
  • 124
2

Is this maybe a solution looking for a problem?

I think having both concrete subtypes and an enum in the base type could potentially obfuscate your design. You could imagine someone coming along later and writing a new concrete class but not realising they needed to add to the enum as well...

Unless you find you have a specific problem to do with performance, I'd be tempted to prioritise clarity instead. Therefore if you need different concrete classes (and I'm assuming you do since that's how you've coded it to start with) then I'd stick with your types rather than move to an enum.

PeteH
  • 2,394
  • 1
  • 20
  • 29
2

I think your original approach is fine. the 'is' keyword is provided for this purpose. MSDN does not discourage the use of 'is'. Using an enum seems to be over-engineering. We should try and keep code simple. Fewer the lines of code the better it is in most situations.

Bull
  • 701
  • 1
  • 6
  • 13
  • thanks. I think it's overkill, too - and suspected the "community" would feel that way as well... posting on SO was a way of building 'ammunition' :-) – Øyvind Nov 28 '12 at 16:36
  • I know what you mean :) Good luck for your discussions :) – Bull Nov 29 '12 at 14:53
2

It's possible that checking an enum value will be faster than a runtime type-check, but...

  1. It's a micro-optimisation -- you're unlikely to notice much perf difference in real-world scenarios.
  2. It makes things more complicated, and more complicated means more likely to break.
    For example, what's to stop you, or one of your colleagues, accidentally doing something like this?

    public class PopulationFact : Fact
    {
        public FactType FactType = FactType.GdpFact;  // should be PopulationFact
    }
    

I'd stick with the type-check. There's actually a built-in LINQ method that'll do it for you:

.Facts.FirstOrDefault(x => x.Year == 2011).OfType<GdpFact>()
LukeH
  • 263,068
  • 57
  • 365
  • 409