I'm trying to read a series of values from a binary file, but I won't know what the value types are until runtime.

Simplified example

I have a binary file that is 10 bytes long. The bytes represent, in order, an int, a float, and a short. I don't know this at compile-time, but I do know this at runtime, with an array like this:

        Type[] types = new Type[3];
        types[0] = typeof(int);
        types[1] = typeof(float);
        types[2] = typeof(short);

Question

So now that I have this list, is there a way I can use this information to quickly read in values from a file? The only way I can think of is using a large if block, but it looks really ugly:

        for (int i = 0; i < types.Length; i++)
        {
            if (types[i] == typeof(int))
            {
                int val = binaryfile.ReadInt32();
                //... etc ...
            }
            else if (types[i] == typeof(float))
            {
                float val = binaryfile.ReadSingle();
                //... etc ...
            }
            // else if ... etc. for each remaining type
        }

But this is ugly and cumbersome. I'm wondering if I can use the Type information in the types array to somehow "automate" this.

What I've tried

One idea I had was to read the raw bytes into an array, then perform the conversion on the byte array. So let's say my array looks like this:

        byte[] buf = new byte[10] {
            0x40, 0xE2, 0x01, 0x00,
            0x79, 0xE9, 0xF6, 0x42,
            0x39, 0x30 };

This contains the int, float, and short values 123456, 123.456, and 12345, respectively. Now I can do the following:

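        // Requires an unsafe context (an unsafe method or block, compiled with /unsafe).
        fixed (byte* bp = &buf[0])
        {
            // Reinterpret the first four bytes of the buffer as an int.
            int* ip = (int*)bp;
            Console.WriteLine("int ptr: {0}", *ip);
        }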

This appears to work well, but there are two problems:

  1. I don't know how to marshal *ip back to the managed domain.
  2. I still can't use my type list, as follows:

        fixed (byte* bp = &buf[0])
        {
            (types[0])* ip = ((types[0])*)bp;      // both errors here
            Console.WriteLine("int ptr: {0}", *ip);
        }
    

This produces two compile-time errors on the line indicated:

Error   1   Invalid expression term ')'
Error   2   ) expected

That's all I've thought of to try so far.

I hope someone can help. I feel like I'm missing something simple that would make my life a lot easier.

Update

I've tried Peter Duniho's suggestion and it seems to work quite well, although there is a small performance hit when compared to a large if block.

Here are some results from a ~100 MB file (all times are in ms):

Peter's method: 2025, 2003, 1954, 1979, 1958

if block: 1531, 1488, 1486, 1489

The difference is roughly 500 ms per 100 MB, which makes Peter's method about 30% slower than the if block. That is not much for a file this size, but I plan to work with much, much larger files (in the GB range), where it adds up, so I'm going to stick with the ugly if block until I find something as fast.
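One thing that might narrow the gap, assuming the same type layout repeats for every record in the file: do the dictionary lookup once per field up front, then loop over a plain array of delegates. This is only a sketch and has not been measured against the timings above; `RecordReader`, `Readers`, `Resolve`, and `ReadRecord` are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class RecordReader
{
    // Hypothetical lookup table: one reader delegate per supported type.
    // The values are still boxed here, as in the dictionary-based approach.
    static readonly Dictionary<Type, Func<BinaryReader, object>> Readers =
        new Dictionary<Type, Func<BinaryReader, object>>
        {
            { typeof(int),   r => r.ReadInt32() },
            { typeof(float), r => r.ReadSingle() },
            { typeof(short), r => r.ReadInt16() },
        };

    // Resolve the layout once; this is the only place the dictionary is consulted.
    // Throws KeyNotFoundException if a type has no registered reader.
    public static Func<BinaryReader, object>[] Resolve(Type[] types)
    {
        var resolved = new Func<BinaryReader, object>[types.Length];
        for (int i = 0; i < types.Length; i++)
            resolved[i] = Readers[types[i]];
        return resolved;
    }

    // Read one record by invoking the pre-resolved delegates in order,
    // with no per-value type checks or dictionary lookups.
    public static object[] ReadRecord(BinaryReader reader, Func<BinaryReader, object>[] layout)
    {
        var values = new object[layout.Length];
        for (int i = 0; i < layout.Length; i++)
            values[i] = layout[i](reader);
        return values;
    }
}
```

With the 10-byte buffer from above wrapped in a `MemoryStream` and a `BinaryReader`, calling `ReadRecord(reader, Resolve(types))` should yield the boxed values 123456, 123.456, and 12345. `Resolve` would be called once per file, `ReadRecord` once per record.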

Reticulated Spline

1 Answer

I'm not 100% sure I understand which part of this problem you're actually trying to solve. But based on what I think you're asking, this is how I'd do it:

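    using System;
    using System.Collections.Generic;

    class Program
    {
        // Maps each supported type to a converter. A converter decodes one value
        // from the buffer at the given offset and returns the boxed value together
        // with the number of bytes it consumed.
        // Note: BitConverter uses the byte order of the machine it runs on.
        static readonly Dictionary<Type, Func<byte[], int, Tuple<object, int>>> _converters =
            new Dictionary<Type, Func<byte[], int, Tuple<object, int>>>
            {
                { typeof(int), (rgb, ib) =>
                    Tuple.Create((object)BitConverter.ToInt32(rgb, ib), sizeof(int)) },
                { typeof(float), (rgb, ib) =>
                    Tuple.Create((object)BitConverter.ToSingle(rgb, ib), sizeof(float)) },
                { typeof(short), (rgb, ib) =>
                    Tuple.Create((object)BitConverter.ToInt16(rgb, ib), sizeof(short)) },
            };

        static void Main(string[] args)
        {
            // The runtime type list and the raw bytes from the question.
            Type[] typeMap = { typeof(int), typeof(float), typeof(short) };
            byte[] inputBuffer =
                { 0x40, 0xE2, 0x01, 0x00, 0x79, 0xE9, 0xF6, 0x42, 0x39, 0x30 };
            int ib = 0, objectIndex = 0;

            // Walk the buffer, looking up the converter for each expected type.
            while (ib < inputBuffer.Length)
            {
                Tuple<object, int> current =
                    _converters[typeMap[objectIndex++]](inputBuffer, ib);
                Console.WriteLine("Value: " + current.Item1);
                ib += current.Item2;   // advance by the size of the value just read
            }
        }
    }
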
Peter Duniho
  • Wow, that certainly is... interesting, to say the least! Took me a bit of working through it to understand exactly what's going on. How much overhead would this incur, though? Dictionary lookup, lambda functions, boxing and unboxing, seems like it would be a computationally expensive process. I guess I'll have to try it for myself to see how it works compared to just using an `if` block. – Reticulated Spline Nov 07 '14 at 03:55
  • Dictionary lookups are _fast_. The anonymous methods incur a type-initialization overhead, but are efficient otherwise. The boxing is there solely for demonstration purposes; if you didn't have boxing in your original implementation, you should be able to apply the above technique but without the boxing (i.e. invoke whatever logic you were doing before in the conditional blocks). That said, even if you had to deal with boxing, that isn't likely to be an issue at all in typical I/O scenarios (file, network, etc.). – Peter Duniho Nov 07 '14 at 04:00
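
The variant without boxing mentioned in the last comment could look roughly like the following. It is a sketch under assumptions, not code from the answer: the input is assumed to come from a `BinaryReader`, and `ValueSink`, `TypedDispatch`, `Handlers`, and `ReadAll` are hypothetical names, with `ValueSink` standing in for whatever logic originally lived in the conditional blocks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical consumer with one strongly typed method per value type.
class ValueSink
{
    public long IntSum;
    public double FloatSum;
    public int ShortCount;

    public void OnInt(int v) { IntSum += v; }
    public void OnFloat(float v) { FloatSum += v; }
    public void OnShort(short v) { ShortCount++; }
}

static class TypedDispatch
{
    // Each handler reads one value and hands it to the sink as its own type,
    // so nothing is converted to object and no Tuple is allocated.
    static readonly Dictionary<Type, Action<BinaryReader, ValueSink>> Handlers =
        new Dictionary<Type, Action<BinaryReader, ValueSink>>
        {
            { typeof(int),   (r, s) => s.OnInt(r.ReadInt32()) },
            { typeof(float), (r, s) => s.OnFloat(r.ReadSingle()) },
            { typeof(short), (r, s) => s.OnShort(r.ReadInt16()) },
        };

    // Dispatch on the runtime type list, one handler call per expected value.
    public static void ReadAll(BinaryReader reader, Type[] types, ValueSink sink)
    {
        foreach (Type t in types)
            Handlers[t](reader, sink);
    }
}
```

The dictionary lookup and delegate call remain, so this removes only the boxing and the `Tuple` allocation from the per-value cost. Whether that matters for a given workload would have to be measured.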