
I have asked this question over the last 2 years and am still looking for a good way of doing this. What I am doing is as follows:

I have a WPF/C# application which has been developed over the last 3 years. It receives a real-time stream of bytes over a UDP port. Each record is 1000 bytes, and I am getting 100 of these byte records per second. I am reading the data and processing it for display in various formats. These logical records are sub-commutated.

The first 300 bytes are the same in each logical record and contain a mixture of Byte, Int16, UInt16, Int32 and UInt32 values. About 70% of these values are eventually multiplied by a least-significant-bit (LSB) scale factor to create a Double. These parameters are always the same. The second 300 bytes are another mixture of Byte, Int16, UInt16, Int32 and UInt32 values. Again, about 70% of these values are multiplied by an LSB to create a Double. These parameters are again always the same. The last segment is 400 bytes and is sub-commutated. This means that the last part of the record contains 1 of 20 different logical record formats. I call them Type01...Type20 data. There is an identifier byte which tells me which one it is. These again contain Byte, Int and UInt data values which need to be converted.

I am currently using hundreds of function calls to process this data. Each function takes the 1000-byte array as a parameter and an offset (index) into the byte array to where the parameter starts. It then uses a BitConverter.ToXXX call to convert the bytes to the correct data type and, if necessary, multiplies by an LSB to create the final data value, which it returns.

I am trying to streamline this processing because the data streams are changing based on the source. For instance, one of the new data sources (feeds) changes about 20 parameters in the first 300 bytes, about 24 parameters in the second 300 bytes, and several in the last sub-commutated 400-byte records.

I would like to build a data dictionary where each entry contains the logical record number (type of data), the offset into the record, the LSB of the data, the underlying data type to convert from (Int16, UInt32, etc.) and finally the output type (Int32, Double, etc.). Maybe also include the BitConverter function to use and "cast it dynamically"?

This appears to be an exercise in using Template Classes and possibly Delegates, but I do not know how to do this. I would appreciate some example code.

The data is also recorded, so playback may run at 2x, 4x, 8x or 16x speed. Now, before someone comments on how anyone can look at thousands of parameters at those speeds, it is not as hard as one may think. Some types of data, such as a green background for good and red for bad, or plotting map positions (LAT/LON) over time, lend themselves very well to fast playback for finding interesting events. So it is possible.

Thanks in advance for any help.

I am not sure others have an idea of what I am trying to do, so I thought I would post a small segment of source code to see if anyone can improve on it.

Like I said above, the data comes in as byte streams. Once it is read into a Byte array, it looks like the following:

Byte[] InputBuffer = { 0x01, 0x00, 0x4F, 0xEB, 0x06, 0x00, 0x17, 0x00,
                       0x00, 0x00, ...    };

The first 2 bytes are a ushort which equals 1. This is the record type for this particular record. This number can range from 1 to 20.

The next 4 bytes are a uint which equals 453,455. This value is the time of day in tenths of a second, which in this case works out to 12:35:45.5. To arrive at this I make the following call to the following subroutine:

labelTimeDisplay.Content = TimeField(InputBuffer, 2, .1).ToString();

public Double TimeField(Byte[] InputBuffer, Int32 Offset, Double lsb)
{
   // Read the 4-byte unsigned value at Offset and scale it by the LSB
   return BitConverter.ToUInt32(InputBuffer, Offset) * lsb;
}

The next data field is the software version, in this case 23.

labelSoftwareVersion.Content = SoftwareVersion(InputBuffer, 6).ToString();

public UInt16 SoftwareVersion(Byte[] InputBuffer, Int32 Offset)
{
   return BitConverter.ToUInt16(InputBuffer, Offset);
}

The next data field is the System Status Word, another UInt16.

Built-In-Test status bits are passed to other routines if any of the 16 bits are set to logic 1.

UInt16 CheckStatus = SystemStatus(InputBuffer, 8);

public UInt16 SystemStatus(Byte[] InputBuffer, Int32 Offset)
{
   return BitConverter.ToUInt16(InputBuffer, Offset);
}
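
For reference, the downstream routines just test individual bits of that status word, along these lines (the bit position and handler name here are only placeholders):

const Int32 PowerFaultBit = 3;                    // placeholder bit position

if ((CheckStatus & (1 << PowerFaultBit)) != 0)    // bit set to logic 1?
{
   HandleBuiltInTestFault(PowerFaultBit);         // placeholder handler
}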

I literally have over a thousand individual subroutines to process the data stored in the array of bytes. The array of bytes is always a fixed length of 1000 bytes. The first 6 bytes are always the same: identifier and time. After that, the parameters are different for every frame.

I have some major modifications coming to the software which will redefine many of the parameters for the next software version. I still have to support the old software versions, so the software just gets more complicated. My goal is to find a way to process the data using a dictionary lookup. That way I can just create the dictionary and read it to know how to process the data. Maybe use loops to load the data into a collection and then bind it to the display fields.

Something like this:

public class ParameterDefinition
{
    public String ParameterNumber;
    public String ParameterName;
    public Int32  Offset;
    public Double Lsb;
    public Type   ReturnDataType;
    public Type   BaseDataType;

    public ParameterDefinition(String number, String name, Int32 offset,
                               Double lsb, Type returnType, Type baseType)
    {
        ParameterNumber = number; ParameterName = name; Offset = offset;
        Lsb = lsb; ReturnDataType = returnType; BaseDataType = baseType;
    }
}

private ParameterDefinition[] parms = new ParameterDefinition[]
{
   new ParameterDefinition ( "0000","RecordID",  0, 0.0,  typeof(UInt16), typeof(UInt16)), 
   new ParameterDefinition ( "0001",    "Time",  2, 0.1,  typeof(Double), typeof(UInt32)), 
   new ParameterDefinition ( "0002",   "SW ID",  6, 0.0,  typeof(UInt16), typeof(UInt16)),
   new ParameterDefinition ( "0003",  "Status",  8, 0.0,  typeof(UInt16), typeof(UInt16)),  
   // Lots more parameters
};

My bottom line problem is getting the parameter definitions to cast or select the right functions. I cannot find a way to link the "dictionary" to the actual data outputs.

Thanks for any help

Pantera
  • What are these "Template Classes" you speak of? – user2864740 Jul 22 '15 at 20:53
  • Your question is too broad/generic. Profile your code, find the most costly parts, and ask about them (with a simple, reproducible example). – EZI Jul 22 '15 at 20:56
  • What does the data represent? – Contango Jul 22 '15 at 20:57
  • @EZI: Actually I think the question can be answered well. He's trying to manually deserialize a byte stream, which is a problem that Google Protocol Buffers solves very well. Bypasses most of the performance and complexity concerns that he has today. – Eric J. Jul 22 '15 at 21:09
  • @EricJ. I see your answer. No need to repeat it here again. If I thought it were correct, I would have voted it up. – EZI Jul 22 '15 at 21:30
  • @EZI: I had no idea whether you were still watching this question. If you feel my answer is not correct, I would like to hear why. – Eric J. Jul 22 '15 at 21:47
  • @EricJ. Maybe I worded it wrong. Your answer would work, but I am not sure it is the best way to do it. Hard to say without any sample data/code provided by OP. So I just consider your answer as a suggestion instead of a fact. – EZI Jul 22 '15 at 21:56
  • EZI, I disagree that this is too broad or generic. Converting a byte array to a series of other types seems straightforward. The represented data is irrelevant, as Contango mentions. – Pantera Jul 23 '15 at 02:36
  • In C I would just create a struct type and cast the byte array. I see where others have done something similar using Marshal.PtrToStructure, but it is only part of the answer. – Pantera Jul 23 '15 at 02:43

2 Answers


Using a data dictionary to represent the data structure is fine, as long as you don't walk the dictionary for each individual record. Instead, use Reflection.Emit or expression trees to build a delegate that you can call many, many times.
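
A minimal sketch of the expression-tree idea, assuming each dictionary entry carries an offset, an LSB and the underlying BitConverter type; the ExtractorBuilder name and the "lsb == 0.0 means no scaling" convention are purely illustrative:

using System;
using System.Linq.Expressions;

public static class ExtractorBuilder
{
    // Compile a Func<Byte[], Double> once per parameter definition,
    // then reuse it for every incoming 1000-byte record.
    public static Func<Byte[], Double> Build(Int32 offset, Double lsb, Type baseType)
    {
        // Pick the matching BitConverter method, e.g. BitConverter.ToUInt16.
        // (Byte fields would need a separate path; BitConverter has no ToByte.)
        var toMethod = typeof(BitConverter).GetMethod(
            "To" + baseType.Name, new[] { typeof(Byte[]), typeof(Int32) });

        var buffer = Expression.Parameter(typeof(Byte[]), "buffer");

        // BitConverter.ToXxx(buffer, offset), converted to Double
        Expression value = Expression.Convert(
            Expression.Call(toMethod, buffer, Expression.Constant(offset)),
            typeof(Double));

        // Apply the LSB scaling only when one is defined
        if (lsb != 0.0)
            value = Expression.Multiply(value, Expression.Constant(lsb));

        return Expression.Lambda<Func<Byte[], Double>>(value, buffer).Compile();
    }
}

Each compiled delegate is built once from the dictionary and then called per record:

var timeField = ExtractorBuilder.Build(2, 0.1, typeof(UInt32));
Double seconds = timeField(InputBuffer);   // 45345.5 for the sample buffer above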

Ben Voigt

It sounds like you are manually deserializing a byte stream, where the bytes represent various data types. That problem has been solved before.

Try defining a class that represents the first 600 bytes and deserialize it using the Protocol Buffer Serializer (that implementation is by SO's own Marc Gravell, and there is a different implementation by top SO contributor Jon Skeet).

Protocol buffers are a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols and data storage. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams, using a variety of languages. You can even update your data structure without breaking deployed programs that are compiled against the "old" format.

Source, as well as a 3rd implementation I have not personally used.

For the last 400 bytes, create appropriate class definitions for the appropriate formats, and again use protocol buffers to deserialize into the appropriate class.

For the final touch-ups (e.g. converting values to doubles) you can either post-process the classes, or just have a getter that returns the appropriate final number.
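
As an illustration only (assuming the stream could be encoded as protocol buffers; the class and member names here are invented), a protobuf-net contract with a raw field plus a getter for the scaled value might look like this:

using ProtoBuf;

// Hypothetical contract for the fixed header fields of a record
[ProtoContract]
public class RecordHeader
{
    [ProtoMember(1)]
    public ushort RecordId { get; set; }

    // Raw time-of-day in tenths of a second, as received
    [ProtoMember(2)]
    public uint RawTime { get; set; }

    [ProtoMember(3)]
    public ushort SoftwareVersion { get; set; }

    // Getter applies the 0.1 LSB so consumers see seconds directly
    public double TimeSeconds
    {
        get { return RawTime * 0.1; }
    }
}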

Eric J.
  • I have downloaded the Protocol Buffer Serializer and am looking at it. – Pantera Jul 23 '15 at 02:33
  • Protocol Buffers are a good approach if you're allowed to change the on-wire encoding. If you have to interoperate with an existing protocol, they can't help. – Ben Voigt Jul 23 '15 at 14:35
  • Eric, I spent the last 3 hours looking at the Protocol Buffer Serializer. I can see where it handles all types of data streams, but I think it is overkill. Much of the code is currently a mystery to me, but I can see that 90% of the data stream inputs and outputs are irrelevant. I only need to process an input stream of bytes containing just 5 data types embedded in the stream: byte, short, ushort, int and uint. I believe that short and ushort are not even among the addressed data types. – Pantera Jul 23 '15 at 23:55
  • I will look at Ben Voigt's suggestion wrt Reflection.Emit. – Pantera Jul 23 '15 at 23:58