
I have a program that, in outline, processes binary data from a file.

The code outline is as follows:

QFile fileIn ("the_file");
fileIn.open(QIODevice::ReadOnly);

The file has a mix of binary and text data. The file contents are read using QDataStream:

QDataStream stream(&fileIn);
stream.setByteOrder(QDataStream::LittleEndian);
stream.setVersion(QDataStream::Qt_5_0);

I can read the data from the QDataStream into various data types, e.g.:

QString the_value;  // String
stream >> the_value;
qint32 the_num;
stream >> the_num;

Nice and easy. Overall, I read the file data byte by byte until I hit certain values that represent delimiters, e.g. 0x68 0x48. At that point I read the next couple of bytes, which tell me what type of data comes next (floats, strings, ints, etc.), and extract it as appropriate.

So the data is processed (in outline) like this:

while ( ! stream.atEnd() )
{
    qint8 byte1 = getInt8(stream);
    qint8 byte2 = getInt8(stream);
    if ( byte1 == 0x68 && byte2 == 0x48 )   
    {
        qint8 byte3 = getInt8(stream);
        qint8 byte4 = getInt8(stream);
        if ( byte3 == 0x1 && byte4 == 0x7 )
        {
            do_this(stream);
        } 
        else if ( byte3 == 0x2 && byte4 == 0x8 )
        {
            do_that(stream);
        }
    }
}

Some of this embedded data may be compressed, so we use:

long dSize = 1024;
QByteArray dS = qUncompress( stream.device()->read(dSize));

QBuffer buffer;
buffer.setData(dS);

if (!buffer.open(QBuffer::ReadOnly)) {
    qFatal("Buffer could not be opened. Something is very wrong!");
}

QDataStream stream2(&buffer);
stream2.setByteOrder(QDataStream::LittleEndian);
stream2.setVersion(QDataStream::Qt_5_0);

The convenience of QDataStream makes it easy to read the data, both in mapping to particular types and in handling endianness, but it seems to come at the expense of speed. The issue is compounded by the fact that the processing is recursive: the data being read can itself contain embedded file data, which needs to be read and processed in the same way.

Is there an alternative that is faster, and if so, how can endianness be handled the same way?

TenG
"But it seems to be at the expense of speed". Does it only seem that way, or is it measured? It makes sense to investigate which code is actually responsible for being slow; that avoids premature optimization, which is bad. Can you do some profiling? – Alexander V Apr 14 '16 at 00:45

1 Answer

Your code looks straightforward. Recursion should not be the show stopper.

Do you have lots of strings? Thousands?

stream >> string allocates memory using new, which is really slow. With the operator>>(char *&s) overload, the buffer also needs to be freed manually afterwards using delete[]; refer to the Qt docs for that method. A QString frees its own storage, but reading one still costs an allocation per string.

The same is true for readBytes(char *&s, uint &l), which may be called internally, slowing everything down.

The QString itself will also allocate memory (twice as much as the raw 8-bit text, since it uses a 16-bit encoding), which slows things down further.

If you use one of these functions often, consider rewriting those parts of the code to read directly into a preallocated buffer using readRawData(char *s, int len) before further processing.
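
Once the bytes are in a buffer you own, the endianness handling that QDataStream did for you has to be done by hand. Below is a minimal sketch of that step in plain C++, not the only way to do it: LittleEndianReader and its methods are hypothetical names, not part of Qt, and readFloat32 assumes the file stores 32-bit IEEE 754 floats, which may not match what your format (or QDataStream's floating point precision setting) actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not a Qt class): decodes little-endian values from a
// buffer that has already been read into memory. It never allocates.
class LittleEndianReader {
public:
    LittleEndianReader(const unsigned char *data, std::size_t size)
        : data_(data), size_(size), pos_(0)
    {
        assert(data != nullptr || size == 0);
    }

    // Reads one integer of type T. Returns false and leaves the position
    // unchanged if fewer than sizeof(T) bytes remain.
    template <typename T>
    bool read(T &out)
    {
        static_assert(std::is_integral<T>::value, "integer types only");
        typedef typename std::make_unsigned<T>::type U;
        if (size_ - pos_ < sizeof(T))
            return false;
        U value = 0;
        // Assemble byte by byte, so the result does not depend on host endianness.
        for (std::size_t i = 0; i < sizeof(T); ++i)
            value |= static_cast<U>(static_cast<U>(data_[pos_ + i]) << (8 * i));
        out = static_cast<T>(value);
        pos_ += sizeof(T);
        return true;
    }

    // Assumption: the value on disk is a 32-bit IEEE 754 float.
    bool readFloat32(float &out)
    {
        static_assert(sizeof(float) == 4, "expects a 32-bit float");
        std::uint32_t bits = 0;
        if (!read(bits))
            return false;
        std::memcpy(&out, &bits, sizeof(out));
        return true;
    }

    std::size_t remaining() const { return size_ - pos_; }

private:
    const unsigned char *data_;
    std::size_t size_;
    std::size_t pos_;
};
```

In a Qt program the buffer could be filled with a single readRawData call (or QIODevice::read) and then handed to a reader like this. Qt's own qFromLittleEndian from <QtEndian> should be able to replace the hand-written loop; check the exact signature for your Qt version.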

Overall, if you need high performance, QDataStream itself may well be the show stopper.
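
If profiling confirms that, one option is to load the data into memory once and scan the buffer for the delimiter instead of pulling single bytes through the stream. The following is only a sketch under assumptions: findDelimiter is a hypothetical helper, the delimiter is taken to be the 0x68 0x48 pair from the question, and unlike the pairwise loop in the question it matches the pair at any byte offset. Whether it is actually faster for your files has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the offset of the first 0x68 0x48 pair at or
// after `from`, or `size` if the buffer contains no such pair.
std::size_t findDelimiter(const unsigned char *data, std::size_t size, std::size_t from)
{
    assert(data != nullptr || size == 0);
    std::size_t pos = from;
    while (pos + 1 < size) {
        // Search for the first delimiter byte, leaving room for the second one.
        const void *hit = std::memchr(data + pos, 0x68, size - pos - 1);
        if (hit == nullptr)
            break;
        pos = static_cast<std::size_t>(static_cast<const unsigned char *>(hit) - data);
        if (data[pos + 1] == 0x48)
            return pos;
        ++pos; // a lone 0x68: keep scanning after it
    }
    return size;
}
```

The two type bytes that follow the delimiter can then be read directly at the returned offset plus 2, after checking that they lie inside the buffer.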

Aaron