Optimizing loop that has QString manipulations

Question

In my project, I wrote a function to decompress a QString that is compressed using a very basic compression format I wrote in a seperate function. But after some testing, I've found this function to be the thing causing massive slowdowns since it is run on huge QString's and called over 2900 times.

I've been trying to change this function so it runs faster. I've dabbled with QStringRef's with no good results (I might have done it wrong). QByteArrays and QByteRefs are hard to work with and check values (imo).

I really need some help optimizing this function so it runs FAST! As fast as possible! I believe the constant calling of .mid is slowing things down, but I just don't know any other way to read/write the bytes.

EDIT: Better question, is there a common practice I'm missing when it come to decompression functions? I use zlib later in this same program and it compresses faster than this simple function I wrote below. Why is that? What does zlib do differently?

Thanks for your time in advance. :)

Here is what a very small compressed QString would look like:

//Compressed
//This QString is just a hexadecimal representation of a QByteArray
//
QString com("010203ff0504ff0a05ff00ff01ff02ff0306);

And, here is what the same QString would be after decompression:

//Decompressed
QString decom("0102030404040404040505050505050505050505ffffffffffff06060606);

Sorry if you don't understand the format right away... it doesn't matter THAT much. Maybe this helps:

-a byte with "ff" tells us we're about to decompress
-the byte after "ff" is the number of times to repeat the NEXT byte + 1
-UNLESS that number is 0, 1, or 2, then "ff" is the value to be repeated

Examples:
-"010203" decompressed is "010203"

-"ff0401" decompressed is "0101010101"

-"ff02" decompressed is "ffffff"

This is the decompression function I wrote:

int HexToIntS(QString num_hex)  //converts the byte to a number
{
    uint num_uint;
    bool ok;
    num_uint = num_hex.toUInt(&ok,16);
    return (int)num_uint;
}
void Decompress(QString com, QString &decom)
{
    QString c;                 //current byte
    QString n;                 //new/next byte
    int bytePos(0);            //current position in QString
    int byteRepeat;            //number of times to repeat byte n

    c = com.mid(bytePos, 2);   //get first byte (01)
    decom.clear();             //clear decom just in case it had values prior

    do
    {
        bytePos = bytePos + 2;      //move the current position to the next byte
        if(c == "ff")               //is decompression happening?
        {
            c = com.mid(bytePos, 2);   //current byte is now the "next" byte
            byteRepeat = HexToIntS(c); //c tells us how many times the NEXT byte needs to be repeated

            if(byteRepeat <= 2)        //if c's value is <= 2... then ff is the value
            {
                n = "ff";              //new byte is just ff
                bytePos = bytePos + 2; //update the current position
            }
            else                       //if not, then c is the number of times the NEXT byte should be appended
            {
                n = com.mid(bytePos + 2, 2); //new byte is the NEXT byte
                bytePos = bytePos + 4;       //update the current position
            }

            for(int j = 0; j<=byteRepeat; j++)//append n the correct number of times
                decom.append(n);
        }
        else                   //guess we're not decompressing, so just append c
            decom.append(c);
        c = com.mid(bytePos, 2);   //get the new current byte
    }while(bytePos < com.length());  //stop when all bytes were read
}

Current optimized function based on your comments: (5%-10% faster in debug mode only)

void Decompress2(const QString com, QString &decom)
{
    QStringRef c;
    QString n;
    int bytePos(0);
    int byteRepeat;

    c = com.midRef(bytePos, 2);
    decom.clear();

    do
    {
        bytePos = bytePos + 2;
        if(c == "ff")
        {
            c = com.midRef(bytePos, 2);
            byteRepeat = c.toString().toInt(0,16);

            if(byteRepeat <= 2)
            {
                n = "ff";
                bytePos = bytePos + 2;
            }
            else
            {
                n = com.mid(bytePos + 2, 2);
                bytePos = bytePos + 4;
            }

            for(int j = 0; j<=byteRepeat; j++)
                decom.append(n);
        }
        else
            decom.append(c);
        c = com.midRef(bytePos, 2);
    }while(bytePos < com.length());
}

You might use a profiler to see which part of your function "eats" much time and try to optimize/change/replace that part with equivalent code that works faster. — vahancho, Jun 04 '15 at 10:28
Yeah I was looking into that. I will do that next for my whole worker thread soon. I've been commenting out portions of my program all night, this function is without a doubt causing huge slowdowns. Anything right off the bat seem wrong to you? — mrg95, Jun 04 '15 at 10:30
I made c a QStringRef and changed com.mid to com.midRef. There was about a 10% performance boost, but I'm gonna need more than that. — mrg95, Jun 04 '15 at 10:46
Other things to do are: 1) use a `const&` for the com parameter. 2) Definitely move to QByteArray: why is it harder than QString for you? — alediaferia, Jun 04 '15 at 10:59
Also, I'd rather make sure `HexToIntS` is **inline**. Plus, bare in mind string to int conversion is slow. — alediaferia, Jun 04 '15 at 11:04
Okay I will switch to a QByteArray. I like to manipulate/check the data in if statements and I haven't found a good way to just "compare" the hex representation of a single byte (or bytes) to something like "ff". Basically, I have a much harder time "visualizing" QByteArrays and checking what they contain. — mrg95, Jun 04 '15 at 11:05
HexToIntS is used all around my whole program. I will copy it inline however. Is there a faster way to get the int value of a say "0b" — mrg95, Jun 04 '15 at 11:07
You might want to consider doing all of this on QByteArrays instead of QString. — Frank Osterfeld, Jun 04 '15 at 11:48
Working on making a function with bytearrays now. If com becomes a QByteArray, and c becomes a QByteRef... how would I check if c == "ff"? — mrg95, Jun 04 '15 at 11:52
`byteRepeat = c.toString().toInt(0,16);` why not just `byteRepeat = c.toInt(0,16);` ? — Thomas Ayoub, Jun 04 '15 at 12:10
I'm voting to close this question as off-topic because it's about optimizing working code and belongs to [Code Review](http://codereview.stackexchange.com/) — Thomas Ayoub, Jun 04 '15 at 12:11
By the way, NEVER EVER test performance in debug mode. Always in Release with optimization. — Thomas Ayoub, Jun 04 '15 at 12:12
toInt isn't a member of QStringRef. And Thomas, I also asked what zlib and other decompressors do differently. What common practices are there that I'm missing? *Noted about debug mode. Thanks. — mrg95, Jun 04 '15 at 12:13
@mc360pro In the doc I read [this](http://doc.qt.io/qt-5/qstringref.html#toInt)... which version of Qt are you using? — Thomas Ayoub, Jun 04 '15 at 12:20
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/79660/discussion-between-thomas-and-mc360pro). — Thomas Ayoub, Jun 04 '15 at 12:28
Did you read the documentation? In particular, the bit about [More Efficient String Construction](http://doc.qt.io/qt-4.8/qstring.html#more-efficient-string-construction)? (It's unchanged in Qt 5) — Toby Speight, Jun 04 '15 at 12:35
@Thomas I was able to remake the function using QByteArrays. That plus the new method I was talking about has eradicated the slowdown issues by 90%. Thanks :) — mrg95, Jun 05 '15 at 01:03
@Thomas I'm trying to work with QByteArrays, but I keep running into walls. I'm trying really hard and doing lots of doc reading. They're new territory for me. Mind taking a look at http://stackoverflow.com/questions/30660127/append-quint16-unsigned-short-to-qbytearray-quickly ? — mrg95, Jun 05 '15 at 06:58

score 2 · Answer 1 · answered Jun 04 '15 at 12:51

You shouldn't be treating an array of bytes as a string. That's silly and as you noted, dead slow. Use the raw byte values instead and operate on them.

I know I shouldn't be writing the code for other people, but I had absolutely nothing better to do, so here it is in straight C++. I know you're using Qt and I'm fairly certain most of the below code has some equivalent in terms of Qt's ByteArray, but that's something you can figure out then, if pure C++ is not an option.

#include <vector>
#include <cstdint>
#include <iomanip>
#include <iostream>

std::vector<std::uint8_t> decompress(const std::vector<std::uint8_t>& com)
{
  std::vector<std::uint8_t> decom;
  decom.reserve(com.size()); // a conservative estimate of the required size

  for(auto it = begin(com); it != end(com); ++it)
  {
    if(*it == 0xff)
    {
      ++it;
      if(it != end(com))
      {
        std::uint8_t number_of_repeats = *it;
        if(number_of_repeats <= 2)
        {
          std::fill_n(std::back_inserter(decom), number_of_repeats, 0xff);
          continue;
        }
        else
        {
          ++it;
          if(it != end(com))
            std::fill_n(std::back_inserter(decom), number_of_repeats, *it);
          else
            throw 42; // handle error in some way
        }
      }
      else 
        throw 42; // handle error in some way
    }
    else
      decom.push_back(*it);
  }
  return decom;
}
int main()
{
  std::vector<std::uint8_t> com{0x01, 0x02, 0x03, 0xff, 0x05, 0x04, 0xff, 0x0a, 0x05, 0xff, 0x00, 0xff, 0x01, 0xff, 0x02, 0xff, 0x03, 0x06};


  for(const auto& value : com)
    std::cout << std::hex << std::setfill('0') << std::setw(2) << static_cast<unsigned short>(value) << ' ';
  std::cout << '\n';
  auto result = decompress(com);

  for(const auto& value : result)
    std::cout << std::hex << std::setfill('0') << std::setw(2) << static_cast<unsigned short>(value) << ' ';
}

Live demo here. I take no responsibility with respect to correctness, efficiency or otherwise usability of this code. It was written in under five minutes.

Note that I believe your decompressed string in your long example is wrong. According to your rules, it should be

01 02 03 04 04 04 04 04 05 05 05 05 05 05 05 05 05 05 ff ff ff 06 06 06

Starting from the back that is 06 repeated three times, then 2 times ff, then 1 time ff then 0 times ff, then the rest.

Thank you so much for going above and beyond with your answer. I will combine that with a new algorithm I constructed and let you know how it goes. And no, the decompression was correct. The number to repeat is the next byte + 1. So ff0306 would be 06060606 (03 + 1) — mrg95, Jun 04 '15 at 20:44
Ah, seems I missed the plus 1 then :). And you're very welcome! Don't forget to upvote/accept if you feel my answer deserves it ;). — rubenvb, Jun 04 '15 at 20:48

Optimizing loop that has QString manipulations

1 Answers1