My issue at the moment is that I can't seem to write all of the data that I want to disk.

What I am trying to do is save float values to disk in binary format, one after the other. There can be any even number of values, between 1,000,000 and 400,000,000 floats. Internally the data is stored in a QSharedPointer<float>. I use the same data in different functions running in different threads, and I make sure that this access never changes the values in the array.

What I have is similar to this:

class blubb : public QObject
{
    Q_OBJECT
 public slots:
    void foo(QSharedPointer<float> data, size_t size)
    {
        QFile saveFile("ExampleFileName");
        QDataStream streamer;

        saveFile.open(QIODevice::WriteOnly);

        streamer.setDevice(&saveFile);
        streamer.setFloatingPointPrecision(QDataStream::SinglePrecision);
        streamer.setByteOrder(QDataStream::BigEndian);

        qDebug() << "Now writing: " << size << " elements." << endl;
        qDebug() << "With: " << sizeof (*data.data()) << " byte per entry." << endl;
        qDebug() << "Total space: " << size * sizeof(*data.data()) << " bytes.";        


        for (size_t i = 0; i < size; i++)
        {
            streamer << (data.data())[i];
        }

        qInfo() << "Writing done of " << size << " float values of "
                << sizeof (*data.data()) << " bytes per entry, total written space: "
                << size * sizeof(*data.data());
    }
};
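For reference, here is a minimal sketch of how the on-disk size could be double-checked from within the program; the helper name checkWrittenSize is only for illustration and not part of my actual code:

#include <QFileInfo>
#include <QDebug>

// Illustrative helper: call it only after blubb::foo has returned, i.e. after
// the QFile destructor has closed and flushed the file.
void checkWrittenSize(const QString &fileName, size_t elementCount)
{
    const qint64 expected = static_cast<qint64>(elementCount * sizeof(float));
    const qint64 actual = QFileInfo(fileName).size();
    qDebug() << fileName << "- expected" << expected << "bytes, found" << actual << "bytes on disk";
}

Called as checkWrittenSize("ExampleFileName", size), this would report any mismatch directly, without going through the file manager or Octave.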

If I have, for example, 4,000,000 float values in data, everything looks as expected: 4,000,000 elements to write, 4 bytes per element and a total of 16,000,000 bytes to write. But the amount I actually find written is 15,990,784 bytes.

If I try with a total of 2,000,000 elements in data, the number of bytes written is 7,995,392 instead of 8,000,000. This is consistent and repeatable, and the number of missing bytes is proportional to the number of entries in data.
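Working that out: 16,000,000 − 15,990,784 = 9,216 bytes (2,304 floats) are missing in the first case, and 8,000,000 − 7,995,392 = 4,608 bytes (1,152 floats) in the second, i.e. exactly half, matching the halved input size.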

Fun fact: I have another function in another thread that uses the same QSharedPointer to do this:

class blubber2 : public QObject
{
    Q_OBJECT
public slots:
    void fooOtherThread(QSharedPointer<float> data, size_t size)
    {
        if ((size % 2) != 0)
        {
            std::invalid_argument ex("Data is not evenly sized.");
            throw ex;
        }

        QVector<float> vec1;
        QVector<float> vec2;

        bool toggle = false;

        // Copy alternating elements: first, third, ... into vec1; second, fourth, ... into vec2.
        std::partition_copy(data.data(),
                            data.data()+size,
                            std::back_inserter(vec1),
                            std::back_inserter(vec2),
                            [&toggle](float)
        {
            return toggle = !toggle;
        });

    }
};

The number of elements in vec1 and vec2 is, as expected, 2,000,000 each if I have 4,000,000 entries in data.
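(As a small illustration of what that partition_copy call does: for an input of {1.0f, 2.0f, 3.0f, 4.0f} it produces vec1 = {1.0f, 3.0f} and vec2 = {2.0f, 4.0f}.)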

So what am I doing wrong in the first function? Why is the wrong number of bytes ending up in the file?

EDIT:// This should start the whole thing:

class Starter : public QObject
{
    Q_OBJECT
signals:
    void startSignal(QSharedPointer<float> data, size_t size);

public slots:
    void helperStart()
    {
        size_t size = 2000000;
        QSharedPointer<float> data(new float[size]);
        emit startSignal(data, size);
    }

};
int main (int argc, char ** argv)
{
    QCoreApplication app(argc, argv);
    qRegisterMetaType<QSharedPointer<float>>("QSharedPointer<float>");
    qRegisterMetaType<size_t>("size_t");

    blubb blubbInstance;
    blubber2 blubberInstance;

    QThread blubbThread;
    QThread blubberThread;

    blubbInstance.moveToThread(&blubbThread);
    blubberInstance.moveToThread(&blubberThread);

    blubbThread.start();
    blubberThread.start();


    Starter starterInstance;
    QTimer timer;
    timer.setSingleShot(true);
    QObject::connect(&timer, &QTimer::timeout, &starterInstance, &Starter::helperStart);

    QObject::connect(&starterInstance, &Starter::startSignal, &blubbInstance, &blubb::foo);
    QObject::connect(&starterInstance, &Starter::startSignal, &blubberInstance, &blubber2::fooOtherThread);

    timer.start(500);
    return app.exec();
}
FreddyKay
  • 1) Please make a complete testcase. 2) OOM? That QFile should really be opened in unbuffered mode (and/or regularly flushed). 3) If you close the file, open it back and compare the saved contents, what do you see? – peppe Jul 20 '16 at 16:26
  • I will provide a complete test case shortly, then. I tried to check whether the data is actually written via QDataStream::writeRawData; there was no error. I am closing the QFile and the error persists. I also tried flushing the file; the error persists. – FreddyKay Jul 20 '16 at 16:36
  • Is the number of missing bytes repeatable? Even with different floating point values? How do you determine the amount written? Do you close the file first? – Martin Bonner supports Monica Jul 20 '16 at 16:42
  • I observe that the missing data for 8000000 bytes is consistent with the data being written in 128k blocks and the last block being lost. If QFile is adaptively changing the block size as the file grows, that *could* explain why the data lost is proportional to file size. What happens if you write 8388608 bytes (8M) instead? – Martin Bonner supports Monica Jul 20 '16 at 16:46
  • @MartinBonner If I attempt to write 8388608 bytes, the number of bytes on disk is 8372224. The number of missing bytes is repeatable. I do not quite get what you mean by "different floating point values". I determine the written amount in two ways: 1. via Ubuntu's file properties, and 2. by using Octave to read the file and all the values for further analysis. Octave returns the same result; the number of elements written is lower than I expect. – FreddyKay Jul 20 '16 at 17:00
  • Minimize, minimize, minimize! Is the problem related to your data at all? Does `QSharedPointer` matter? I doubt it. What if you ignore the data and just write zeroes i.e. `streamer << (float)0.0`? Do you need to use a `QObject` and `QThread` explicitly - can't you just run the writing asynchronously via `QtConcurrent::run`? Your examples are far from minimal, and I can't reproduce any of it. – Kuba hasn't forgotten Monica Jul 20 '16 at 18:45
  • `blubb` and `blubber2` are not even remotely the same thing. The former uses correct data but fails to write it all to disk. The latter doesn't do any disk writing; it does what amounts to some pointer math and memcpys. – Kuba hasn't forgotten Monica Jul 20 '16 at 18:47
  • Are you sure that `blubb::foo` has returned? You can't look at the file before you are sure that `foo` is done. – Kuba hasn't forgotten Monica Jul 20 '16 at 18:48
  • Finally, your API is backwards: you should be passing a `QVector` by const reference. It was designed *precisely* as a shared array that's cheap to copy. Internally, it uses a shared data pointer. Your design is half-way between C and C++, it's horrible. But it's not the source of whatever problem you may have. – Kuba hasn't forgotten Monica Jul 20 '16 at 18:50
  • @KubaOber 1. I was asked to provide a test case. I did, by providing the last code section above, which does essentially what I do. I probably could use QtConcurrent::run; will investigate. For now QThread seemed simpler. 2. Sorry for the syntax errors, I had to write from the top of my head, as I did not have an IDE anymore at that point yesterday. Will correct it in a minute. – FreddyKay Jul 21 '16 at 07:55
  • @KubaOber 3. I am sure blubb::foo returns. I am sure that the for loop goes through all entries (0 -> size - 1). I am sure to close the QFile after the comment by Martin Bonner. I am sure QDataStream does not give an error (tested via QDataStream::writeRawData and checking for errors). 4. Do I understand correctly that you mean I should use QVector from the get-go, without bothering with the float* fields? The only reason I use the pointer field is that I am working with hardware in the loop, where the hardware driver gives me a pointer field. – FreddyKay Jul 21 '16 at 07:59

0 Answers