I created a program that tests CArchive. I wanted to see how long it takes to save a million data points:

#include "stdafx.h"
#include "TestData.h"
#include <iostream>
#include <vector>

using namespace std;

void pause() {
    cin.clear();
    cout << endl << "Press any key to continue...";
    cin.ignore();
}

int _tmain(int argc, _TCHAR* argv[])
{
    int numOfPoint = 1000000;

    printf("Starting test...\n\n");
    vector<TestData>* dataPoints = new vector<TestData>();

    printf("Creating %i points...\n", numOfPoint);
    for (int i = 0; i < numOfPoint; i++)
    {
        TestData* dataPoint = new TestData();
        dataPoints->push_back(*dataPoint);
    }
    printf("Finished creating points.\n\n");

    printf("Creating archive...\n");
    CFile* pFile = new CFile();
    CFileException e;
    TCHAR* fileName = _T("foo.dat");
    ASSERT(pFile != NULL);
    if (!pFile->Open(fileName, CFile::modeCreate | CFile::modeReadWrite | CFile::shareExclusive, &e))
    {
        return -1;
    }

    bool bReading = false;
    CArchive* pArchive = NULL;
    try
    {
        pFile->SeekToBegin();
        UINT uMode = (bReading ? CArchive::load : CArchive::store);
        pArchive = new CArchive(pFile, uMode);
        ASSERT(pArchive != NULL);
    }
    catch (CException* pException)
    {
        return -2;
    }
    printf("Finished creating archive.\n\n");

    //SERIALIZING DATA
    printf("Serializing data...\n");
    for (int i = 0; i < dataPoints->size(); i++)
    {
        dataPoints->at(i).serialize(pArchive);
    }
    printf("Finished serializing data.\n\n");

    printf("Cleaning up...\n");
    pArchive->Close();
    delete pArchive;
    pFile->Close();
    delete pFile;
    printf("Finished cleaning up.\n\n");

    printf("Test Complete.\n");

    pause();

    return 0;
}

When I run this code, it takes some time to create the data points, but then it runs through the rest of the code almost instantly. However, I then have to wait about 4 minutes for the application to actually finish running. I would have expected the application to hang at the serializing step, just as it did while creating the data points.

So my question is about how this actually works. Does CArchive do its work on a separate thread and let the rest of the code keep executing?

I can provide more information if necessary.

Batman
  • 5
    Don't add elements to your vector that way!! `dataPoints->push_back(*dataPoint);` You are leaking every single element https://stackoverflow.com/questions/9303921/deleting-dereferenced-elements-from-vector – Cory Kramer Jun 24 '15 at 17:41
  • Thanks for the heads up! – Batman Jun 24 '15 at 17:41
  • 4
    You also have no reason to `new` the `std::vector`. – crashmstr Jun 24 '15 at 17:42
  • 1
    Please avoid MFC (It is useless in a console application, and harmful for beginners) –  Jun 24 '15 at 17:51
  • The documentation for `CArchive` ( https://msdn.microsoft.com/en-us/library/caz3zy5s.aspx ) doesn't say anything about running on a different thread. If you don't go through the trouble of calling the functions you want on a different thread, it's not going to do that automatically. – Max Lybbert Jun 24 '15 at 17:52
  • The documentation for `CArchive::Close()` says **Flushes any data remaining in the buffer**, closes the archive, and disconnects the archive from the file. It looks to me like Microsoft is aggressive about keeping data in that buffer. – Max Lybbert Jun 24 '15 at 17:54
  • 1
    There is probably no reason to use `new` at all in this program. Removing those calls to `new` makes those `ASSERT` lines unnecessary, since you would be dealing with objects, not pointers. – PaulMcKenzie Jun 24 '15 at 17:56

1 Answer

If you want to create a vector with a million elements that are all default-initialized, you can just use this version of the constructor:

vector<TestData> dataPoints(numOfPoint);  // parentheses: braces would make the int-to-size_t conversion a narrowing conversion

You should stop newing everything and let RAII handle the cleanup for you.
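To make the leak concrete, here is a minimal sketch in standard C++ (no MFC). `Tracked` is a hypothetical stand-in for `TestData` that counts live instances, and both function names are made up for illustration. The first function mirrors the pattern from the question and deliberately leaks; the second lets the vector own its elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for TestData: counts how many instances are alive.
struct Tracked {
    static long live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
long Tracked::live = 0;

// Mirrors the question: allocate with new, copy into the vector, never delete.
// Returns how many instances are still alive after the vector is destroyed.
long leakedAfterNewAndCopy(int n) {
    assert(n >= 0);
    const long before = Tracked::live;
    {
        std::vector<Tracked> points;
        for (int i = 0; i < n; ++i) {
            Tracked* p = new Tracked();  // intentionally never deleted
            points.push_back(*p);        // the vector stores a copy, not *p itself
        }
    }  // the vector destroys only its own copies here
    return Tracked::live - before;
}

// Value semantics: the vector creates and destroys every element itself.
long leakedAfterValueSemantics(int n) {
    assert(n >= 0);
    const long before = Tracked::live;
    {
        std::vector<Tracked> points(static_cast<std::size_t>(n));
    }  // all n elements are destroyed with the vector
    return Tracked::live - before;
}
```

Calling `leakedAfterNewAndCopy(1000)` should report 1000 surviving objects (the originals that were never deleted), while `leakedAfterValueSemantics(1000)` should report none.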

Also, note that push_back has to reallocate the vector's storage if its capacity isn't large enough, so if you start with an empty vector and know how big it is going to be at the end, you can call reserve ahead of time.

vector<TestData> dataPoints;
dataPoints.reserve(numOfPoint);
for (int i = 0; i < numOfPoint; i++)
{
    dataPoints.push_back(TestData{});
}
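
As a rough way to see the effect of reserve, the sketch below counts how often the capacity changes while appending. `countReallocations` is a hypothetical helper, and `int` stands in for `TestData` so the example stays self-contained. How many reallocations happen without reserve depends on the standard library's growth policy, so only the reserved case has a guaranteed count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n elements and counts capacity changes,
// which correspond to reallocations of the vector's storage.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) {
        v.reserve(n);  // one up-front allocation large enough for n elements
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

With `reserveFirst` set to true the function should return 0, because the size never exceeds the reserved capacity; with it set to false the count is at least 1 for any non-zero `n`.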
Cory Kramer