I'm working on a C++ project (using VS2008) where I need to load a very large XML file from disk into a std::wstring. Presently the following line resizes the string to reserve memory before the data is loaded:

//std::wstring str;
//size_t ncbDataSz = file size in bytes

str.resize(ncbDataSz / sizeof(WCHAR));

But my current issue is that the resize method takes a somewhat long time for larger string sizes. (I just tested it with 3 GB of data, in an x64 project, on a desktop PC with 12 GB of free RAM, and it took about 4-5 seconds to complete.)

So I'm curious, is there a faster (more optimized) method to resize std::wstring? I'm asking about Windows only.

c00000fd
    `std::string::reserve` reserves, `std::string::resize` also writes to the memory... which you are going to overwrite immediately, I suppose. – LogicStuff Aug 06 '16 at 08:57
  • do you want to resize or reserve? – 463035818_is_not_an_ai Aug 06 '16 at 09:29
  • @LogicStuff: `reserve` is a weird beast. It "reserves" the memory but one can't access it directly as a contiguous byte array, right? I need to call `append` on it. In that case it's useless to me for this type of optimization. – c00000fd Aug 07 '16 at 22:05
  • @c00000fd No, the reserved memory will be contiguous, because every `std::basic_string` is. Otherwise nothing would work. [Similar Q&A about the performance differences](http://stackoverflow.com/a/34511650/3552770) (iterators are not the problem here). – LogicStuff Aug 07 '16 at 22:18
  • @LogicStuff: Yes, it is contiguous. But how do I access it as a byte-array? Like I showed below in a comment, this `str.reserve(ncbDataSz / sizeof(WCHAR)); ReadFile(hFile, &str[0], ncbDataSz, szRead, NULL);` would not work as the string would still have size 0 internally. – c00000fd Aug 07 '16 at 22:43
  • @c00000fd 1. You don't have bytes, you have `wchar_t`s. 2. That link I posted shows your options... Try range construction first. – LogicStuff Aug 08 '16 at 06:37

2 Answers


You can instantiate basic_string with a char_traits whose bulk assign(count) overload does nothing:

#include <cstddef>
#include <string>

struct noinit_char_traits : std::char_traits<char> {
    using std::char_traits<char>::assign; // keep the single-char overload visible
    static char_type* assign(char_type* p, std::size_t count, char_type a) { return p; }
};

using noinit_string = std::basic_string<char, noinit_char_traits>;

Note that it will also affect every operation that fills the string with a repeated character, such as resize(count, ch), assign(count, ch) and the (count, ch) constructor.


Instead of resizing your input string you could just allocate capacity with std::wstring::reserve, because resizing also initializes (zero-fills) every element.

You could try something like this to see if it improves performance for you:

#include <cerrno>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>

std::wstring load_file(std::string const& filename)
{
    std::wifstream ifs(filename, std::ios::ate);

    // errno works on POSIX systems, not sure about Windows
    if(!ifs)
        throw std::runtime_error(std::strerror(errno));

    std::wstring s;
    s.reserve(ifs.tellg()); // allocate but don't initialize
    ifs.seekg(0);

    wchar_t buf[4096];
    while(ifs.read(buf, sizeof(buf)/sizeof(buf[0])))
        s.append(buf, buf + ifs.gcount()); // this will never reallocate
    s.append(buf, buf + ifs.gcount()); // don't drop the final partial chunk

    return s;
}
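For comparison, the comments below describe the one-shot alternative: resize() once (paying the zero-fill cost up front) and then read the whole file directly into the string's buffer in a single call. A minimal portable sketch of that approach (load_file_resize is a hypothetical name; on Windows the single ifs.read() corresponds to one ReadFile() into &s[0], and the file is assumed to hold raw native-endian wchar_t data):

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>

// One-shot variant: one allocation + zero-fill, then one bulk read.
std::wstring load_file_resize(std::string const& filename)
{
    std::ifstream ifs(filename, std::ios::binary | std::ios::ate);
    if(!ifs)
        throw std::runtime_error(std::strerror(errno));

    std::size_t bytes = static_cast<std::size_t>(ifs.tellg());
    ifs.seekg(0);

    std::wstring s;
    s.resize(bytes / sizeof(wchar_t)); // zero-fills, then gets overwritten
    if(!s.empty())
        ifs.read(reinterpret_cast<char*>(&s[0]), s.size() * sizeof(wchar_t));
    return s;
}
```

Measuring both variants on the target machine is the only way to know which wins; the chunked version trades the zero-fill pass for extra copies through the stack buffer.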
Galik
  • Thanks. But what you're doing is preallocating two buffers: `s.reserve` and `wchar_t buf`, and I'm trying to get away with just one. So after I do `s.reserve`, how can I load into it directly? Or `ReadFile(hFile, &s[0], sz, szRead, NULL);` I need to adjust its size, right? But how? – c00000fd Aug 07 '16 at 06:49
  • @c00000fd You can't read into reserved memory. However the allocation for `buf[4096]` is on the stack which is lightning fast (one instruction?). At the end of the day only measuring can tell you if it's faster or not. – Galik Aug 07 '16 at 09:41
  • I'm talking about GB of data, so stack is out of the question. And loading it in small chunks from a file slows it down even more than my gain of not using `resize`. – c00000fd Aug 07 '16 at 18:27
  • @c00000fd If you look at my example the stack is only used for the file reading buffer. Not the entire file. File reading is already buffered and the reading task is going to be dictated by the speed of the HDD drive more than anything else. But I suggest you measure it and see if there is an improvement or not. – Galik Aug 07 '16 at 20:07
  • Yeah I see your example... it's textbook `std::string` stuff. You're not following what I'm saying. But that's OK. It doesn't seem like I can find anything more optimized than just calling `resize` with the `std::string` template. Otherwise if I go with `reserve` it allocates the buffer, which is good, but then I lose "whatever savings" I gained with `reserve` on calls to `append`. – c00000fd Aug 07 '16 at 21:55
  • @c00000fd Calling *append* should be much cheaper than copying zeros into the buffer and only then copying the real data tbh. But if this method doesn't work in your tests then use whatever gives you the fastest results. – Galik Aug 07 '16 at 22:04
  • The fastest result would be for me to duplicate `std::string` source file and make its size variable (`_Mysize` I think) publicly accessible from outside, so I can just set it after calling `reserve`. That would work. I wish it provided this mechanism "out-of-the-box." – c00000fd Aug 07 '16 at 22:09
  • Also overlooked your last statement about `append` being "much cheaper". So your method: 1) `reserve` on 3GB of data, 2) 3GB/4096 number of calls to load data from a file, 3) 3GB/4096 times to copy mem array to `s`. When with just `resize`: 1) reserve on 3GB of data; 2) Set 3GB of data to 0, 3) One call to read 3GB of data from a file into 's'. I seriously doubt your "much cheaper" statement. Again, my whole goal with this question is to eliminate step 2 in my second example. – c00000fd Aug 07 '16 at 22:53
  • @c00000fd Well here is the reason I feel this method could be faster. I suspect much of the time will be spent waiting for the `HDD` to find data. Reading from the buffer happens *at the same time* as the `HDD` is seeking for data, so a lot of that copying is essentially free of charge. After all a typical seek time for a `HDD` might be several milliseconds. During that (say) `3ms` your program is going to have time to copy the `HDD` cache to the buffer and then into your string and still have time to do nothing, twiddling its thumbs, waiting for the `HDD` to find more data. – Galik Aug 07 '16 at 23:38
  • That's the theory but the proof is in the testing. You may need to increase the size of the stack buffer to see the benefits (if there are any to be had). – Galik Aug 07 '16 at 23:38
  • This is not about HDD. I'm asking about `std::string`. Hard drive or even SSD would obviously create a larger overhead. – c00000fd Aug 07 '16 at 23:43