So I've been using RapidJSON in a C++ project of mine, and I've figured out how to use it for my project's needs. But while cleaning up my code, I noticed that I had just picked an arbitrary number for my buffer size.

char readBuffer[80000];
rapidjson::FileReadStream readStream( file, readBuffer, sizeof( readBuffer ) );

Is there a proper way to set how large the readBuffer needs to be?

user3339357
  • A vector is always the best choice; it can be sized to the data easily and dynamically. – Ali Kazmi Nov 13 '14 at 05:44
  • Can you elaborate? Did you mean something like this? 'std::vector<char> readBuffer;' – user3339357 Nov 13 '14 at 05:50
  • Since you are using char readBuffer[80000]; it would translate to a vector as vector<char> readbuffer; (zero-sized, it can be changed later). – Ali Kazmi Nov 13 '14 at 05:54
  • Right. I see what you're doing. But the problem is that 'FileReadStream' expects the second argument to be a char *, and a vector<> can't be converted to a char *. Any other suggestions? – user3339357 Nov 13 '14 at 05:58
  • I've looked around at how other projects use FileReadStream, and they all use a char buf[65536]. Is that just the magic number? Or am I missing something? – user3339357 Nov 13 '14 at 06:01
  • Since array size is int and int maximum range is 65536. That's the magic :D – Ali Kazmi Nov 13 '14 at 06:03
  • Maximum range for an `int` or a `short`? – Gillespie Nov 20 '14 at 20:04

1 Answer

FileReadStream reads one chunk of bytes at a time into the user-supplied buffer and refills it once that chunk has been consumed. Because of this streaming design, the whole JSON file never needs to be read into memory at once.

The buffer size may affect performance but not correctness.

The "optimal" buffer size is platform- and application-dependent.

If the size is too small, it incurs more overhead because of the larger number of fread() calls.
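
To make the trade-off concrete, here is a minimal sketch of chunked reading using only the C standard library. It is not RapidJSON's actual implementation; `ReadStats` and `drainInChunks` are hypothetical names made up for this illustration. It drains a file through a caller-supplied buffer and counts the fread() calls, so the same file can be compared across buffer sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper type: what it cost to drain a file through a buffer.
struct ReadStats {
    std::size_t freadCalls;  // number of fread() calls, including the final short or empty one
    std::size_t totalBytes;  // total number of bytes read
};

// Reads `file` from its current position to EOF in chunks of at most
// `bufferSize` bytes. The buffer size changes the call count, not the data.
ReadStats drainInChunks(std::FILE* file, char* buffer, std::size_t bufferSize) {
    assert(file != nullptr && buffer != nullptr && bufferSize > 0);
    ReadStats stats{0, 0};
    for (;;) {
        const std::size_t n = std::fread(buffer, 1, bufferSize, file);
        ++stats.freadCalls;
        stats.totalBytes += n;
        if (n < bufferSize) break;  // short read: end of file (or a read error)
    }
    return stats;
}
```

For a 10000-byte file, a 64-byte buffer needs 157 fread() calls while a 4096-byte buffer needs 3, and both deliver the same 10000 bytes.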

Users often place this buffer on the program stack (as in your example), so it cannot be too big either, because stack size is limited. A big stack buffer can be a more serious issue on some embedded systems or in applications that use many threads.

Many parameters can affect performance. If your application really needs optimal performance, I think the best approach is to run experiments. Otherwise, I think 4096 (the page size of most platforms) or above is just fine.
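
Such an experiment could look roughly like the sketch below. It only measures raw fread() throughput with the standard library, not RapidJSON parsing, and `Timing` and `timeRead` are hypothetical names chosen for this example. Results will vary with the platform, the file, and OS caching, so treat any single run with caution.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical result type for one measurement.
struct Timing {
    std::size_t bytesRead;  // total bytes read from the file
    double milliseconds;    // elapsed wall-clock time for the read loop
};

// Rewinds `file`, reads it to EOF through a buffer of `bufferSize` bytes,
// and reports how long the read loop took.
Timing timeRead(std::FILE* file, std::size_t bufferSize) {
    // The heap is used here only so that the size can vary at run time.
    std::vector<char> buffer(bufferSize);
    std::rewind(file);
    const auto start = std::chrono::steady_clock::now();
    std::size_t total = 0;
    std::size_t n = 0;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), file)) > 0) {
        total += n;
    }
    const auto stop = std::chrono::steady_clock::now();
    return {total, std::chrono::duration<double, std::milli>(stop - start).count()};
}
```

Calling `timeRead` on the same open file with sizes such as 256, 4096, and 65536, repeated several times, gives a first impression of whether the buffer size matters for your workload.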

By the way, RapidJSON is open source and this class is really simple. Just read the header file for this class and you will see how the buffer is used.

P.S. Using vector<> is not good practice here: vector<> needs a heap allocation, while this buffer only needs a fixed size. Using the program stack is cheaper.
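
For cases where the stack really is too tight, the sketch below compares the two options. It avoids RapidJSON itself: `fillOnce` is a hypothetical stand-in that takes the same (FILE*, char*, size) arguments the question passes to FileReadStream and performs a single fread(). The point is only that a vector's storage can be handed over as a `char*` through `data()`, at the cost of one heap allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a consumer that wants a raw buffer and its size.
// It performs one fread() and returns the number of bytes read.
std::size_t fillOnce(std::FILE* file, char* buffer, std::size_t bufferSize) {
    assert(file != nullptr);
    return std::fread(buffer, 1, bufferSize, file);
}

// Stack buffer: no allocation, but it consumes 4096 bytes of stack space.
std::size_t readWithStackBuffer(std::FILE* file) {
    char buffer[4096];
    return fillOnce(file, buffer, sizeof(buffer));
}

// Heap buffer: one allocation, with the size chosen at run time.
// vector<char>::data() yields the char* that the consumer expects.
std::size_t readWithHeapBuffer(std::FILE* file, std::size_t bufferSize) {
    std::vector<char> buffer(bufferSize);
    return fillOnce(file, buffer.data(), buffer.size());
}
```

Either buffer must outlive whatever reads through it, which is easy to get wrong if the vector is destroyed or resized while a stream still points at its storage.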

Milo Yip
  • So, I took your advice and set the buffer size to 4096. Generally speaking, I'm using it mostly to read settings files, which won't get too large anyway. Thanks! – user3339357 Nov 15 '14 at 23:48