3

I made a program to read and write a 2D array to an NVMe SSD (Samsung 970 EVO Plus).

I designed the program to read N*M bytes like this:

#pragma omp parallel for
for (int i = 0; i < N; i++)
    fstream.read(...); // read M bytes

But this code shows much lower performance (KB/s) than the SSD's specification (GB/s).

I thought that if the size M were larger than the block size (maybe 4 KB) and a multiple of 2, this code would show GB/s performance.

However, it doesn't. I think I missed something.

Is there C++ code for maximizing I/O performance on an SSD?

  • The GB/s specifications are kinda irrelevant as it is limited by the throughput of all the intermediate hardware (e.g. your motherboard, if using onboard SATA) and how fast your PC can actually consume the data read. E.g. mine seems to max out at about 350 MB/s – M.M Dec 16 '19 at 03:36
  • 1
    This is not really a C++ question, it will come down to operating system and hardware facilities – M.M Dec 16 '19 at 03:36
  • How much data are you reading? KB/s is pretty normal for reading small amounts of data (maybe not for an SSD?). I don't think making the program parallel will help either; if I had to guess, the SSD can only transfer one request at a time anyway. – Brady Dean Dec 16 '19 at 03:37
  • @BradyDean M is 4MB and N is almost 1024 – Jaehong Lee Dec 16 '19 at 03:39
  • 1
    To get GB/s you need to read large sequential files. Is the file you are reading a hundred MB or greater? Also, parallel reading will not help you and won't work. – drescherjm Dec 16 '19 at 03:45

2 Answers

5

No matter how much you tell fstream to read, the reads are likely to be done out of a fixed-size streambuf buffer. The C++ standard does not specify its default size, but 4 KB is fairly common. So passing a 4 MB size to read() will very likely end up effectively reducing this to 1024 calls that read 4 KB of data each. This likely explains the performance you observed: instead of reading one large chunk at once, your application ends up making many calls that each read a small chunk.

The C++ standard does provide a means of resizing the internal stream buffer, via the pubsetbuf method, but leaves it to each C++ implementation to specify exactly when and how a stream buffer may be configured with a non-default size. Your C++ implementation may allow you to resize the stream buffer only before opening your std::ifstream, or it may not allow you to resize a std::ifstream's default stream buffer at all; in that case you must construct a custom stream buffer instance first and then use rdbuf() to attach it to the std::ifstream. Consult your C++ library's documentation for more information.

Or, you may wish to consider using your operating system's native file I/O system calls and bypassing the stream buffer layer altogether, since it adds some overhead of its own: the file's contents are likely read into the stream buffer first and then copied into the buffer you pass to read(). Using the native file input system calls eliminates this redundant copy and squeezes out a little more performance.

Sam Varshavchik
3

You are probably asking for trouble by trying to parallelize read() calls on a single istream object (which is, essentially, a serial mechanism).

From cppreference for istream::read (bolding mine):

Modifies the elements in the array pointed to by s and the stream object. Concurrent access to the same stream object may cause data races, except for the standard stream object cin when this is synchronized with stdio (in this case, no data races are initiated, although no guarantees are given on the order in which extracted characters are attributed to threads).

Adrian Mole