I am using the MS Visual Studio 2012 compiler and I am building in x64 release mode.

Using ifstream I can read files larger than 4GB. The problem is, I can't seek to a position in the middle of a 10GB file.

When I call `is.seekg(5368709120, is.beg);`, a subsequent `is.tellg();` returns -1, which means the seek failed. I am sure that the file exists and that position 5368709120 exists too. A small offset such as `is.seekg(100, is.beg);` works perfectly fine.

Using multiple seeks is not an option since the files can get up to 300GB (and using many seeks will be slow).

My question is: how can I get seek to work correctly on a 10GB file without using multiple seeks?

zETO
  • See the answer here http://stackoverflow.com/questions/9405712/how-can-i-seekg-files-over-4gb-on-windows as it might be the same as what you're trying to do. – anurag-jain Aug 18 '15 at 18:33
  • Thanks for the comments guys, but none of the links has a solution to my problem. – zETO Aug 18 '15 at 18:36
  • I see two suggested alternative Q&A; neither is very satisfactory as an answer. – Jonathan Leffler Aug 18 '15 at 18:36
  • The last comment on the question http://stackoverflow.com/questions/32057750/how-to-get-the-filesize-for-large-files-in-c asserts that there is a bug in Microsoft's implementation of some of the stream functions, using 32 bit versions when they should be using 64 bit versions. – Mark Ransom Aug 18 '15 at 18:41
  • Have you tried using `5368709120ULL` to make sure the constant isn't being truncated? – Mark Ransom Aug 18 '15 at 18:44
  • @Mark Ransom: Thanks for the reply. I tried adding `ULL` but it doesn't work. Seems like a bug, like you said. Do you know more about which Visual Studio versions are affected? – zETO Aug 18 '15 at 18:47
  • You can find out if your version is affected by tracing into the call with the debugger. – Mark Ransom Aug 18 '15 at 18:59

1 Answer

> how can I get seek to work correctly on a 10GB file without using multiple seeks?

Forgetting for a second the rest of your post, the answer to this question (on Windows) is very simple: use `_fseeki64`. I don't see a problem with dropping down to a lower-level API when dealing with huge files; you'd most likely be doing large chunked reads and writes anyway, right? You can easily use `fread` and `fwrite` for that.

If you insist on the STL, Microsoft's implementation won't work. I've heard STLPort handles large-file seeking, so you could go for that. It's a rather heavy-handed approach though; I'd stick with the basic C-level seek (`_fseeki64`, since plain `fseek` takes a 32-bit `long` offset on Windows).
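For illustration, here is a minimal sketch of seeking with 64-bit offsets through the C stdio layer. The helper names `seek64`, `tell64` and `read_at` are made up for this example, and the non-Windows branch (`fseeko`/`ftello`) is an assumption added only so the sketch also builds outside MSVC; it assumes a platform where `off_t` is 64 bits wide.

```cpp
#include <cstdio>
#include <cstdint>

// Hypothetical wrappers (not part of any library): pick the 64-bit seek/tell
// functions for the current platform.
#if defined(_WIN32)
inline int seek64(std::FILE* f, std::int64_t offset, int origin) {
    return _fseeki64(f, offset, origin);   // MSVC CRT: 64-bit offset
}
inline std::int64_t tell64(std::FILE* f) {
    return _ftelli64(f);                   // MSVC CRT: 64-bit position
}
#else
#include <sys/types.h>
// Assumption: off_t is 64 bits wide (true on typical 64-bit POSIX systems).
inline int seek64(std::FILE* f, std::int64_t offset, int origin) {
    return fseeko(f, static_cast<off_t>(offset), origin);
}
inline std::int64_t tell64(std::FILE* f) {
    return static_cast<std::int64_t>(ftello(f));
}
#endif

// Reads up to `count` bytes starting at absolute byte `offset`.
// Returns the number of bytes read, or -1 if the seek failed.
inline long long read_at(std::FILE* f, std::int64_t offset,
                         void* buffer, std::size_t count) {
    if (seek64(f, offset, SEEK_SET) != 0) {
        return -1;
    }
    return static_cast<long long>(std::fread(buffer, 1, count, f));
}
```

With a file opened via `std::fopen(path, "rb")`, a call such as `read_at(f, 5368709120LL, buf, sizeof buf)` would jump straight to the 5 GiB mark in a single seek; the return value should still be checked, since a short read is possible near the end of the file.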

Blindy
  • Thanks a lot for the answer! I will use `_fseeki64` and `_ftelli64` to seek the file and get the current position there. Is there any way to read or write more than 4GB at one time using `fread` and `fwrite`? – zETO Aug 18 '15 at 20:01
  • It doesn't matter, honestly; UDMA isn't made for that kind of access. Even if you store entire gigs of data in memory (and you shouldn't), you'll use small buffers to pipe data to disk. 32-64MB works best these days. – Blindy Aug 18 '15 at 21:43
  • If you do have the entire buffer in memory, don't make a smaller buffer and copy data into it for `fwrite`; issue the `fwrite` calls directly from the large buffer, passing an offset and the desired length each time. However, if you are indeed working with large files, you should be using memory-mapped files anyway, with plain memory block copies. That will give you the best performance. – Blindy Aug 18 '15 at 21:45
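
The last comment's suggestion (writing slices straight out of one large in-memory buffer, with no intermediate copy) could look roughly like the sketch below. `write_chunked` is a hypothetical helper name, and the 32MB default simply takes the lower end of the range mentioned in the comments; it is not a measured optimum.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes `total` bytes from `data` to `f` in slices of
// at most `chunk` bytes. Each fwrite reads directly from the large buffer at
// the current offset, so nothing is copied into a smaller staging buffer.
// Returns the number of bytes actually written.
inline std::size_t write_chunked(std::FILE* f, const void* data,
                                 std::size_t total,
                                 std::size_t chunk = 32u * 1024u * 1024u) {
    if (chunk == 0) {
        return 0;  // nothing sensible to do with a zero-sized slice
    }
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::size_t written = 0;
    while (written < total) {
        const std::size_t n = std::min(chunk, total - written);
        const std::size_t w = std::fwrite(p + written, 1, n, f);
        written += w;
        if (w != n) {
            break;  // short write: stop and report what was written so far
        }
    }
    return written;
}
```

The caller compares the return value with `total` to detect a failed or partial write. Memory-mapped files, which the same comment recommends for large files, are a separate platform-specific API and are not shown here.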