I have a large plain text file, 20 GB on disk. Let's call it "MyFile.txt". The file contains only English words and a single string with the value "+++". Let's call this string FlagString. From the beginning of MyFile.txt up to FlagString, all words are correctly spelled English words. Let's call this section the Dictionary. From the first word after FlagString to the end of MyFile.txt, the words may be misspelled. Let's call this section the CheckedSection. I must read each word from the CheckedSection and validate its spelling by comparing it with the appropriate word in the Dictionary according to some algorithm. Because MyFile.txt is large, I'd like to use the CreateFileMapping and MapViewOfFile functions to map the file into memory. My questions are the following:

  1. MyFile.txt is large, so I want to map it into memory in fragments of 1 GB each. How do I map a file in fragments using CreateFileMapping?
  2. How can I identify FlagString as the delimiter between the Dictionary and the CheckedSection when the file is mapped in fragments? FlagString does not necessarily fall on a 1 GB fragment border; it may lie inside a fragment. Is there a file cursor I can use to mark a position in the file after it has been mapped into memory?
  3. Can I create two memory mappings of MyFile.txt, one for the Dictionary and another for the CheckedSection?
  4. Must I call UnmapViewOfFile and CloseHandle each time I finish processing the current 1 GB fragment of the Dictionary or the CheckedSection?
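
For question 2, the awkward case is a FlagString that straddles the border between two fragments. Below is a minimal sketch of one way to handle it, written in portable C++ so that it does not depend on how the fragments are obtained: keep the last few bytes of each fragment and check that seam before scanning the next fragment. `FlagScanner`, `feed` and `offset` are hypothetical names introduced here for illustration; they are not part of any Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper: receives a file as consecutive fragments and records
// the absolute byte offset of the first occurrence of the flag string.
class FlagScanner {
public:
    explicit FlagScanner(std::string flag) : flag_(std::move(flag)) {
        assert(!flag_.empty());  // assumption: the flag is never empty
    }

    // Feed the next fragment in file order. Returns true once the flag is found.
    bool feed(std::string_view fragment) {
        if (found_ || fragment.empty()) return found_;
        const std::size_t keep = flag_.size() - 1;  // bytes carried across a border
        const std::size_t edge = fragment.size() < keep ? fragment.size() : keep;

        // 1. Check the seam: the carried tail plus the first bytes of this fragment.
        //    Any match here must cross the border, so it precedes in-fragment matches.
        std::string seam = tail_;
        seam.append(fragment.data(), edge);
        std::size_t pos = seam.find(flag_);
        if (pos != std::string::npos) {
            offset_ = consumed_ - tail_.size() + pos;
            found_ = true;
        } else if ((pos = fragment.find(flag_)) != std::string_view::npos) {
            // 2. Otherwise look for a match that lies entirely inside this fragment.
            offset_ = consumed_ + pos;
            found_ = true;
        }

        // 3. Remember the last `keep` bytes seen so far for the next seam check.
        tail_.append(fragment.data() + fragment.size() - edge, edge);
        if (tail_.size() > keep) tail_.erase(0, tail_.size() - keep);
        consumed_ += fragment.size();
        return found_;
    }

    bool found() const { return found_; }
    std::uint64_t offset() const { return offset_; }  // valid only if found()

private:
    std::string flag_;
    std::string tail_;            // up to flag_.size() - 1 trailing bytes
    std::uint64_t consumed_ = 0;  // absolute offset of the next fragment's first byte
    std::uint64_t offset_ = 0;
    bool found_ = false;
};
```

Each mapped fragment would be passed in as `std::string_view(pointer, size)`. Once `feed` returns true, `offset()` is the position of FlagString counted from the beginning of the file, so under these assumptions the CheckedSection starts at `offset() + 3`.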
Remus Rusanu
user3769902
  • 1. You pass the desired size of the fragment to `CreateFileMapping` in `dwMaximumSizeLow`. You then call `MapViewOfFile`, passing the same size in `dwNumberOfBytesToMap` and offset from the beginning of the file in `dwFileOffsetHigh` and `dwFileOffsetLow`. – Igor Tandetnik Mar 24 '15 at 23:29
  • 2. `Is there any file cursor that I use to mark position in file after file has been mapped in memory?` I'm not sure I understand the question. You know the offset of each fragment from the beginning of the file (you passed it to `MapViewOfFile`). You know the offset of each byte from the beginning of the mapped fragment. In what way is this insufficient to track your position within the file? – Igor Tandetnik Mar 24 '15 at 23:31
  • 3. Yes, you can call `MapViewOfFile` more than once on the same file mapping, to create multiple views of the same file. – Igor Tandetnik Mar 24 '15 at 23:33
  • 4. You generally want to call `UnmapViewOfFile` whenever you are done with the fragment (if you don't, you'd run out of address space pretty quickly). You would only want to call `CloseHandle` when you are completely done with the file. – Igor Tandetnik Mar 24 '15 at 23:34
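
The points in these comments can be put together in a rough sketch. One detail to keep in mind is that the file offset given to `MapViewOfFile` must be a multiple of the system allocation granularity (`dwAllocationGranularity` from `GetSystemInfo`), so a view that should start at an arbitrary position, such as the first byte after FlagString, has to be mapped from the aligned offset below it. The names `ViewWindow`, `computeWindow` and `forEachFragment` are hypothetical. The sketch assumes a 64-bit process and read-only access, and it passes 0 for both maximum-size arguments of `CreateFileMappingW` so that the mapping object spans the whole file and views can be taken at any offset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one view: where to map and how far to skip.
struct ViewWindow {
    std::uint64_t mapOffset;  // aligned file offset to pass to MapViewOfFile
    std::uint64_t mapSize;    // number of bytes to map
    std::uint64_t skip;       // bytes between the start of the view and `wanted`
};

// Computes an aligned view that covers up to `fragment` bytes starting at `wanted`.
inline ViewWindow computeWindow(std::uint64_t wanted, std::uint64_t fragment,
                                std::uint64_t fileSize, std::uint64_t granularity) {
    assert(granularity > 0 && wanted < fileSize);
    ViewWindow w;
    w.mapOffset = wanted - wanted % granularity;  // round down to the granularity
    w.skip = wanted - w.mapOffset;
    const std::uint64_t want = fragment + w.skip;
    const std::uint64_t left = fileSize - w.mapOffset;
    w.mapSize = want < left ? want : left;        // never map past the end of the file
    return w;
}

#ifdef _WIN32
#include <windows.h>

// Calls fn(data, size, fileOffset) for each fragment of the file in order.
template <class Fn>
bool forEachFragment(const wchar_t* path, std::uint64_t fragment, Fn&& fn) {
    if (fragment == 0) return false;
    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return false;
    LARGE_INTEGER size;
    if (!GetFileSizeEx(file, &size)) { CloseHandle(file); return false; }
    // One mapping object for the whole file (maximum size 0, 0).
    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) { CloseHandle(file); return false; }
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    const std::uint64_t total = static_cast<std::uint64_t>(size.QuadPart);
    bool ok = true;
    for (std::uint64_t pos = 0; pos < total; pos += fragment) {
        const ViewWindow w = computeWindow(pos, fragment, total, si.dwAllocationGranularity);
        void* view = MapViewOfFile(mapping, FILE_MAP_READ,
                                   static_cast<DWORD>(w.mapOffset >> 32),
                                   static_cast<DWORD>(w.mapOffset & 0xFFFFFFFFu),
                                   static_cast<SIZE_T>(w.mapSize));
        if (!view) { ok = false; break; }
        fn(static_cast<const char*>(view) + w.skip, w.mapSize - w.skip, pos);
        UnmapViewOfFile(view);  // release each view as soon as it has been processed
    }
    CloseHandle(mapping);       // close the handles only when done with the file
    CloseHandle(file);
    return ok;
}
#endif
```

With a 1 GB fragment size, every fragment start is already a multiple of the usual 64 KB granularity, so `skip` stays 0 in the loop. `computeWindow` matters mostly when a view has to begin at an unaligned position, for example when a second view is opened at the start of the CheckedSection.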
