3

I am running into problems working with a huge data file of fixed-length records. The file is over 14 GB in size. I first noticed a problem when the return value from the System.Filesize() function was far less than the actual number of records in the file, given the number of bytes in the file and the length of each record. (System.Filesize returns the number of records in an untyped file, given the record size specified during the Reset() call. It does not return the number of bytes in the file.) I chalked it up to the return type of System.Filesize() being a Longint instead of an Int64.

I worked around the initial problem by calling GetFileSizeEx() and calculating the number of records myself. Unfortunately, BlockRead() also fails when trying to access records whose offset is deep into the file. I'm guessing that, again, values are overflowing somewhere in the code.

Is there a replacement module out there for Delphi 6 that can handle huge files and is a substitute for the System unit file I/O calls? I'm trying to avoid rolling my own if I can.

Robert Oschler
  • 3
    14 GB!!! You should consider using a database instead. – RRUZ May 18 '11 at 21:46
  • There are streams to access files too. But I don't remember if they support >2GB files. – CodesInChaos May 18 '11 at 21:52
  • Streams have supported large files for over a decade, @Code. See `TStream`, with its two pseudo-abstract implementations of `Seek`. – Rob Kennedy May 18 '11 at 22:03
  • I would be looking at either the architecture or the third-party provider that produces a 14GB file. Beyond a quick solution, the real issue here is that there are much better ways of storing/accessing data than using a 14GB file! – Misha May 18 '11 at 23:36

4 Answers

6

You can use GpHugeFile from Primoz Gabrijelcic. I used this library myself to access large files (> 2 GB) from Delphi 7. In your case, though, you should consider changing your app logic and migrating to a database scheme, which is much more efficient than a scheme based on record files.

RRUZ
  • 2
    How is it more efficient? Without knowing the record structure, we can't know whether a database would be more space-efficient. And with fixed-size records, seeking and traversing the file won't be any less time-efficient than a database. – Rob Kennedy May 18 '11 at 22:03
  • 1
    @Rob, there must be a few cases where a record-file system is more efficient than a database (for example, with a small number of records), but in most cases (and with these 14 GB of data!) an RDBMS is a better choice for performance. – RRUZ May 18 '11 at 22:09
  • 2
    I don't understand what's wrong with the built-in stream classes that led you to recommend this third-party code. – David Heffernan May 18 '11 at 22:24
  • @David, I never said that there is anything wrong with the standard `TStream` class, but `TGpHugeFile` works better (in my experience) than `TStream` with **very large files**. – RRUZ May 18 '11 at 22:32
  • 1
    @David: You obviously don't know Primoz (@gabr here) and his code. [OmniXML](http://omnixml.com) and [OmniThreadLibrary](http://otl.17slon.com/) are both extremely good (and free) replacements for Delphi's own built-in functionality, as is GpHugeFile (and several other units that Primoz has kindly made available). – Ken White May 18 '11 at 23:36
  • 1
    @ken sure I know Primoz, but I still don't have an answer! – David Heffernan May 19 '11 at 03:49
  • TStringList in Delphi 6 just simply refuses to load huge files when LoadFromFile() is called. Doesn't throw an error either, just returns an Items Count of 0. – Robert Oschler May 19 '11 at 04:42
  • @Robert Why does this surprise you? How are you going to fit a huge file into 32 bit address space? – David Heffernan May 19 '11 at 14:09
  • I'm not surprised that it can't do it, just that it fails silently without an error. – Robert Oschler May 20 '11 at 00:46
2

Try TGpHugeFile.

Ondrej Kelle
2

It turns out that the internal seek routine used by the System unit also had problems due to the use of low-capacity numeric types. I coded up my own call to the Windows SetFilePointerEx() function and all is well. I have provided the source code below in case it might help others. I have also included the code I created to get the number of records properly, since you will need both. Everything else works the same.

// Requires Windows and SysUtils in the uses clause.
const
    kernel = 'kernel32.dll';

// 64-bit-capable Windows file API calls, declared by hand.
function SetFilePointerEx(hFile: THandle; distanceToMove: Int64; var newFilePointer: Int64; moveMethod: DWORD): BOOL; stdcall; external kernel name 'SetFilePointerEx';

function GetFileSizeEx(hFile: THandle; var FileSize: Int64): BOOL; stdcall; external kernel name 'GetFileSizeEx';


// easyGetFileSize() is a replacement file size function.  Use it to get the number of bytes in the huge file.  To get the number of records just "div" it by the record size.

function easyGetFileSize(theFileHandle: THandle): Int64;
begin
    if not GetFileSizeEx(theFileHandle, Result) then
        RaiseLastOSError;
end;

// ---- Replacement seek function.  Use this instead of System.Seek().

procedure mySeek(var f: File; recordSize, recNum: Int64);
var
    offsetInBytes, newPosition: Int64;
begin
    // Int64 arithmetic, so the byte offset cannot overflow for a 14 GB file.
    offsetInBytes := recNum * recordSize;

    // Call the Windows seek function directly since Delphi 6 has problems with huge files.
    // newPosition receives the resulting file pointer; it is not used afterwards.
    if not SetFilePointerEx(TFileRec(f).Handle, offsetInBytes, newPosition, FILE_BEGIN) then
        raise Exception.Create(
            '(mySeek) Seek to record number # '
            + IntToStr(recNum)
            + ' failed');
end;
Robert Oschler
  • 2
    This is just DELAYING the inevitable. You should stop using Pascal native I/O and start using TFileStream. – Warren P May 19 '11 at 13:53
1

You can't use Pascal I/O with huge files like this, not in any version of Delphi. Your best bet is to use a TFileStream which has no such limitations.
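As a rough illustration of the stream approach, here is a minimal sketch of random access to fixed-length records through a `TStream`. The record layout `TSampleRecord` and the helper names `RecordCount` and `ReadRecordAt` are hypothetical and only stand in for the real ones; the sketch assumes the `Int64` versions of `Size` and `Position` that the comments above say streams have had for a long time.

```delphi
uses
  Classes, SysUtils;

type
  // Hypothetical fixed-length record; substitute the real layout.
  TSampleRecord = packed record
    Id: Integer;
    Payload: array[0..59] of Byte;
  end;

// Number of whole records in the stream. Size is an Int64 property,
// so it is not truncated for files larger than 2 GB.
function RecordCount(Stream: TStream): Int64;
begin
  Result := Stream.Size div SizeOf(TSampleRecord);
end;

// Reads the record with zero-based index RecNum into Rec.
procedure ReadRecordAt(Stream: TStream; RecNum: Int64; var Rec: TSampleRecord);
begin
  // The multiplication is done in Int64 because RecNum is Int64.
  Stream.Position := RecNum * SizeOf(TSampleRecord);
  // ReadBuffer raises EReadError if the full record cannot be read.
  Stream.ReadBuffer(Rec, SizeOf(Rec));
end;
```

To use it with the huge file, create the stream with something like `TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite)`, pass it to these helpers, and free it in a `try..finally` block. Taking a `TStream` rather than a `TFileStream` parameter keeps the helpers usable with any stream class.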

David Heffernan