1

I'm trying to compile and run the open source code for Alien vs Predator (2000). See https://app.assembla.com/spaces/avp_mod/git/source. The level map data is stored in .RIF files, which are compressed. The first step on each level load is to read the data from the file before decompressing it. In VS 2017, the read appears to truncate after a certain number of characters because it starts pulling in negative character codes (-44, for example). I am opening the file with:

std::ifstream inFile;
inFile.open(file_name, std::ios::in | std::ios::binary | std::ios::ate);

Not certain it matters, but the last character it successfully pulls in is a null (\0). After that, all values are negative. Any idea how to read this file correctly? I can provide more information if needed.

I've also tried reading each character in one at a time, which is how I determined that the negatives were being pulled in.

Update: original code referenced in my comment below. On last line, "buffer" is filled with characters up to the point where the negative values begin to come in. This code, I assume, worked correctly as written in the original compiler (VS2010, I believe).

unsigned long bytes_read;
char * buffer;
char * buffer_ptr;
char id_buffer[9];
HANDLE rif_file = CreateFileA(file_name, GENERIC_READ, 0, 0, OPEN_EXISTING, FILE_FLAG_RANDOM_ACCESS, 0);
DWORD file_size = GetFileSize (rif_file, NULL);
if (!ReadFile(rif_file, id_buffer, 8, &bytes_read, 0)) {
CloseHandle(rif_file);
return 0;
}
buffer = new char[file_size];
if (!ReadFile(rif_file, buffer + 8, (file_size - 8), &bytes_read, 0))

Update 2: Link to one of the rif files: https://drive.google.com/open?id=18BJR_6CkeHPU25u1DY6RGQQVWmR-kGdP

Update 3: my test code:

const char * file_name = "E3demoSP.RIF";
std::ifstream inFile;
size_t size = 0;
inFile.open(file_name, std::ios::in | std::ios::binary);
char* oData = 0;
char ch;
inFile.seekg(0, std::ios::end);
size = inFile.tellg();
std::cout << "Size of file: " << size;
inFile.seekg(0, std::ios::beg);
oData = new char[size + 1];
int counter = 0;
while (inFile >> std::noskipws >> ch) {
oData[counter] = ch;
counter++;
}
return 0;

  • 3
    Why `ate` ('at end')? Wouldn't you want to read the file from beginning to end? Did you check stream state after input operations? With `ate`, you might be trying to read past the end of the file and just getting garbage values. But that's just guessing around, without a [mcve] it's impossible to give precise advice... – Aconcagua Aug 05 '19 at 22:51
  • Possible duplicate of [Negative ASCII value](https://stackoverflow.com/questions/4690415/negative-ascii-value) – Steve Aug 05 '19 at 23:14
  • I'm not certain the "ate" is relevant. See the original code example I appended to my original question. The behavior manifests in the original code I pulled from the repository, which was originally compiled in VS2010, I believe, which may or may not be relevant, though I suspect that it might be. – robert rice Aug 06 '19 at 00:16
  • How many bytes are read, according to the value stored in `bytes_read`? – JaMiT Aug 06 '19 at 02:13
  • bytes_read = 639008 after the last statement. But buffer contains only "ÍÍÍÍÍÍÍÍì¾\t". – robert rice Aug 06 '19 at 12:21

2 Answers

1

Let's take a look at the format of the file you're reading:

$ od -x -a E3demoSP.rif  | head

0000000    4552    4342    4952    3146    beec    0009    bf20    0011
          R   E   B   C   R   I   F   1   l   >  ht nul  sp   ? dc1 nul
0000020    0001    0000    0000    0000    0000    0000    0001    0000
        soh nul nul nul nul nul nul nul nul nul nul nul soh nul nul nul
0000040    0001    0000    0004    0000    0008    0000    0010    0000
        soh nul nul nul eot nul nul nul  bs nul nul nul dle nul nul nul
0000060    002a    0000    0060    0000    0058    0000    d7cb    cda5
          * nul nul nul   ` nul nul nul   X nul nul nul   K   W   %   M
0000100    a9af    dd8d    d1ed    c689    a281    879f    8fcf    93db
          /   )  cr   ]   m   Q  ht   F soh   "  us bel   O  si   [ dc3

Okay, it starts with what looks like a "magic string" identifying the file format, REBCRIF1, followed by binary data.

In fact, the first bit of code you quote above,

if (!ReadFile(rif_file, id_buffer, 8, &bytes_read, 0)) {  
 CloseHandle(rif_file);  
 return 0;  
}  

grabs this eight-byte identifier so that the subsequent read into the buffer won't get it.

Any byte in the dump whose first hex digit is 8 or above has its high bit set, so it represents a negative value if your char type is signed. (Note that od -x prints little-endian 16-bit words, so beec is the byte ec followed by be.) The standard leaves the signedness of plain char to the implementation, but with MSVC on Windows it is signed by default, so negative values here are expected.
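To make the point about the high bit concrete, here is a minimal sketch of how a raw byte can be inspected without the sign getting in the way. The helper names byte_value and byte_to_hex are made up for illustration and are not part of the AvP source. In the dump above, the first byte after the magic string is 0xEC.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: numeric value of a raw byte in the range 0..255,
// regardless of whether plain char is signed on this compiler.
int byte_value(char c) {
    // Going through unsigned char turns 0xEC into 236 instead of -20.
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: format one raw byte as two lowercase hex digits.
std::string byte_to_hex(char c) {
    char out[3];  // two digits plus the terminating NUL
    std::snprintf(out, sizeof out, "%02x",
                  static_cast<unsigned>(static_cast<unsigned char>(c)));
    return std::string(out);
}
```

With a signed char, the byte 0xEC shows up as -20 in a watch window, while byte_value returns 236 and byte_to_hex returns "ec". The bits stored in the buffer are the same in both cases.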

The original code,

 buffer = new char[file_size];   
if (!ReadFile(rif_file, buffer + 8, (file_size - 8), &bytes_read, 0))
{
  //...
}

should read everything after REBCRIF1 into buffer. Is it not?

I suspect the issue has nothing to do with characters with their highest bit set, and everything to do with the fact that you're opening the file with the std::ios::ate flag set. However, without seeing your actual code there's not much else we can say.
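For comparison, a minimal sketch of reading a whole binary file with std::ifstream, without std::ios::ate and without formatted extraction, could look like the following. The function name read_all_bytes is an assumption for illustration; it is not how the original code loads RIF files.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file as raw bytes.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<unsigned char> read_all_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};

    // Find the size by seeking to the end, then rewind before reading.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return {};
    in.seekg(0, std::ios::beg);

    std::vector<unsigned char> data(static_cast<std::size_t>(size));
    in.read(reinterpret_cast<char*>(data.data()), size);

    // gcount() reports how many bytes the last unformatted read delivered.
    data.resize(static_cast<std::size_t>(in.gcount()));
    return data;
}
```

The length of the result is data.size(), not whatever a string view of the buffer shows. A buffer holding binary data will usually contain 0x00 bytes, so anything that treats it as a null-terminated string stops early.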

Daniel McLaury
  • According to the "bytes read" count, it is reading everything into buffer, but the buffer variable itself does not reflect that. bytes_read = 639008 after the ReadFile statement. But buffer contains only "ÍÍÍÍÍÍÍÍì¾\t". – robert rice Aug 06 '19 at 12:24
  • What do you mean it contains "only" that? The buffer is as big as the file. It contains *something*. – Daniel McLaury Aug 06 '19 at 12:28
  • When I look at it in the "Locals" window, that is all that is displayed for the value of the variable. I copied that directly from the Locals window. – robert rice Aug 06 '19 at 12:45
  • I removed the std::ios:ate from my code. The resulting buffer still contains only "REBCRIF1ì¾\t". See original comment for my full test code. – robert rice Aug 06 '19 at 12:45
  • It contains more stuff; you're just trying to view it as a null-terminated string when it's not. Compare to the hex dump above: your debugger is just stopping at the first instance of 0x00. – Daniel McLaury Aug 06 '19 at 12:47
0

If you are reading into the char type, this is because char is signed on your compiler, so it can only take values between -128 and 127.

#include <iostream>
#include <limits>

int main() {
    std::cout << "char min: " << (int)std::numeric_limits<char>::min() << std::endl;
    std::cout << "char max: " << (int)std::numeric_limits<char>::max() << std::endl;
    return 0;
}

Output:

char min: -128
char max: 127
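Whether plain char is signed depends on the compiler and target, so it can be worth checking rather than assuming. A small sketch follows; the function names are made up for illustration.

```cpp
#include <limits>

// Hypothetical helper: true when plain char is a signed type
// on this compiler and target.
constexpr bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// Range of unsigned char, which is 0..255 when a byte has 8 bits.
constexpr int uchar_min() { return std::numeric_limits<unsigned char>::min(); }
constexpr int uchar_max() { return std::numeric_limits<unsigned char>::max(); }
```

If plain_char_is_signed() returns true, bytes of 0x80 and above read into a char will print as negative numbers. Storing them in unsigned char instead keeps them in the 0 to 255 range.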
Steve
  • This is likely the cause, but you should really post more code, as it's kind of a guess. But it's probably this. – Steve Aug 05 '19 at 23:13
  • `char` is not *necessarily* signed (standard leaves this to the implementation). Some compilers (e. g. GCC) even provide switches to chose explicitly if `char` is signed or unsigned... – Aconcagua Aug 05 '19 at 23:17
  • See my addendum to original question. – robert rice Aug 06 '19 at 00:16