Right now I am using istream to read in data. I have text, which I would like to read in as strings and numbers, and hashes, which are read into character arrays. Since hashes are effectively random, I am hitting snags when I try to read them back in and hit EOF (which is part of the hash). Is there a way to accomplish this without resorting to fread? Also, is there a way to use both istream and fread so I don't have to parse the integers and strings by hand? Finally, what is the best way to use fgets to get a string of unknown length?

Thanks, Eric

EDIT: Here is the code:

string dummy;
ifstream in(fileName);
for(int i=0; i<numVals; i++)
{
    int hashLen;
    in>>hashLen;
    char cc;
    in.get(cc);//Get the space in between
    cout<<"Got first byte: "<<(int)cc<<endl;

    char * hashChars = new char[hashLen];
    in.read(hashChars, hashLen);
    for(int j =0; j <hashLen; j++)
    {
        char c = hashChars[j];
        unsigned char cc = reinterpret_cast<unsigned char&>(c);
        cout<<"Got byte: "<<(int)c<<(int)cc<<endl;
        if(in.fail())
        {
            cout<<"Failed! "<<in.eof()<<" "<<in.bad()<<endl;
        }
    }

    delete hashChars;

    getline(in,dummy);//get a dummy line
    cout<<"Dummy: "<<dummy<<" numvals: "<<numVals<<" i: "<<i<<" hashLength: "<<hashLen<<endl;
}

My output looks like:

1>Got first byte: 32

1>Got byte: 4 4

1>Got byte: -14 242

1>Got byte: 108 108

1>Got byte: 87 87

1>Got byte: 113 113

1>Got byte: -116 140

1>Got byte: -106 150

1>Got byte: -35 221

1>Got byte: 0 0

1>Got byte: -91 165

1>Got byte: 39 39

1>Got byte: 111 111

1>Got byte: 7 7

1>Got byte: 126 126

1>Got byte: 16 16

1>Got byte: -42 214

1>Dummy: numvals: 35 i: 12 hashLength: 16

1>Got first byte: 32

1>Got byte: 14 14

1>Failed! 1 0

1>Got byte: -65 191

1>Failed! 1 0

1>Got byte: -107 149

1>Failed! 1 0

1>Got byte: -44 212

1>Failed! 1 0

1>Got byte: -60 196

1>Failed! 1 0

1>Got byte: -51 205

1>Failed! 1 0

1>Got byte: -51 205

1>Failed! 1 0

Eric Kulcyk
  • A sample of your data and the code you're using to read it would go a *long* way in getting answers to what is going wrong. And regarding parsing integers and strings by hand, formatted extractors with `istream`s are the cat's pajamas. – WhozCraig Aug 16 '13 at 23:56
  • EOF cannot be part of the hash, because EOF is a concept that cannot be embedded inside of a file. However you are reading in data, that's wrong. – Mooing Duck Aug 17 '13 at 00:04
  • Yes, streams can read binary data. Streams come with functions for parsing integers and strings of varying lengths. Do not use fgets, fread, or other C functions in C++ please. – Mooing Duck Aug 17 '13 at 00:06
  • That code doesn't match that output – Mooing Duck Aug 17 '13 at 00:11
  • The 1> was added by a different part of the code. Also, I realize I'm missing a couple of lines. – Eric Kulcyk Aug 17 '13 at 00:13
  • You talk about problems you're having, but your code doesn't seem to have any of those problems. Please explain the exact problems you're encountering. You say you want to read a string of unknown length. How will you know how much to read? Will you first read in the length? Does it stop at a newline or space? – Mooing Duck Aug 17 '13 at 00:13
  • It's hitting eof(). It shouldn't be. There is still more to read. However, it continuously reads "205" as you can see are the last two numbers it prints out. – Eric Kulcyk Aug 17 '13 at 00:17
  • What I mean when I say "It shouldn't be" is that it is not the end of the file. I can open it with notepad and see that. – Eric Kulcyk Aug 17 '13 at 00:19
  • You read the data before entering the for loop, so why do you call in.fail() inside the for loop if you are not reading data inside it? – rbelli Aug 17 '13 at 00:23
  • You should have `if (in.read(...)) { ... rest of code in here ... }`. – DanielKO Aug 17 '13 at 00:23
  • I just put the fail check in there to print out what was happening. I don't expect to need it because the file is a fixed format and I know the data will be there. However, when I started to read all "205"s after a certain point, I put the check in to see what was wrong. – Eric Kulcyk Aug 17 '13 at 00:31

1 Answer

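When reading binary data you generally want to open your std::ifstream with the flag std::ios_base::binary. The differences from text mode are fairly small, but they do matter: on Windows, for example, text mode translates line endings and may treat a Ctrl-Z byte (0x1A) as the end of the file.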
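As a minimal sketch of opening a file in binary mode, the function below reads every byte of a file unmodified. The name readAllBytes is made up for this illustration and is not part of the original code.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): returns the raw bytes of a file.
// std::ios_base::binary asks the stream not to translate any bytes on the way in.
std::vector<char> readAllBytes(const std::string& fileName)
{
    std::ifstream in(fileName, std::ios_base::in | std::ios_base::binary);
    std::vector<char> bytes;
    char c;
    // get() is an unformatted read, so whitespace and control bytes are kept.
    while (in.get(c)) {
        bytes.push_back(c);
    }
    return bytes;
}
```

Applied to the question's code, the only change would be the second constructor argument: ifstream in(fileName, std::ios_base::binary).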

There are a few oddities in your code you might want to fix:

  • You always need to check after reading if the operation was successful, e.g., using if (in.read(hashChars, hashLen)) { ... }
  • There is no need to use reinterpret_cast<...>() which always has implementation defined semantics. You should use static_cast<unsigned char>(c) instead.
  • You allocate an array of characters but you release it using delete p. You need to use delete[] p instead. Using delete p results in undefined behavior. There isn't really any need to use new and delete at all, though, as std::vector<char> hashChars(hashLen) does automatic memory management.

There are a few [mutilated] questions otherwise embedded in the request above (so the questions/answers are guesses of what is being asked):

  • Can you mix std::istream::read() and fread() on the same stream (I suppose this is the question): not immediately unless the stream happens to be std::cin which reads from the same source as stdin. If you want to use both std::istream::read() and fread() on the same file you'll need to wrap a FILE* by a suitable std::streambuf and initialize an std::istream with the corresponding object.
  • How to read an arbitrary sized line with fgets()? You can't. The buffer to fgets() gets allocated before attempting to fill it and can always be filled before reaching a newline. You can use std::getline() to read an arbitrary long line, however. If you just want to skip the line, you can use in.ignore(std::numeric_limits<std::streamsize>::max(), '\n') when using std::istream. Off-hand I don't know if there is a similar operation for FILE*.
Thanatos
Dietmar Kühl
  • Thanks for those suggestions, I forgot about the binary flag, which appears to make sure that ^Z (the Windows eof character) is ignored. Now it reads the entire file. – Eric Kulcyk Aug 17 '13 at 01:11