2

Here is an excerpt of the code I'm trying now in relation to my previous problems in cs50 'Recover'.

I've tried everything with fread, but for some reason it simply isn't working the way I want it to.

In this loop, I'm trying to figure out how fread actually works.

My question with fread is - does it read 512 bytes of data 1 at a time from file pointed to (rawdata) each time it's called? In that case, my code should work since the loop is running indefinitely, calling the function over and over and hence moving the stream position/file cursor (I don't know what its called) 512 bytes at a time. I have a break from the loop with feof(rawdata).

I'm using this small program to aid me with Recover from cs50 pset4.

    // In a loop, until the end of file has been reached,
    while (true) {
        // Zeroing counter
        jpg_counter = 0;
        // Reading 512 bytes into the array of bytes
        fread(bytes, 512, 1, rawdata);
        
        // Searching the 512 bytes for JPEG signatures - bytes[3] with bitwise AND
        if (bytes[0] == 0xff && bytes[1] == 0xd8 && bytes[2] == 0xff && (bytes[3] & 0xf0) == 0xe0) {
            jpg_counter++;
        }
        
        // If found JPG, add total and keep going till end of file
        if (jpg_counter != 0) {
            total++;
        }
 
        //feof returns a non zero value only at the end of a file
        if (feof(rawdata) != 0) {
            break;
        }
    }
    printf("%i\n", total);
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • Please note, 'bytes' is my previously declared array of bytes of size 512 – thefailagent Aug 27 '20 at 08:32
  • 3
    you should call `fread(bytes, 1, 512, rawdata)` and check the returned value, see document of this function here http://www.cplusplus.com/reference/cstdio/fread/ – Nghia Bui Aug 27 '20 at 08:37
  • Hi, I've tried this. For some reason, I got a big fat 0 returned by fread for both fread(bytes, 1, 512, rawdata) and fread(bytes, 512, 1, rawdata), even though perror(rawdata) returned success and it definitely isn't the end of file. – thefailagent Aug 27 '20 at 08:39
  • @thefailagent is your file big enough? – th33lf Aug 27 '20 at 08:40
  • 2
    Yes the file is much bigger (exact size unknown) – thefailagent Aug 27 '20 at 08:45
  • You're only checking for the JPEG signature if it starts at an offset divisible by 512. I assume you need to find it at any offset, including the middle of the buffer, or even spanning two different 512-byte buffers. – interjay Aug 27 '20 at 08:47
  • @thefailagent You can't do `perror(rawdata)`. Is it `ferror(rawdata)` you're looking for? – stderr Aug 27 '20 at 08:49
  • Oh I didnt know that existed. I will try it now – thefailagent Aug 27 '20 at 08:57

2 Answers2

1

There are 2 major problems with your approach:

  • you do not correctly test for end of file in the reading loop. You should use the return value of fread to determine if you have read at least one byte from the stream, and you should pass the arguments in the opposite order: you are reading as many as 512 bytes:

      size_t n;
      while ((n = fread(bytes, 1, 512, rawdata)) > 0) {
          // n bytes were read from the file.
    
  • you only test the first 4 bytes for a JPEG signature. Unless the file is known to have a very specific structure, you probably want to search the block for signatures.

  • care must be taken to handle signatures overlapping blocks, ie: that start at the end of a 512 byte block but end on the next block.

Here is a modified version:

int count_jpeg(FILE *rawdata) {
    unsigned char bytes[3 + 512];
    int pos = 0, end, i;
    int total = 0;
    size_t nread;
    // In a loop, until the end of file has been reached,
    while ((nread = fread(bytes + pos, 1, 512, rawdata)) > 0) {
        end = pos + nread - 3;
        for (i = 0; i < end; i++) {
            // Searching the block for JPEG signatures - bytes[3] with bitwise AND
            if (bytes[i] == 0xff && bytes[i + 1] == 0xd8
            &&  bytes[i + 2] == 0xff && (bytes[i + 3] & 0xf0) == 0xe0) {
                total++;
                i += 3;
            }
        }
        // copy the last 3 bytes to the beginning of the block
        for (i = pos = 0; i < 3; i++)
            if (end + i >= 0) {
                bytes[pos++] = bytes[end + i];
        }
    }
    printf("%i\n", total);
    return total;
}

EDIT: if the JPEG signatures are known to be aligned on 512-byte boundaries, the scanning loop can be removed, but reading a partial block at the end of the file should still be possible:

int count_jpeg(FILE *rawdata) {
    unsigned char bytes[512];
    int total = 0;

    // In a loop, until the end of file has been reached,
    while (fread(bytes + pos, 1, 512, rawdata) > 4) {
        /* check for a signature at the beginning of the block */
        if (bytes[0] == 0xff && bytes[1] == 0xd8
        &&  bytes[2] == 0xff && (bytes[3] & 0xf0) == 0xe0) {
            total++;
        }
    }
    printf("%i\n", total);
    return total;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • Thank you for your comment! I have used the first four bytes because the problem set in question explicitly stated that the JPEG signatures would be block aligned always to the start of a 512 byte block followed by slack space, if any. But I guess, it is better practice to search through the entire block anyway. I will try implement this loop and see what I can understand from it. Thanks again! – thefailagent Aug 31 '20 at 02:55
  • @thefailagent: I was not aware of this restriction that greatly simplifies the problem. – chqrlie Aug 31 '20 at 06:56
0

man page of fread states

The function fread() reads nmemb items of data, each size bytes long, from the stream pointed to by stream, storing them at the location given by ptr

so you can store number of elements read in a variable and check if 512 bytes are read.

int nmenb = fread(bytes, 512, 1, rawdata);

if( nmemb != 1 )
{
  //exit
}
else{
      // 512 bytes read successfully 
}
Samir Kape
  • 1,733
  • 1
  • 16
  • 19