0

I am trying to parse these 30 bytes:

00000000: 4353 4333 3630 4653 0200 0000 1900 0000
00000010: 0002 0000 0032 0000 0035 0000

Into a struct (also 30 bytes):

struct superblock_t {
uint8_t fs_id [8];
uint16_t block_size;
uint32_t file_system_block_count;
uint32_t fat_start_block;
uint32_t fat_block_count;
uint32_t root_dir_start_block;
uint32_t root_dir_block_count;
} PACKED;

I am first reading the 30 bytes into a buffer then using memcpy() to copy desired memory.

unsigned char buf[30]; //should this buffer be size_t 31?
read(file_descriptor, buf, 30);

struct superblock_t super_block; //initilize a super_block
memcpy(super_block.fs_id, buf, 8);
memcpy(super_block.block_size, buf + 8, 2);
memcpy(super_block.file_system_block_count, buf + 10, 4);
// and so on for additional attributes.

I am then getting a segfault error :(

Am I misusing the memcpy() function? Sorry, I'm coming from c++ for reference.

  • 1
    Please provide a [mre], including the compiler warnings on strict level which you get. – Yunnosch Mar 24 '22 at 06:45
  • Looks in some case like you did not read the required parameter data types in the spec. Did you e.g. study https://en.cppreference.com/w/c/string/byte/memcpy ? – Yunnosch Mar 24 '22 at 06:47
  • You probably want `memcpy(super_block.fs_id, buf, 8);` -> `memcpy(&super_block.fs_id, buf, 8);` etc. Read the documentation of `memcpy` closely. `super_block.block_size` is an `uint16_t`, but `memcpy` wants a _pointer_. – Jabberwocky Mar 24 '22 at 06:54
  • @Jabberwocky is that where you would want to use -> syntax for attribute data location? – Chris Kouts Mar 24 '22 at 06:56
  • 2
    @ChrisKouts no, not at all. Coming from C++ you should know that. `super_block` is not a pointer. – Jabberwocky Mar 24 '22 at 06:56
  • Just to determine your level of understanding. Do you know the meaning and difference of `int` vs. `int*` and `MyInt` vs. `&MyInt`, i.e. the address-related operators `*` and `&`? The latter, admittedly might be much simpler (not harder) than you are used to from C++. Do you know that the reference meaning of `&` of C++ does not exist in C? – Yunnosch Mar 24 '22 at 07:00
  • Apologies, I am aware of the difference. I had forgotten the `&` operator existed this will now be a breeze. Thanks for the assistance! :) – Chris Kouts Mar 24 '22 at 07:03
  • 1
    @Yunnosch to validate my understanding when I was using `memcpy()` on `fs_id` the `&` was not needed as its already an array? – Chris Kouts Mar 24 '22 at 07:05
  • Yes, indeed. True. – Yunnosch Mar 24 '22 at 07:06
  • 1
    Out of scope: Did you consider potentially different endianess? – Yunnosch Mar 24 '22 at 07:07
  • Yes, the super Block ordering is standardized I believe. – Chris Kouts Mar 24 '22 at 07:11

2 Answers2

2
unsigned char buf[30]; //should this buffer be size_t 31?

You can't infer 30 or 31 bytes accurately. In your case, the member fields of the struct The compiler is free (and will) apply padding bytes between members of the struct and/or at the end of the struct.

Given that uint32_t file_system_block_count is preceeded by 10 bytes (not a number divisible by 4), you can't assume that the address of this field is exactly 10 bytes offset from the start of the struct instance. That's a very likely place for padding to exist.

You could to this:

unsigned char buf[sizeof(struct superblock_t)];
read(file_descriptor, buf, sizeof(struct superblock_t));

And that works so long as the file was written with code generated by the same compiler and compiler flags such that the same padding bytes are applied to both the reading and writing code. Another program saving that file, which was compiled differently, might have a different size for superblock. (We could also discuss difference of endianness of integers, but I digress).

A better solution is to always serialize the reading and writing of your struct on a member by member basis

struct superblock_t block = {0};
read(file_descriptor, &block.fs_id, sizeof(block.fs_id));
read(file_descriptor, &block.block_size, sizeof(block.block_size));
read(file_descriptor, &block.file_system_block_count, sizeof(block.file_system_block_count));
...

The above implies you wrote the members out to file using a similar code path that issued individual write calls for each field.

selbie
  • 100,020
  • 15
  • 103
  • 173
  • This solution is a work of art, thanks for the tips! :) – Chris Kouts Mar 24 '22 at 07:07
  • Isn't the solution impacted by endieness? – TruthSeeker Mar 24 '22 at 07:19
  • @TruthSeeker - I did drop a hint about [endianess](https://en.wikipedia.org/wiki/Endianness) in my answer. I didn't want to overburden the op with too much information. But I do recommend to the OP that they checkout the hton ntoh functions here: https://linux.die.net/man/3/htonl so they can achieve perfect file interop between big endian and little endian systems. – selbie Mar 24 '22 at 18:27
0

The prototype (found e.g. here https://en.cppreference.com/w/c/string/byte/memcpy ) is

void* memcpy( void *dest, const void *src, size_t count );

It wants pointers for destination.
In more than one case you provide integers, writing at runtime to whatever the compiler understands there does not surprise me by causing a segfault.
And the warnings you quote say the same.

So the answer to your question is yes.
You are misusing the memcpy function.

You need to provide pointers instead of copies.
E.g.:

memcpy(&(super_block.block_size), buf + 8, 2); // the generous () for clarity

(Not discussing the assumption-based 2 here. Though I think there are better ways.)

Yunnosch
  • 26,130
  • 9
  • 42
  • 54
  • is the initialized structs attribute an invalid destination? – Chris Kouts Mar 24 '22 at 06:54
  • 1
    No. But in order to point them out as destination you need to point to them, not copy their value. Please compare the datatypes. Pointers to destination are needed, not copies of their values. – Yunnosch Mar 24 '22 at 06:55