0

I am new programmer in general and I have start working now with c. I am trying to decode the IDEv3 mp3 tag and I came across with a variety of problems. While I was using the fread() and strncpy() commands I have noticed that both need to have the \n character as the end reference point. (Maybe I am wrong this is only an observation)

When I am printing the output they produce a non readable character. As a solution to overcome the problem I am using fread() for 4 Bytes instead of 3 in order to produce (8)\n characters (whole Byte), and a second step I am using strncpy() with 3 Bytes to an allocated memory which then I am using for printing. In theory when I am using fread() I should not encounter this problem.

A sample of code:

#include <stdio.h>
#include <stdlib.h>

typedef struct{
  unsigned char header_id[3]; /* Unsigned character 3 Bytes (24 bits) */
}mp3_Header;

int main (int argc, char *argv[]) {

mp3_Header first;
unsigned char memory[4];

FILE *file = fopen( name.mp3 , "rb" );

if ( (size_t) fread( (void *) memory , (size_t) 4 , (size_t) 1 , (FILE *) file) !=1 ) {
  printf("Could not read the file\n");
  exit (0);
} /* End of if condition */

strncpy( (char *) first.header_id , (char *) memory , (size_t) 3);

printf ("This is the header_ID: %s\n", first.header_id);

fclose(file);

} /* End of main */
return 0;
Thanos
  • 1,618
  • 4
  • 28
  • 49
  • Please do something about the formatting – Ed Heal Dec 29 '13 at 23:29
  • So what's the problem with the null character? I don't understand. –  Dec 29 '13 at 23:32
  • @H2CO3 When I am using the: `(size_t) fread( (void *) memory , (size_t) 3 , (size_t) 1 , (FILE *) file)` It produces a funny character, is this suppose to happen? – Thanos Dec 29 '13 at 23:34
  • You don't show where you defined or declared `memory`, which is a bad sign and may well indicate some of the trouble. You should check the result of the `fopen()` call. You should not then cast the `FILE *` that `file` holds in the call to `fread()`. In fact, you should generally aim to code without any casts, though you won't achieve that goal. Casting every possible argument is not a sign of good programming. I also note that the return statement should be inside `main()`, not outside it. – Jonathan Leffler Dec 29 '13 at 23:42
  • @JonathanLeffler I was using these cast based on the man page of the linux. But I think it is not a good thing forcing the program to follow the casts. I will modify my code based on your suggestions. Thank you for your time and effort viewing my code. – Thanos Dec 30 '13 at 04:13

4 Answers4

5

Your observation with '\n' terminating strings isn't correct. Strings, in C, need to be terminated by a 0 byte (\0). However, some functions like fgets(), which are supposed to read lines from a file, take the \n at the end of the line as a terminator.

The problem with your code is that fread() ready binary data, and doesn't try to interpret that data as a string, which means it won't put the \0 at the end. But string functions, like strcpy, need this 0 byte to recognize the end of the string. strncpy stops after copying the \0 as well, but it won't ever put more bytes into the receiving string to prevent a buffer overflow. So it will copy your 3 bytes, but it won't put a \0 to the end of the string, as it would do if the string was shorter than the length argument.

So what you should do is declare header_id with one MORE element that what you actually need, and after the strcpy, set this extra element to \0. Like this:

strncpy( first.header_id , memory , 3);
first.header_id[3] = '\0';

Remember the 3 header bytes will go to array elements 0..2, so element 3 needs the terminator. Of course, you need to declare header_id[4] to have space for the extra \0.

Also note i omitted the type casts - you don't need them if your types are correct anyway. Passing an array to a function will pass a pointer to the 1st element anyway, so there's no need to cast the array header_id to a pointer in strncpy( (char *) first.header_id , (char *) memory , (size_t) 3);.

Guntram Blohm
  • 9,667
  • 2
  • 24
  • 31
  • I can understand where the mistake was coming from, nice analytical explanation. Thank you for your time and effort. – Thanos Dec 29 '13 at 23:41
  • Finally I solved my problem as you suggested I added another unsigned char with size[4] in order to take the '\0' character automatically. I was wondering if I would be able to do it manually. I mean for example that I have a first.header[4] = first.header_id[3] + '\0'; If this process is correct, how can I print all characters to see also the '\0' character. Thanks in advance for your time and effort. – Thanos Dec 30 '13 at 04:30
  • You can't take the \0 automatically, and you don't need another variable. Just remember that, when you're dealing with strings, you have to allocate one more byte than the maximal length of the string, and if you use a non-string-function (like fread) to fill the string, you have to set the last byte to \0 manually. To see all the bytes, you could print them in hex: `for (i=0; i – Guntram Blohm Dec 30 '13 at 06:46
  • Sorry the continuous questions but since I am new to this area I am still experimenting and full of queries. How can I add the `'\0'` sign automatically to the last byte? I am using `fread()` because I am reading a binary file. Currently I am using `first.header[4] = first.header_id[3] + '\0';` and it seems to work fine. The best solution so far to my problem is to use `fread()` by reading 4 Bytes, as a second step `strncpy()` 3 Bytes to char[4] and add manually the `'\0'` character on the 4th Byte. – Thanos Dec 30 '13 at 15:21
  • You can NOT append the \0 automatically if you use fread. Fread is for binary data. Only string functions will append the \0. Second, there is no "append to a string" operator in C. Strings are arrays of `char`. Your `first.header[4] = first.header_id[3] + '\0';` takes the 4th element from this array (remember indexes start at 0 so [3] gets the 4th element), add 0 to the ascii code (which doesn't do anything), and put the result into the 5th element ([4]). Also see my 2nd answer, it's easier to write code in an answer than in a comment. – Guntram Blohm Dec 30 '13 at 15:42
  • It looks more clear to me now. At first I thought that I understand but I was wrong. Thank you again for your time and effort it is very clear now what I have to do. – Thanos Dec 30 '13 at 22:34
2

Yes, C strings always end in the null (0x00) character. It's the programmer's responsibility to understand that, and code appropriately.

For example, if your header_id will be up to a 3-printable-character string, you need to allocate 4 characters in that array to allow for the trailing null. (And you need to make sure that null is actually present.) Otherwise, printf won't know when to stop, and will keep printing until it finds a 0 byte.

keshlam
  • 7,931
  • 2
  • 19
  • 33
  • I was wondering about how the string is terminating. It makes more sense now. Thank you for your time and effort. – Thanos Dec 29 '13 at 23:43
2

When you copy binary data between buffers you should use appropriate function for the job, like memcpy(). Because you are dealing with binary data you must know exactly the length of the buffer as there is no null characters to indicate the end of data.

To make it a string simply allocate length+1 buffer and set the last byte to '\0' and voila, you have a string. However.. it is possible that there is already a null character in the binary data you copied so you should do some sanity checks before trusting it to really be a string you wanted. Something like \001 might be invalid id for mp3 format.. but it might be a broken file, you never know what you are dealing with.

cen
  • 2,873
  • 3
  • 31
  • 56
2

There are 2 correct ways of handling the header. I'm assuming the MP3 file has a IDV3 tag, so the file starts with "TAG" or "TAG+". So the part you want to read has 4 bytes.

a) You think of char *memory being a C "string", and first.header_id as well. Then do it this way (omitted everything else to show the important parts):

typedef struct{
  unsigned char header_id[5];
} mp3_Header;
char memory[5];

fread(memory, 4, 1, file);
memory[4]='\0';
strncpy(first.header_id, memory, 5)

After the fread, your memory looks like this:

   0    1    2    3    4
+----+----+----+----+----+
|  T |  A |  G |  + |  ? |
+----+----+----+----+----+

The 5th byte, at index 4, is not defined, because you read only 4 bytes. If you use a string function on this string (for example printf("%s\n", memory)); the function doesn't know where to stop, because there is no terminating \0, and printf will continue to output garbage until the next \0 it finds somewhere in your computer's RAM. That's why you do memory[4]='\0' next so it looks like this:

   0    1    2    3    4
+----+----+----+----+----+
|  T |  A |  G |  + | \0 |
+----+----+----+----+----+

Now, you can use strncpy to copy these 5 bytes to first.header_id. Note you need to copy 5 bytes, not just 4, you want the \0 copied as well.

(In this case, you could use strcpy (without n) as well - it stops at the first \0 it encounters. But these days, to prevent buffer overflows, people seem to agree on not using strcpy at all; instead, always use strncpy and explicitly state the length of the receiving string).

b) You treat memory as binary data, copy the binary data to the header, and then turn the binary data into a string:

typedef struct{
  unsigned char header_id[5];
} mp3_Header;
char memory[4];

fread(memory, 4, 1, file);
memcpy(first.header_id, memory, 4)
first.header_id[4]='\0';

In this case, there is never a \0 at the end of memory. So it's sufficient to use a 4-byte-array now. In this case (copying binary data), you don't use strcpy, you use memcpy instead. This copies just the 4 bytes. But now, first.header_id has no end marker, so you have to assign it explicitly. Try drawing images like i did above if it isn't 100% clear to you.

But always remember: if you use operators like '+', you do NOT work upon the string. You work on the single characters. The only way, in C, to work on a string as a whole, is using the str* functions.

Guntram Blohm
  • 9,667
  • 2
  • 24
  • 31
  • It looks very clear and simple to understand based on your pictures. I fully understand my mistake, and I suppose the best solution is to use the `strncpy` and also append the `'\0'` sign at the end so the data will become string. Thank you for your time and effort. Some times simple things take more time than more complicated. – Thanos Dec 30 '13 at 22:32