4

On Linux, is it normal that there is no null character at the end of a file?

I created an empty file, opened it with Mousepad, typed az, and saved it.

When I open the file in a hex editor, there is no null character, but there is a 0a at the end.

Which character should I put at the end of a file when I write it with a system call: 0a or 0?

Thanks

kim taeyun

4 Answers

7

There is usually no null character at the end of files on Unix. An empty text file has zero bytes. One empty line will have a 0x0A (LF, linefeed) character. Unix text files have single LF line endings.

Keith
  • Thank you for replying. Then how do I know when I have reached the end of a file? A text file has an LF at the end if it has at least one line, but what about a binary file? – kim taeyun Mar 28 '11 at 01:18
  • @kimtaeyun [`read(2)`](http://man7.org/linux/man-pages/man2/read.2.html) returns 0 when you're already at the end of a file. So `while ((bytes = read(fd, buf, bufsize)) > 0) { do something with buf[0..bytes-1] }` is a common idiom for reading a file in bufsize chunks until EOF; the `> 0` test also stops the loop if `read` returns -1 on error. – Peter Cordes Oct 09 '16 at 18:04
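
The idiom from the comment above can be sketched as a small C function. This is only an illustration under a few assumptions: the helper name `count_bytes_until_eof` and the 4096-byte buffer size are made up for the example, and error handling is kept minimal. The point is that end of file is detected by `read` returning 0, not by finding any particular byte in the data.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: counts the bytes in a file by calling read(2)
 * until it returns 0 (end of file). Returns -1 on error. */
long count_bytes_until_eof(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char buf[4096];              /* arbitrary chunk size for this sketch */
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* buf[0..n-1] holds the data just read */
        } else if (n == 0) {
            break;               /* 0 means end of file; no sentinel byte involved */
        } else if (errno != EINTR) {
            close(fd);
            return -1;           /* a real error */
        }
        /* n < 0 with errno == EINTR: interrupted by a signal, retry */
    }

    close(fd);
    return total;
}
```

For a file that contains only `az` followed by a linefeed, this returns 3. The same loop works unchanged for binary files, because no byte value is treated specially.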
5

The filesystem records the number of bytes in a file, and all the bytes are free to have any value - no particular character/byte value is a reserved sentinel value meaning end-of-file. So, you can have a NUL anywhere in the file, but don't need one to mark the end.

Each line in a text file should indeed be terminated with a linefeed, ASCII 10 dec, 0A hex (on Windows it'd be a carriage return, ASCII 13 dec, 0D hex, followed by a linefeed). If you create a file with echo > filename it will contain one linefeed, but only because echo prints an empty line by default. If you instead used touch filename it would be completely empty.

When you run cat > filename and type things into your terminal/console window, you eventually press Control-D to trigger an end-of-file condition (on Linux; Control-Z in DOS), but that character is not stored in the file itself.
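
To make this concrete for the "write the file with a system call" case, here is a minimal sketch in C. The helper name `write_text_file` is hypothetical, and the permission bits `0644` are just a common choice. It writes exactly `strlen(text)` bytes, so the `'\0'` that terminates the C string in memory is never written to the file. If the text should end with a newline, the `0a` has to be part of the data you pass in.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes a C string to a file with write(2).
 * Only strlen(text) bytes are written, so the string's terminating
 * '\0' (an in-memory convention) never reaches the file.
 * Returns 0 on success, -1 on error. */
int write_text_file(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t left = strlen(text);
    while (left > 0) {
        ssize_t n = write(fd, text, left);   /* may write fewer bytes than asked */
        if (n < 0) {
            close(fd);
            return -1;
        }
        text += n;
        left -= (size_t)n;
    }

    return close(fd);
}
```

Calling `write_text_file("out.txt", "az\n")` should produce a three-byte file, `61 7a 0a` in a hex editor, which matches what the text editor saved in the question.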

Tony Delroy
  • Thank you. Could you tell me how to get the number of bytes in a file? How do I access the file size recorded by the filesystem? – kim taeyun Mar 28 '11 at 01:27
  • @kim: Probably the best portable way is to fseek() to the end of the file, then ftell() the file position. – Zan Lynx Mar 28 '11 at 01:32
  • 1
    @Kim, @Zan: `fseek`'s certainly a reasonable way. `stat` and `fstat` can also be very convenient, and give you other nice details like modification/access/status-change timestamps and ownership info. One other useful thing is that it's safe to try to read past the end of the file - the `fread` and `read` functions will tell you how many bytes they read: less than you asked for (including 0) would typically indicate end of file. `fgets` reads one newline-delimited line of text (or as much of one as possible until end-of-file), and returns NULL when it reaches end-of-file before reading any characters. – Tony Delroy Mar 28 '11 at 02:03
  • If I want to do this in assembly, do I have to add up all the return values of the read system call? If I use something like fseek() or ftell(), isn't that the same as using C? I want to understand how things work at the assembly level. Thanks – kim taeyun Mar 28 '11 at 02:43
  • @kim: from assembly language you can still call a C function - you just need to know the calling convention, which boils down to how you put the values that the function expects into CPU registers and the stack, get the function to execute, then receive the results. An easy way to show this is to write a trivial C program to call the function, then use gcc -S to get assembly output. System calls are often different to normal C function calls too. There are lots of tutorials online, e.g. http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html#syscalls – Tony Delroy Mar 28 '11 at 04:44
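
The two ways of getting a file's size mentioned in these comments could look roughly like this in C. The function names `size_via_stat` and `size_via_ftell` are invented for the example, and the result is squeezed into a `long` for simplicity, which assumes the file is small enough to fit.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Size as recorded by the filesystem, via stat(2). Returns -1 on error. */
long size_via_stat(const char *path)
{
    assert(path != NULL);

    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Size via fseek()/ftell(): seek to the end, then ask for the position.
 * Returns -1 on error. */
long size_via_ftell(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0)
        size = ftell(f);

    fclose(f);
    return size;
}
```

Both functions should agree for a regular file, and both count NUL bytes inside the file like any other byte. From assembly, the `stat` route maps more directly onto a single system call, whereas `fseek`/`ftell` are C library functions layered on top of `lseek`.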
2

0a is the newline, aka control-J or \n. Text files don't normally end with a null character in Unix.

Tom Zych
-1

It depends on who implemented the file format.

Daniel A. White