I zipped a large regular unix file (.dat) using the `tar -cvzf` command. The file is around 200 GB in size; after zipping it became 27 GB. But while reading the data in that zipped file I can see anonymous data added at the start of the file. Is this possible? I tried to unzip that file again and found that unzipped file has no such anonymous records.
-
Please explain what is an anonymous record for you. – Basile Starynkevitch May 12 '20 at 06:10
-
I can see the file name, permissions, and a few more bytes of data, together with '\x00' bytes – Sudoshree May 12 '20 at 07:06
-
Yes, see the header file `tar.h` mentioned in my answer – Basile Starynkevitch May 12 '20 at 12:19
-
Yup, that helped a lot. Thanks. – Sudoshree May 12 '20 at 12:40
1 Answer
The GNU tar command is free software, so you can study its source code. Also read its man page, tar(1).
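Indeed, a `tar` archive starts with a header documented in the header file `tar.h`. That header block holds the file name, permissions (mode), owner, size and other metadata in NUL-padded fields, which is what you are seeing at the start of the archive. There is a POSIX standard related to `tar`.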
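To make the header concrete, here is a minimal sketch in Python (not part of GNU tar) that builds a tiny archive in memory with the standard `tarfile` module and reads the leading fields of the first 512-byte block. The member name `sample.dat`, its payload, and the helper `build_tar` are made up for illustration; the field offsets follow the ustar layout (name at 0, mode at 100, size at 124, magic at 257).

```python
import io
import tarfile


def build_tar(name: str, payload: bytes) -> bytes:
    """Create an uncompressed ustar archive in memory holding one member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        info.mode = 0o644
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Hypothetical member name and contents, chosen only for this demo.
archive = build_tar("sample.dat", b"hello, tar\n")

# The first 512 bytes are the header block, not file data.
header = archive[:512]
name = header[0:100].rstrip(b"\0").decode("ascii")    # NUL-padded file name
mode = header[100:108].rstrip(b"\0").decode("ascii")  # permissions, octal text
size = int(header[124:136].rstrip(b"\0 "), 8)         # member size, octal text
magic = header[257:262]                               # format marker

# The member's bytes only start after the header block.
data = archive[512:512 + size]

print(name, mode, size, magic)
```

Extracting the archive strips these header blocks again, which is why the extracted file contains only the original bytes.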
See also Peter Miller's tardy utility.
Don't confuse `tar` archives with `zip` ones handled by Info-ZIP (so the `zip` and `unzip` commands).
GNU zip (a compressor, the `gzip` program, which can be started by `tar`, notably by your `tar czvf` command) is also free software, and of course you should study its source code if interested.
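The layering of the two tools can be sketched as follows, again in Python rather than with the real `tar` and `gzip` binaries: a `.tar.gz` is a gzip stream whose decompressed content is a plain tar archive, header blocks included. The member name and payload are hypothetical.

```python
import gzip
import io
import tarfile

payload = b"payload bytes\n"  # hypothetical file contents

# Roughly what `tar czf` produces: a tar stream passed through gzip.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="sample.dat")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
tgz = buf.getvalue()

# Undoing only the gzip layer leaves the tar archive, which still
# begins with the header block (file name first).
raw_tar = gzip.decompress(tgz)

# Undoing both layers yields the original bytes, with no header in front.
with tarfile.open(fileobj=io.BytesIO(tgz), mode="r:gz") as tar:
    extracted = tar.extractfile("sample.dat").read()

print(tgz[:2].hex(), raw_tar[:10], extracted)
```

So looking inside the compressed file with a tool that only removes the gzip layer shows the tar metadata, while a full extraction does not.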
Some Unix shells (notably sash or busybox) have a builtin `tar`.
I tried to unzip that file again and found that unzipped file has no such anonymous records.
AFAIK, most Linux filesystems more or less implement the POSIX model, based on the read(2) and write(2) system calls, and they don't know about records. If you need "records", consider using databases (like SQLite or PostgreSQL) or indexed files (like GDBM), both of which are built on top of Linux file systems or block devices.
Read also a good textbook on operating systems.
Notice that "a large regular unix file" is mostly a sequence of bytes. There is no notion of records inside it, except as a convention used by user-space programs through syscalls(2). See also path_resolution(7) and inode(7).
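As an illustration of "records as a convention", the sketch below imposes a made-up fixed-size layout (a 4-byte id plus a 16-byte NUL-padded name) on an ordinary file. The layout, the file name and the helper functions are assumptions for the example; the file system itself only ever sees 60 bytes.

```python
import os
import struct
import tempfile

# Hypothetical record layout: little-endian uint32 id + 16-byte NUL-padded name.
RECORD = struct.Struct("<I16s")


def write_records(path, rows):
    """Append each (id, name) pair to the file as one packed 20-byte record."""
    with open(path, "wb") as f:
        for rec_id, name in rows:
            f.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(path, index):
    """Seek to record number `index` and decode it; the kernel only sees bytes."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        rec_id, raw_name = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw_name.rstrip(b"\0").decode("ascii")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.dat")
    write_records(path, [(1, "alpha"), (2, "beta"), (3, "gamma")])
    file_size = os.path.getsize(path)  # just a byte count to the file system
    second = read_record(path, 1)

print(file_size, second)
```

Any other program reading the same file needs to know this layout out of band, which is the kind of bookkeeping a database or GDBM does for you.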

-
Thank you, Basile, for the above info. I will go through the tar.h file on my system to dig more into this. – Sudoshree May 12 '20 at 06:36
-
Thank you, Basile, for this answer. I got all the details about the tar process from the wiki page that you shared in the above answer. Thanks :) – Sudoshree May 12 '20 at 07:08