I'm trying to understand the gzip specification (http://www.zlib.org/rfc-gzip.html),
especially section 2, Overall conventions:
Bytes stored within a computer do not have a "bit order", since they are always treated as a unit. However, a byte considered as an integer between 0 and 255 does have a most- and least-significant bit, and since we write numbers with the most-significant digit on the left, we also write bytes with the most-significant bit on the left. In the diagrams below, we number the bits of a byte so that bit 0 is the least-significant bit, i.e., the bits are numbered:
+--------+
|76543210|
+--------+
This document does not address the issue of the order in which bits of a byte are transmitted on a bit-sequential medium, since the data format described here is byte- rather than bit-oriented.
Within a computer, a number may occupy multiple bytes. All multi-byte numbers in the format described here are stored with the least-significant byte first (at the lower memory address). For example, the decimal number 520 is stored as:
    0        1
+--------+--------+
|00001000|00000010|
+--------+--------+
 ^        ^
 |        |
 |        + more significant byte = 2 x 256
 + less significant byte = 8
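To check my reading of that paragraph, here is a minimal sketch that decodes the two bytes from the diagram, once by hand and once with the Node.js `Buffer` method `readUInt16LE`. The variable names are mine, not from the RFC.

```javascript
// The two bytes from the RFC diagram: 00001000 (8) first, then 00000010 (2).
const stored = Buffer.from([0b00001000, 0b00000010]);

// Least-significant byte first: low byte + high byte * 256.
const byHand = stored[0] + stored[1] * 256;

// The same decoding using the built-in little-endian reader.
const builtIn = stored.readUInt16LE(0);

console.log(byHand, builtIn); // 520 520
```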
The problem I have is that I'm not sure how to calculate the length of the FEXTRA header:
+---+---+=================================+
| XLEN |...XLEN bytes of "extra field"...| (more-->)
+---+---+=================================+
If I have one subfield containing a string of 1600 bytes (characters), then my complete FEXTRA length should be 1600 (payload) + 2 (SI1 & SI2, the subfield ID), right?
But the length bytes are set to 73 and 3, and I am not sure why.
Can someone clarify how I can calculate the complete FEXTRA length from the two length bytes?
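For reference, section 2.3.1.1 of the RFC lays out each subfield as SI1, SI2, a 2-byte little-endian LEN, and then LEN bytes of data, so a single subfield seems to add 4 bytes of overhead rather than 2. Below is a rough sketch of what I would expect XLEN to be for one 1600-byte subfield. The helper `buildSubfield` and the IDs `0x41`/`0x42` are made up for illustration and are not registered subfield IDs.

```javascript
// Build one extra-field subfield: SI1, SI2, LEN (2 bytes, little-endian), data.
// buildSubfield is a hypothetical helper, not part of any library.
function buildSubfield(si1, si2, data) {
  const head = Buffer.alloc(4);
  head[0] = si1;
  head[1] = si2;
  head.writeUInt16LE(data.length, 2); // LEN counts only the data bytes
  return Buffer.concat([head, data]);
}

const payload = Buffer.alloc(1600, "x"); // 1600 bytes of dummy payload
const extra = buildSubfield(0x41, 0x42, payload);

// XLEN is the size of the whole extra field, stored little-endian in 2 bytes.
const xlen = Buffer.alloc(2);
xlen.writeUInt16LE(extra.length, 0);

console.log(extra.length);     // 1604
console.log(xlen[0], xlen[1]); // 68 6
```

Under that layout the two length bytes for my case would be 68 and 6 (68 + 6 * 256 = 1604), not 73 and 3.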
I'm using Node.js for the operations on the .tgz/.gz file.
Demo code:
const fs = require("fs");
//const bitwise = require("bitwise");
// http://www.zlib.org/rfc-gzip.html
// http://www.onicos.com/staff/iz/formats/gzip.html
// https://de.wikipedia.org/wiki/Gzip
// https://dev.to/somedood/bitmasks-a-very-esoteric-and-impractical-way-of-managing-booleans-1hlf
// https://www.npmjs.com/package/bitwise
// https://stackoverflow.com/questions/1436438/how-do-you-set-clear-and-toggle-a-single-bit-in-javascript
fs.readFile("./test.gz", (err, bytes) => {

    if (err) {
        console.log(err);
        process.exit(100);
    }

    console.log("bytes: %d", bytes.length);

    let header = bytes.slice(0, 10);
    let flags = header[3];
    let eFlags = header[8];
    let OS = header[9];

    // ID1 = 31 (0x1f) and ID2 = 139 (0x8b) identify a gzip file (not necessarily a tar archive)
    console.log("Is gzip file:", header[0] === 31 && header[1] === 139);
    console.log("compress method:", header[2] === 8 ? "deflate" : "other");
    // MTIME is a 4-byte little-endian Unix timestamp
    console.log("M-Date: %d", bytes.readUInt32LE(4));
    console.log("OS", OS);
    console.log("flags", flags);
    console.log();

    // | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
    // +---+---+---+---+---+---+---+---+---+---+
    // |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
    // +---+---+---+---+---+---+---+---+---+---+
    //
    //
    // |10 |11 |
    // +---+---+=================================+
    // | XLEN  |...XLEN bytes of "extra field"...| (more-->)
    // +---+---+=================================+
    // (if FLG.FEXTRA set)
    //
    // bit 0 FTEXT
    // bit 1 FHCRC
    // bit 2 FEXTRA
    // bit 3 FNAME
    // bit 4 FCOMMENT
    // bit 5 reserved
    // bit 6 reserved
    // bit 7 reserved

    // bitwise operation on header flags: shift the wanted bit to position 0, then mask it
    const FLAG_RESERVED_3 = (bytes[3] >> 7) & 1;
    const FLAG_RESERVED_2 = (bytes[3] >> 6) & 1;
    const FLAG_RESERVED_1 = (bytes[3] >> 5) & 1;
    const FLAG_COMMENT = (bytes[3] >> 4) & 1;
    const FLAG_NAME = (bytes[3] >> 3) & 1;
    const FLAG_EXTRA = (bytes[3] >> 2) & 1;
    const FLAG_CRC = (bytes[3] >> 1) & 1;
    const FLAG_TEXT = (bytes[3] >> 0) & 1;

    console.log("FLAG_RESERVED_3", FLAG_RESERVED_3);
    console.log("FLAG_RESERVED_2", FLAG_RESERVED_2);
    console.log("FLAG_RESERVED_1", FLAG_RESERVED_1);
    console.log("FLAG_COMMENT", FLAG_COMMENT);
    console.log("FLAG_NAME", FLAG_NAME);
    console.log("FLAG_EXTRA", FLAG_EXTRA);
    console.log("FLAG_CRC", FLAG_CRC);
    console.log("FLAG_TEXT", FLAG_TEXT);
    console.log();

    if (FLAG_EXTRA) {

        // the two XLEN bytes, printed separately
        let len1 = bytes[10];
        let len2 = bytes[11];

        console.log("Extra header length", len1, len2);

    }

});
EDIT 2: After reading it over and over and over again, I think I got it:
if (FLAG_EXTRA) {
    let len1 = bytes[10];
    let len2 = bytes[11];
    console.log("Extra header length", len1 + (len2 * 256));
}
len1 (byte 0) is a number from 0 to 255; len2 (byte 1) is a multiplier for 256.
len2 * 256 + len1 = FEXTRA header length.
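With the two bytes from my file that gives 73 + 3 * 256 = 841. Here is a small sketch of how I would now read XLEN and then walk the subfields, assuming the subfield layout from section 2.3.1.1 of the RFC (SI1, SI2, 2-byte little-endian LEN, data). `parseExtra` is a name I made up, and `buf` is assumed to hold a gzip member that has FLG.FEXTRA set.

```javascript
// Read XLEN and walk the subfields of a gzip extra field.
// parseExtra is a hypothetical helper; buf must have FLG.FEXTRA set.
function parseExtra(buf) {
  const xlen = buf.readUInt16LE(10); // same as buf[10] + buf[11] * 256
  const end = 12 + xlen;             // XLEN does not count its own two bytes
  const subfields = [];
  let pos = 12;
  while (pos + 4 <= end) {
    const len = buf.readUInt16LE(pos + 2); // per-subfield LEN
    subfields.push({
      id: String.fromCharCode(buf[pos], buf[pos + 1]), // SI1, SI2
      data: buf.subarray(pos + 4, pos + 4 + len),
    });
    pos += 4 + len;
  }
  return { xlen, subfields };
}

// The two length bytes from my file.
console.log(Buffer.from([73, 3]).readUInt16LE(0)); // 841
```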
Can someone correct me if I'm wrong?
Thank you guys!