
I'm trying to understand the gzip specification (http://www.zlib.org/rfc-gzip.html),

especially section 2, Overall conventions:

Bytes stored within a computer do not have a "bit order", since they are always treated as a unit. However, a byte considered as an integer between 0 and 255 does have a most- and least-significant bit, and since we write numbers with the most-significant digit on the left, we also write bytes with the most-significant bit on the left. In the diagrams below, we number the bits of a byte so that bit 0 is the least-significant bit, i.e., the bits are numbered:

+--------+
|76543210|
+--------+

This document does not address the issue of the order in which bits of a byte are transmitted on a bit-sequential medium, since the data format described here is byte- rather than bit-oriented.

Within a computer, a number may occupy multiple bytes. All multi-byte numbers in the format described here are stored with the least-significant byte first (at the lower memory address). For example, the decimal number 520 is stored as:

    0        1
+--------+--------+
|00001000|00000010|
+--------+--------+
 ^        ^
 |        |
 |        + more significant byte = 2 x 256
 + less significant byte = 8
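
To check my reading of this example, here is a minimal sketch, assuming a Node.js Buffer holding the two bytes from the diagram. The variable names are only for illustration.

```javascript
// The two bytes from the RFC diagram, least-significant byte first.
const stored = Buffer.from([0b00001000, 0b00000010]); // 8 and 2

// Manual arithmetic: low byte plus high byte times 256.
const manual = stored[0] + stored[1] * 256;

// Buffer's built-in reader for a 16-bit little-endian unsigned integer.
const builtin = stored.readUInt16LE(0);

console.log(manual, builtin); // 520 520
```

Both ways give 520, which matches the "8 + 2 x 256" in the diagram.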

The problem I have is that I'm not sure how to calculate the length for the FEXTRA header:

+---+---+=================================+
| XLEN  |...XLEN bytes of "extra field"...| (more-->)
+---+---+=================================+

If I have one (sub)field containing a string of 1600 bytes (characters), then my complete FEXTRA length should be 1600 (payload) + 2 (SI1 & SI2, the subfield ID), right?

But the length bytes are set to 73 and 3, and I am not sure why.

Can someone clarify how I can calculate the complete FEXTRA length from the two length bytes?

I'm using Node.js for the operations on the .tgz/.gz file.

Demo code:

const fs = require("fs");
//const bitwise = require("bitwise");

// http://www.zlib.org/rfc-gzip.html
// http://www.onicos.com/staff/iz/formats/gzip.html
// https://de.wikipedia.org/wiki/Gzip

// https://dev.to/somedood/bitmasks-a-very-esoteric-and-impractical-way-of-managing-booleans-1hlf
// https://www.npmjs.com/package/bitwise
// https://stackoverflow.com/questions/1436438/how-do-you-set-clear-and-toggle-a-single-bit-in-javascript


fs.readFile("./test.gz", (err, bytes) => {

    if (err) {
        console.log(err);
        process.exit(100);
    }

    console.log("bytes: %d", bytes.length);

    let header = bytes.slice(0, 10);
    let flags = header[3];
    let eFlags = header[8];
    let OS = header[9];

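    // The magic bytes 31, 139 identify gzip (not tar)
    console.log("Is gzip file:", header[0] === 31 && header[1] === 139);
    console.log("compress method:", header[2] === 8 ? "deflate" : "other");
    // MTIME is a 4-byte little-endian Unix timestamp in seconds (0 = not set)
    console.log("M-Date:", new Date(bytes.readUInt32LE(4) * 1000).toISOString());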
    console.log("OS", OS);
    console.log("flags", flags);
    console.log();

    // | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 
    // +---+---+---+---+---+---+---+---+---+---+
    // |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
    // +---+---+---+---+---+---+---+---+---+---+
    //
    //
    // |10 |11 |
    // +---+---+=================================+
    // | XLEN  |...XLEN bytes of "extra field"...| (more-->)
    // +---+---+=================================+
    // (if FLG.FEXTRA set) 
    //
    // bit 0   FTEXT
    // bit 1   FHCRC
    // bit 2   FEXTRA
    // bit 3   FNAME
    // bit 4   FCOMMENT
    // bit 5   reserved
    // bit 6   reserved
    // bit 7   reserved


    // bitwise operation on header flags
    const FLAG_RESERVED_3 = (bytes[3] >> 7) & 1;
    const FLAG_RESERVED_2 = (bytes[3] >> 6) & 1;
    const FLAG_RESERVED_1 = (bytes[3] >> 5) & 1;
    const FLAG_COMMENT = (bytes[3] >> 4) & 1;
    const FLAG_NAME = (bytes[3] >> 3) & 1;
    const FLAG_EXTRA = (bytes[3] >> 2) & 1;
    const FLAG_CRC = (bytes[3] >> 1) & 1;
    const FLAG_TEXT = (bytes[3] >> 0) & 1;


    console.log("FLAG_RESERVED_3", FLAG_RESERVED_3);
    console.log("FLAG_RESERVED_2", FLAG_RESERVED_2);
    console.log("FLAG_RESERVED_1", FLAG_RESERVED_1);
    console.log("FLAG_COMMENT", FLAG_COMMENT);
    console.log("FLAG_NAME", FLAG_NAME);
    console.log("FLAG_EXTRA", FLAG_EXTRA);
    console.log("FLAG_CRC", FLAG_CRC);
    console.log("FLAG_TEXT", FLAG_TEXT);
    console.log();


    if (FLAG_EXTRA) {

        let len1 = bytes[10];
        let len2 = bytes[11];

        console.log("Extra header length", len1, len2);

    }


});

EDIT 2: After reading it over and over again, I think I got it:

    if (FLAG_EXTRA) {

        let len1 = bytes[10];
        let len2 = bytes[11];

        // XLEN is little-endian: low byte first, then high byte
        console.log("Extra header length", len1 + (len2 * 256));

    }

len1 (byte 0) is a number from 0 to 255; len2 is a multiplier for 256.

len2 * 256 + len1 = FEXTRA header length. For the bytes 73 and 3 that gives 3 * 256 + 73 = 841.
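
To test that reading, here is a minimal sketch that builds a fake header in memory and parses it back. It assumes the subfield layout from section 2.3.1.1 of the RFC as I understand it: SI1, SI2, a 2-byte little-endian LEN, then LEN bytes of data, so each subfield would add 4 header bytes rather than 2. The helper name `parseExtraField` and the subfield ID "AB" are made up for this example.

```javascript
// Hypothetical helper: read XLEN and walk the subfields of a gzip header.
function parseExtraField(bytes) {
    const FEXTRA = 1 << 2; // bit 2 of FLG
    if ((bytes[3] & FEXTRA) === 0) {
        return null; // no extra field present
    }

    // XLEN: little-endian, same as bytes[10] + bytes[11] * 256
    const xlen = bytes.readUInt16LE(10);
    const end = 12 + xlen;
    const subfields = [];

    let pos = 12;
    while (pos + 4 <= end) {
        const id = String.fromCharCode(bytes[pos], bytes[pos + 1]); // SI1, SI2
        const len = bytes.readUInt16LE(pos + 2);                    // subfield LEN
        if (pos + 4 + len > end) break;                             // malformed, stop
        subfields.push({ id, len, data: bytes.subarray(pos + 4, pos + 4 + len) });
        pos += 4 + len;
    }
    return { xlen, subfields };
}

// Fake header: 10 fixed bytes with FLG = 4 (FEXTRA), then one 1600-byte subfield "AB".
const payload = Buffer.alloc(1600, 0x78);
const subHeader = Buffer.from([0x41, 0x42, 0, 0]);
subHeader.writeUInt16LE(payload.length, 2);
const xlenBytes = Buffer.alloc(2);
xlenBytes.writeUInt16LE(subHeader.length + payload.length, 0);
const fixed = Buffer.from([31, 139, 8, 4, 0, 0, 0, 0, 0, 3]);
const demo = Buffer.concat([fixed, xlenBytes, subHeader, payload]);

const parsed = parseExtraField(demo);
console.log(parsed.xlen, demo[10], demo[11]); // 1604 68 6
```

Under that assumption, a single 1600-byte subfield gives XLEN = 1604, stored as the bytes 68 and 6 (6 * 256 + 68 = 1604).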

Can someone correct me if I'm wrong?

Thank you guys!

Marc