I have a library, published via npm as "by-request", which can, among other things, auto-decompress web content. Part of the code to handle this situation looks like this:

        if (!options.dontDecompress || !binary) {
          if (contentEncoding === 'gzip' || (options.autoDecompress && /\b(gzip|gzipped|gunzip)\b/.test(contentType))) {
            source = zlib.createGunzip();
            res.pipe(source);
          }
          // *** start of temporary new code ***
          else if (contentEncoding === 'x-gzip' && options.autoDecompress) {
            source = zlib.createGunzip(); // zlib.createUnzip() doesn't work either
            res.pipe(source);
          }
          // *** end of temporary new code ***
          else if (contentEncoding === 'deflate' || (options.autoDecompress && /\bdeflate\b/.test(contentType))) {
            source = zlib.createInflate();
            res.pipe(source);
          }
          else if (contentEncoding === 'br') {
            source = zlib.createBrotliDecompress();
            res.pipe(source);
          }
          else if (contentEncoding && contentEncoding !== 'identity') {
            reject(UNSUPPORTED_MEDIA_TYPE);
            return;
          }
        }

The code had been working pretty well until I tried to read a file of astronomical info from here: https://cdsarc.cds.unistra.fr/viz-bin/nph-Cat/txt.gz?VII/118/names.dat

I was hitting the reject(UNSUPPORTED_MEDIA_TYPE) error handler because I hadn't specifically handled a Content-Encoding of x-gzip. Simply adding a check for x-gzip, however, didn't fix the problem.

zlib is choking on the data, coming back with this error:

Error: incorrect header check
at Zlib.zlibOnError [as onerror] (node:zlib:190:17)
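
zlib reports "incorrect header check" when the leading bytes of the stream are not the header it expects, so one quick diagnostic is to look at the first bytes of the body. The following is only a minimal sketch, assuming the start of the response body is available as a Buffer. `looksLikeGzip` is a hypothetical helper name, not part of by-request or zlib. A gzip stream starts with the magic bytes 0x1f 0x8b followed by the compression method byte 0x08 (deflate).

```typescript
import * as zlib from 'zlib';

// Hypothetical helper (not part of by-request or zlib): returns true when the
// data begins with the gzip magic bytes 0x1f 0x8b and compression method 8.
export function looksLikeGzip(data: Uint8Array): boolean {
  return data.length >= 3 && data[0] === 0x1f && data[1] === 0x8b && data[2] === 0x08;
}

// Usage: a buffer produced by zlib's own gzip passes, plain text does not.
const sample = zlib.gzipSync(Buffer.from('hello'));

console.log(looksLikeGzip(sample));               // true
console.log(looksLikeGzip(Buffer.from('hello'))); // false
```

If a check like this fails on the first chunk of the response, the data reaching the decompressor is probably not a bare gzip stream, whatever the Content-Encoding header says.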

Is there a different decompression library I need? I've searched around, but haven't found a good solution yet. According to this previous Stack Overflow answer: Difference between "x-gzip" and "gzip" for content-encoding

...gzip and x-gzip should be the same. It's not working out that way. On the other hand, no web browser I've tried has any trouble at all getting and displaying the text from the cdsarc.cds.unistra.fr URL.

kshetline
  • Have you checked if the response is actually compressed? – syockit Jan 06 '22 at 01:29
  • @syockit, yes. If I try to interpret the content as text it just comes out as garbage. – kshetline Jan 06 '22 at 01:46
  • `gzip` and `x-gzip` mean exactly the same thing. (The `x` marks it as experimental, back when it was.) Are you sure your use of zlib is working in the `gzip` case? – Mark Adler Jan 06 '22 at 01:51
  • @Mark Adler, yes, I'm reaching the code I want to run, it just isn't working. Oddly enough a `gzip` shell command can successfully decompress this data, even though zlib's `createGunzip()` fails. – kshetline Jan 06 '22 at 02:26
  • `createGunzip()` is not from zlib. It is a node.js wrapper for zlib. In any case, if gzip works on the data, then the problem is your use of the decompression routines. – Mark Adler Jan 06 '22 at 02:56
  • @Mark Adler, "createGunzip() is not from zlib. It is a node.js wrapper for zlib" seems to be a distinction without a difference, so I'm not sure what you're getting at. At any rate, plenty of other gzipped data has passed through this same code successfully, until I found this one web site that's giving me grief, so how badly could I be using createGunzip()? I see no tweakable options to play with. – kshetline Jan 06 '22 at 03:38

1 Answer

The following solution is working for me: it substitutes a shell gzip decompression for the one provided by zlib's createGunzip(). The only explanation I can think of for why this works is that there is something quirky about the gzipped data stream served by this particular web site, something the shell command tolerates but zlib does not.

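  // One-time check for a usable gzip command. `gzip -L` only prints the license and exits.
  if (!checkedGzipShell) {
    checkedGzipShell = true;
    hasGzipShell = true;

    const gzipProc = spawn('gzip', ['-L']);

    await new Promise<void>(resolve => {
      // An 'error' event here means the command could not be spawned.
      gzipProc.once('error', () => { hasGzipShell = false; resolve(); });
      gzipProc.stdout.once('end', resolve);
    });
  }

          // Later, when choosing a decompressor for the response:
          if (contentEncoding === 'gzip' || contentEncoding === 'x-gzip' ||
              (options.autoDecompress && /\b(gzip|gzipped|gunzip)\b/.test(contentType))) {
            if (hasGzipShell) {
              // `gzip -dc` decompresses stdin and writes the result to stdout.
              const gzipProc = spawn('gzip', ['-dc']);

              source = gzipProc.stdout;
              res.pipe(gzipProc.stdin);
            }
            else {
              // Fall back to zlib when no gzip command is available.
              source = zlib.createGunzip();
              res.pipe(source);
            }
          }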
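
The shell branch above has no error handling, so a failed spawn or a non-zero exit from gzip would go unnoticed. Below is a rough sketch of how that branch might be wrapped, not the code by-request actually uses. `decompressViaShell` is a hypothetical helper name, and the `command` and `args` parameters are assumptions added so the command can be swapped out.

```typescript
import { spawn } from 'child_process';
import { PassThrough, Readable } from 'stream';

// Hypothetical helper: pipes `input` through an external command (by default
// `gzip -dc`) and returns a stream of the command's stdout.
export function decompressViaShell(input: Readable, command = 'gzip',
                                   args: string[] = ['-dc']): Readable {
  const proc = spawn(command, args);
  const output = new PassThrough();

  // Spawn failures (e.g. command not found) surface on the returned stream.
  proc.once('error', err => output.destroy(err));
  // stdin can fail with EPIPE if the child exits before reading all input.
  proc.stdin.once('error', err => output.destroy(err));

  // Keep `output` open until the exit status is known, then end or fail it.
  proc.stdout.pipe(output, { end: false });
  proc.once('close', code => {
    if (output.destroyed)
      return;
    else if (code === 0)
      output.end();
    else
      output.destroy(new Error(`${command} exited with code ${code}`));
  });

  input.pipe(proc.stdin);

  return output;
}
```

With a helper like this, the shell branch could become `source = decompressViaShell(res);`, and any failure would arrive as an 'error' event on `source`.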
kshetline