Using the request module I'm getting the following error on HEAD requests to some shortened, 301-redirecting URLs:

{ [Error: Parse Error] bytesParsed: 123, code: 'HPE_INVALID_CONTENT_LENGTH' }

For example, I get this on http://cnb.cx/1vtyQyv. Very easy to reproduce (node v0.10.29, request v2.36.0):

var request = require('request');
request({ url:'http://cnb.cx/1vtyQyv', method: 'HEAD' }, function(err, res) {
    console.log(err, res);
});

Here is the result of curl HEAD request on this URL:

$ curl -I http://cnb.cx/1vtyQyv
HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Wed, 02 Jul 2014 18:16:05 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Cache-Control: private; max-age=90
Content-Length: 124
Location: http://www.cnbc.com/id/101793181
Mime-Version: 1.0
Set-Cookie: _bit=53b44c65-00194-0369a-281cf10a;domain=.cnb.cx;expires=Mon Dec 29 18:16:05 2014;path=/; HttpOnly

The content length on the body is in fact 124, as can be verified with curl http://cnb.cx/1vtyQyv | wc -c

The error is thrown from within Node.js's core HTTP parser (https://github.com/mattn/http-server/blob/master/http_parser.c). Strangely, however, request is able to follow this 301 redirect and successfully returns the contents of the target page (http://www.cnbc.com/id/101793181) with no error when doing a GET request, which suggests that the parse error is not unavoidable:

var request = require('request');
request({ url:'http://cnb.cx/1vtyQyv', method: 'GET' }, function(err, res) {
    console.log(err, res);
});

This is a problem when using node-unshortener, which makes repeated HEAD requests until it finds the full URL.

tobek

1 Answer

It works for me with plain node v0.10.29:

var http = require('http');

http.request({
  host: 'cnb.cx',
  path: '/1vtyQyv',
  method: 'HEAD'
}, function(res) {
  console.dir(res);
  res.resume();
}).end();

The error is reproduced with request v2.36.0 though. You might want to file an issue about it.

UPDATE: The error can be reproduced with plain node after all. The problem is not the shortened URL but the URL it redirects to:

var http = require('http');

http.request({
  host: 'www.cnbc.com',
  path: '/id/101793181',
  method: 'HEAD'
}, function(res) {
  console.dir(res.statusCode);
  console.dir(res.headers);
}).end();

// results in:
//
// events.js:72
//         throw er; // Unhandled 'error' event
//               ^
// Error: Parse Error
//     at Socket.socketOnData (http.js:1583:20)
//     at TCP.onread (net.js:527:27)

UPDATE #2: It turns out that the redirected URL is returning Content-Length: -1, which is causing the error. curl -I http://www.cnbc.com/id/101793181 shows:

HTTP/1.1 200 OK
Date: Wed, 02 Jul 2014 22:23:49 GMT
Server: Apache
Vary: User-Agent
Via: 1.1 aicache6
Content-Length: -1
X-Aicache-OS: 10.10.1.25:80
Connection: Keep-Alive
Keep-Alive: max=20
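
For context, HTTP defines Content-Length as one or more decimal digits, so -1 is not a legal value and a strict parser is entitled to reject it. Here is a minimal sketch of that rule. The helper name isValidContentLength is made up for illustration, and this is not the actual logic inside node's parser:

```javascript
// Hypothetical helper: returns true only if the header value is a
// non-empty string made up entirely of decimal digits.
function isValidContentLength(value) {
  return typeof value === 'string' && /^[0-9]+$/.test(value);
}

isValidContentLength('124'); // true  (what cnb.cx sends)
isValidContentLength('-1');  // false (what www.cnbc.com sends on HEAD)
```

This only illustrates why the value is considered malformed. It does not change how node itself parses the response.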

mscdex
  • Thanks, never used bare `http.request` before but I think that makes it pretty clear it's a bug with `request` - I'll file an issue. – tobek Jul 02 '14 at 20:16
  • Oh I didn't even notice you're the same person who responded on GitHub. Anyway, yep, you are totally correct, sorry for the false alarm. I've seen a bunch of sites do this now. As a result, for the purposes of unshortening shortened URLs, we must switch to GET: https://github.com/Swizec/node-unshortener/pull/19/files – tobek Jul 03 '14 at 00:12
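
For anyone who wants to keep HEAD as the default, the workaround discussed above could also be sketched as a fallback: try HEAD first and retry with GET only when the parser rejects the Content-Length. This is a rough sketch, not what node-unshortener actually does. headWithGetFallback is a made-up name, and requestFn is assumed to be any function with the same (options, callback) shape as request:

```javascript
// Hypothetical helper: try HEAD, fall back to GET on a bad Content-Length.
// requestFn is assumed to behave like the `request` module:
//   requestFn({ url: ..., method: ... }, function (err, res) { ... })
function headWithGetFallback(requestFn, url, callback) {
  requestFn({ url: url, method: 'HEAD' }, function (err, res) {
    if (!err) {
      return callback(null, res);
    }
    // Only retry for the specific parse error shown in the question;
    // any other failure is passed straight through.
    if (err.code !== 'HPE_INVALID_CONTENT_LENGTH') {
      return callback(err);
    }
    requestFn({ url: url, method: 'GET' }, callback);
  });
}
```

It would be called as headWithGetFallback(require('request'), 'http://cnb.cx/1vtyQyv', function (err, res) { console.log(err, res && res.statusCode); }). The trade-off is that the GET retry downloads the full body, so it costs more than a HEAD that succeeds.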