Web URL in question: https://www.theroyalamerican.com/schedule

I am building a Node.js script to scrape the web page above using the request-promise package, which fetches the page's HTML for me. Unfortunately, when I run my code, I get a 400 status code from Squarespace (apparently the host of this site).

Strangely, when I open the same URL in my web browser, it loads with no problem at all and returns a 200 status code.

I do not see this mismatch between my Node script and my web browser with any other web page. Curious what's going on here...

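const rp = require('request-promise');
const $ = require('cheerio'); // not used yet; intended for parsing the HTML later
const url = 'https://www.theroyalamerican.com/schedule';

rp(url)
  .then(function (html) {
    console.log(html);
  })
  .catch(function (err) {
    // A non-2xx response (such as the 400 here) rejects the promise and lands here
    console.log(err);
  });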
Ryan Miller
  • Why do you expect encrypted output? And how do you expect to decrypt it (assuming you receive encrypted data)? – axiac Jul 23 '19 at 21:00
  • I don't expect encrypted output. I expect a simple html page. But after running my script, the response output to my node console is a ton of encrypted-looking text right after the "400 Bad Request" piece. – Ryan Miller Jul 23 '19 at 21:03
  • Edit: I am editing out the part about encrypted text because I now realize it is an artifact of the default 400 page from Squarespace (this website's host), which I don't think has anything to do with the core problem of why I'm seeing a 400 status code in the first place. – Ryan Miller Jul 23 '19 at 21:09

1 Answer

Compare all the headers that are sent when you request this page in the browser with those sent from Node.js. Most likely the server's response depends on one of them (perhaps Content-Type, or try passing an Origin header).
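
As a rough sketch of that comparison, the request can be sent with browser-like headers and the status code inspected instead of thrown. The helper names buildOptions and fetchPage are hypothetical, and the header values are assumptions standing in for whatever your browser's network tab actually shows; nothing here confirms which header this site cares about.

```javascript
// Hypothetical helper: builds request-promise options that mimic a browser request.
// The header values are assumed placeholders; copy the real ones from your browser.
function buildOptions(url) {
  return {
    uri: url,
    headers: {
      'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0 Safari/537.36',
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
      'Accept-Language': 'en-US,en;q=0.9'
    },
    gzip: true,                    // request compressed responses and decode them
    resolveWithFullResponse: true, // resolve with the full response, not only the body
    simple: false                  // do not reject on non-2xx, so the status can be read
  };
}

// Hypothetical helper: performs the request and returns the status code and body.
function fetchPage(url) {
  const rp = require('request-promise'); // loaded lazily, only when a request is made
  return rp(buildOptions(url)).then(function (response) {
    return { status: response.statusCode, body: response.body };
  });
}
```

Calling fetchPage('https://www.theroyalamerican.com/schedule') and logging the returned status, then removing headers one at a time, is one way to narrow down which header (if any) changes the response.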

GProst