Currently, the default behaviour of Puppeteer seems to be to follow redirects and return the DOM at the end of the chain.

How can I make the .goto() method stop after the first redirect and simply return the HTML of that first 3xx page when I call page.content()?

makeitmorehuman

3 Answers

You can enable request interception and abort any navigation request that is part of a redirect chain:

await page.setRequestInterception(true);

page.on('request', request => {
  // A non-empty redirect chain means this request was created by a redirect.
  if (request.isNavigationRequest() && request.redirectChain().length !== 0) {
    request.abort();
  } else {
    request.continue();
  }
});

await page.goto('https://www.example.com/');
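
One caveat with the snippet above: in my experience, aborting the navigation request makes page.goto() reject (typically with net::ERR_FAILED), and as far as I know Puppeteer does not expose the body of a 3xx response. The following is only a sketch under those assumptions. gotoFirstRedirect is a hypothetical helper name, not part of Puppeteer; it swallows the expected rejection and returns the status code and Location header of the first redirect instead of its HTML.

```javascript
// Hypothetical helper (not part of Puppeteer): navigate, stop at the first
// redirect, and report that redirect's status code and Location header.
async function gotoFirstRedirect(page, url) {
  let firstRedirect = null;

  await page.setRequestInterception(true);
  page.on('request', request => {
    const chain = request.redirectChain();
    if (request.isNavigationRequest() && chain.length !== 0) {
      // chain[0] is the original request; its response is the first 3xx.
      const response = chain[0].response();
      if (response && firstRedirect === null) {
        firstRedirect = {
          status: response.status(),
          location: response.headers()['location'],
        };
      }
      request.abort();
    } else {
      request.continue();
    }
  });

  try {
    await page.goto(url);
  } catch (err) {
    // Assumption: the abort above makes goto() reject. Rethrow only when
    // no redirect was seen, i.e. the failure was not caused by us.
    if (firstRedirect === null) throw err;
  }
  return firstRedirect; // null when the URL did not redirect
}
```

Called as `const info = await gotoFirstRedirect(page, 'https://www.example.com/');`, it gives you `info.status` and `info.location`, or null if the page loaded without redirecting.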
Grant Miller

It seems that at the moment of writing, this is not possible (at least not in the high-level API that Puppeteer provides). Check out the docs for goto here.

tomahaug
  • Yes, it's not possible with the high-level API. Logged an issue here: https://github.com/GoogleChrome/puppeteer/issues/1132 – makeitmorehuman Oct 23 '17 at 15:54
  • Awesome. Great workaround! I thought about using the event handlers, but I was not certain that it could be done, so I didn't want to potentially lead you down a blind path. – tomahaug Oct 23 '17 at 18:19

I made a few modifications to the top answer: instead of aborting the redirected request, it responds with the status code of the redirect, so you can see which specific 3xx code was returned.

await page.setRequestInterception(true);

page.on('request', request => {
  if (request.isNavigationRequest() && request.redirectChain().length >= 1) {
    // The last entry in the chain is the request that was just redirected;
    // its response is the 3xx that led to the current request.
    const chain = request.redirectChain();
    const redirectResponse = chain[chain.length - 1].response();
    // Answer with the redirect's status code instead of following it.
    request.respond({
      status: redirectResponse.status(),
      contentType: 'text/plain',
      body: 'Redirects!',
    });
  } else {
    request.continue();
  }
});
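
If you also want page.content() to show where the redirect pointed, the placeholder body can carry the Location header. This is only a sketch: buildRedirectStub is a hypothetical helper name, and I am assuming the redirect response exposes a lower-cased location key through headers(). I have not checked how Chrome renders a fulfilled 3xx response in every version.

```javascript
// Hypothetical helper: build the payload for request.respond() from the
// most recent redirect in the chain, including its target in the body.
function buildRedirectStub(request) {
  const chain = request.redirectChain();
  const redirectResponse = chain[chain.length - 1].response();
  // Fall back to a marker string when the redirect carried no Location header.
  const location = redirectResponse.headers()['location'] || '(no Location header)';
  return {
    status: redirectResponse.status(),
    contentType: 'text/plain',
    body: `Redirect ${redirectResponse.status()} -> ${location}`,
  };
}
```

Inside the handler, `request.respond(buildRedirectStub(request));` would take the place of the inline object literal.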