
I'm trying to log in to a website with my credentials and download a PDF using Puppeteer. I got the PDF URL with Puppeteer, and now I want to use node-fetch to request that URL. To fetch the PDF I need to include the session data in the fetch options, but I don't know if I am doing it the right way.

I tried using credentials: 'include', getting the cookies with page.cookies, and other small modifications to the options sent with the fetch.

    var response = await page.goto(urlPdf);
    var headersPup = response.request().headers();

    const { cookies } = await page._client.send("Network.getAllCookies", {});

    const sessionFreeCookies = cookies.map((cookie) => {
      return {
        ...cookie,
        expires: Date.now() / 1000 + 10 * 60,
        session: false
      };
    });

    headersPup['Cookie'] = sessionFreeCookies; // adding the cookies to the header
    headersPup['Content-Type'] = 'application/pdf'; // adding content-type

    var opts = {
      method: "GET",
      headers: headersPup,
      credentials: "include",
    };

    await fetch(urlPdf, opts).then(response => response
      .body.pipe(fs.createWriteStream('test4.pdf'))
      .on('close', () => console.log('pdf downloaded')));

When I open test4.pdf as text, I see the HTML of the login page, which means the session was lost. How can I keep the session so that I can download my PDF?

Eduardo Conte

1 Answer


The fetch call can't keep the session because it does not run inside your headless browser: it is a separate HTTP client and does not share the browser's cookies.

Unfortunately, it seems that PDF downloading is not supported in Puppeteer: https://github.com/GoogleChrome/puppeteer/issues/1248

In general, to be logged in you need to goto(loginPage) and then goto the page that you need; cookies are managed within the page object.
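
Since the cookies live in the page object, one possible way to reuse them outside the browser is to turn them into a single Cookie header string. The sketch below is only an illustration under assumptions: toCookieHeader is a hypothetical helper, and the sample cookies are made up to resemble the objects returned by page.cookies(). A Cookie request header carries only name=value pairs separated by "; ", so attributes such as expires or session are not sent.

```javascript
// Hypothetical helper: build the value of a Cookie request header
// from Puppeteer-style cookie objects. Only name and value are used;
// attributes like expires or session are not part of the request header.
function toCookieHeader(cookies) {
  return cookies.map((c) => `${c.name}=${c.value}`).join('; ');
}

// Made-up cookies, shaped like the objects returned by page.cookies()
const sample = [
  { name: 'sessionid', value: 'abc123', domain: 'example.com', session: true },
  { name: 'csrftoken', value: 'xyz789', domain: 'example.com', session: true },
];

console.log(toCookieHeader(sample)); // prints "sessionid=abc123; csrftoken=xyz789"
```

With node-fetch, the resulting string could then be passed as the cookie entry of the headers option, for example headers: { cookie: toCookieHeader(await page.cookies(urlPdf)) }. Whether this is enough depends on the site, since some servers also check other headers or tie the session to the browser.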

Pjotr Raskolnikov
  • It seems I'm misunderstanding how to use fetch. In fact, I used fetch because I knew that PDF downloading is not supported in Puppeteer, so I wanted to get that far with Puppeteer (because it's easier) and then finish the job using fetch and the session data. I really don't understand what you mean by: "Of course fetch method can't keep the session, it is not opened in your headless browser." Is there a way to change my code so I can fetch without using Puppeteer and Chromium? By the way, thanks for answering! – Eduardo Conte Jan 23 '19 at 14:35
  • Are you sure you are logged in in Puppeteer when you do const { cookies } = await page._client.send("Network.getAllCookies", {}); ? – Pjotr Raskolnikov Jan 23 '19 at 14:42
  • Yes, because I am debugging with "headless = false" and it gets to the PDF with no problem (or to the page before it, if I stop there and get the cookies at that point). And the cookies seem to be right. – Eduardo Conte Jan 23 '19 at 15:15