Everyone on the net wants to stop Puppeteer from enforcing the same-origin policy. I actually want to test that same-origin policy enforcement works! But it doesn't, and I don't know why. This is what I do:

import puppeteer from 'puppeteer';

(async () => {

    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();
    await page.goto('https://example.com'); // <= THE ORIGIN

    // Some error listeners
    page.on("pageerror", function (err) {
        // Uncaught exception thrown inside the page
        console.log("Page error: " + err.toString());
    });
    page.on("error", function (err) {
        // The page itself crashed
        console.log("Error: " + err.toString());
    });

    // THE EVALUATE
    await page.evaluate(async (otherSiteURL, myOrigin) => {

        console.log('the origin: ' + myOrigin); // prints https://example.com
        console.log('otherSiteURL: ' + otherSiteURL); // prints https://some-other-site.com
        // Origin is a forbidden header name: the browser ignores this value and sets its own
        let r = await fetch(otherSiteURL, {mode: 'cors', method: "GET", headers: {Origin: myOrigin}});
        r = await r.json();

        // *** I have access to the response and no error has been triggered ***
        console.log("Response is visible, while it shouldn't be due to CORS: " + JSON.stringify(r));

    }, 'https://some-other-site.com', 'https://example.com');

    await browser.close();
})();

As you can see, I am in the context of the page 'https://example.com', and from the evaluate function I execute a fetch to the some-other-site URL, which I own. I've checked that the Access-Control-Allow-Origin header returned in the response is neither * nor https://example.com. Hence this is 100% a same-origin policy violation and should result in the browser blocking access to the response and reporting a CORS error.

So why on earth does Puppeteer let me see the response instead of throwing a CORS error? What am I doing wrong? I WANT to see a CORS error!

Mercury
  • Have you tried the exact same fetch call to your site from a normal browser console by hand on https://www.example.com? It shouldn't be any different when Puppeteer runs that code on your behalf, so I suspect a misconfiguration even though you're 100% sure there isn't. Can you provide a minimal reproducible example? Thanks. – ggorlen Dec 01 '22 at 01:46
  • example.com is not mine. I thought I could run any JavaScript code in the context of the current page (example.com) and hence get a CORS error. Only the `some-other-site` URL is owned by me; example.com is the site that the Puppeteer getting-started code and examples use. – Mercury Dec 01 '22 at 15:54
  • If the real web site I am on (example.com in my case) actually needs to contain the JavaScript code that invokes `some-other-site`, that could explain why I am not seeing a CORS error. If that's the case, I don't understand the evaluate function. The `evaluate` docs state: "Evaluates a function in the page's context and returns the result." But if CORS is not triggered, then the JavaScript code does not really run in the context of the page I am on. – Mercury Dec 01 '22 at 15:58
  • I think I understand that. I'm asking if you run the `fetch` _by hand_ on the other domain, do you see a CORS error or not? The purpose of this is to determine if it's actually Puppeteer that's causing the error or not. It'd be surprising if running it by hand causes CORS but running the same code with Puppeteer doesn't, so I suspect it'll pass CORS in both cases (by hand and in Puppeteer), indicating that it's not a Puppeteer issue at all but a misunderstanding of your CORS configuration. – ggorlen Dec 01 '22 at 15:58
  • How can I run JavaScript code by hand on a domain that is not mine? I did check that when I bring up a local server on localhost:8000 and execute the fetch invoking `some-other-site`, I do get a CORS error! – Mercury Dec 01 '22 at 16:00
  • To run something by hand, you open the browser, navigate to example.com, open the dev tools and type in the `fetch("your cross origin domain")` call. You're confirming that you _do_ see a CORS error when you run the exact same code as Puppeteer by hand? That's very strange/unexpected and will require a minimal reproducible example to help with (show the actual sites, and if possible, the server CORS config, or even a minimal, complete server that I can run to see the problem). – ggorlen Dec 01 '22 at 16:01
  • Basically, as I mentioned, `evaluate` does nothing special, unless I'm missing something. It literally runs the same code you're running by hand in a browser in more or less the exact same way, only programmatically, so it's very odd that you'd see different CORS results between the two. – ggorlen Dec 01 '22 at 16:04
  • Tested what you said: I ran the same fetch code from the dev tools console (on example.com) and got a CORS error, but when running it from evaluate I don't get a CORS error. Anyone can reproduce this: go to web site A and manually fetch a resource from web site B (which enforces CORS) and you will get a CORS error, but do the same thing using evaluate and you will see no CORS errors. This is ODD. – Mercury Dec 01 '22 at 16:09
  • That makes no sense to me unless your CORS policy has explicitly allowed certain domains. See my answer where I show that it's the same thing either way. Please share a minimal reproducible example. – ggorlen Dec 01 '22 at 16:28

1 Answer

I believe your server CORS policy is set up incorrectly relative to your expectations in some way or another, or you're making some sort of fundamental mistake.

Puppeteer's evaluate literally runs the same code you can run in the console. From the perspective of the server, both should generate the exact same incoming HTTP request. As far as I know, page.evaluate() does nothing special that'd change the CORS characteristics. It should be easy to corroborate this in the network tab.
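
Besides the network tab, the same information can be collected programmatically. The following is only a sketch: `attachDiagnostics` is a hypothetical helper name, not part of Puppeteer, and it assumes the `page` object exposes the usual `console`, `pageerror`, `requestfailed` and `response` events. Whether a blocked cross-origin request also produces a `response` event may vary, so treat the output as a hint rather than proof.

```javascript
// Sketch: collect browser-side diagnostics from a Puppeteer page.
// `attachDiagnostics` is a hypothetical helper, not a Puppeteer API.
function attachDiagnostics(page, log = []) {
  // Browser console output; Chrome's "blocked by CORS policy" message
  // is typically reported here.
  page.on("console", (msg) => log.push(`console.${msg.type()}: ${msg.text()}`));

  // Uncaught exceptions thrown inside the page.
  page.on("pageerror", (err) => log.push(`pageerror: ${err.message}`));

  // Requests that failed, with the browser's error text when available.
  page.on("requestfailed", (req) => {
    const failure = req.failure();
    log.push(`requestfailed: ${req.url()} ${failure ? failure.errorText : ""}`.trim());
  });

  // Status and Access-Control-Allow-Origin header of each response.
  page.on("response", (res) => {
    const acao = res.headers()["access-control-allow-origin"];
    log.push(`response: ${res.status()} ${res.url()} ACAO=${acao ?? "(none)"}`);
  });

  return log;
}
```

Calling `attachDiagnostics(page)` before `page.goto()` and `page.evaluate()` ensures no event is missed; printing the returned array after the fetch shows what the browser actually saw.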

For example, running a fetch cross-origin from https://www.example.com to Wikipedia gives a CORS error when executed by hand (by typing the contents of evaluate in the browser console dev tools), as well as programmatically with Puppeteer:

const puppeteer = require("puppeteer"); // ^19.1.0
const {setTimeout} = require("timers/promises");

let browser;
(async () => {
  browser = await puppeteer.launch({
    devtools: true,
    headless: false,
  });
  const [page] = await browser.pages();
  const url = "https://www.example.com";
  await page.goto(url, {waitUntil: "domcontentloaded"});
  const msg = await page.evaluate(`
    fetch("https://en.wikipedia.org/wiki/Things_Fall_Apart")
      .catch(err => err.message)
  `);
  console.log(msg); // => Failed to fetch
  await setTimeout(10 ** 6); // keep open to view console
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

In both cases, the console shows the expected error:

Access to fetch at 'https://en.wikipedia.org/wiki/Things_Fall_Apart' from origin 'https://www.example.com' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

If this code allowed CORS to be bypassed, then either a fundamental security mechanism of the internet would be broken, clearly not something Puppeteer can do; or Puppeteer somehow ran the evaluate from the Wikipedia origin, plainly not the case.
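
As a rough illustration of the comparison behind the error message quoted above, here is a simplified sketch. `corsAllows` is a hypothetical helper written for this explanation; a real browser does considerably more (preflight requests, the Access-Control-Allow-Credentials check, and so on), so this only models the Access-Control-Allow-Origin match.

```javascript
// Simplified sketch of the Access-Control-Allow-Origin comparison.
// `corsAllows` is a hypothetical helper; it ignores preflights and
// the Access-Control-Allow-Credentials header.
function corsAllows(pageOrigin, acaoHeader, { credentials = false } = {}) {
  // No header at all: the response is not shared with the page.
  if (acaoHeader == null) return false;
  // The wildcard only applies to requests made without credentials.
  if (acaoHeader === "*") return !credentials;
  // Otherwise the header must match the page's origin exactly.
  return acaoHeader === pageOrigin;
}

console.log(corsAllows("https://www.example.com", undefined)); // false
console.log(corsAllows("https://www.example.com", "*")); // true
console.log(corsAllows("https://www.example.com", "https://example.com")); // false
```

The last call shows that the match is on the full origin: https://example.com and https://www.example.com are different origins, so it matters which one the page was actually loaded from.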

ggorlen
  • There were no problems with the other origin I am managing. It was most likely something wrong with the code I had. Using your code triggers the CORS error every time like a charm (even when referring to my some-other-domain). It's frustrating that I don't know what the issue was, but the good part is that it is working now: the CORS error/enforcement happens! Thanks a lot. – Mercury Dec 02 '22 at 08:21
  • Ah and the evaluate works (CORS is enforced) even when passing a function to it `async () => { await fetch(...); }` instead of string like in your example. – Mercury Dec 02 '22 at 08:23
  • Yeah, it doesn't matter whether you pass a string or a function. Puppeteer serializes it either way, transfers it to the browser, deserializes it and executes the code. – ggorlen Dec 02 '22 at 08:30
  • I think I see why CORS didn't work: when I run the request using Puppeteer inside an AWS Synthetics canary test (Node.js), CORS enforcement is NOT working! It must be something in the AWS Synthetics wrapper. Too bad. – Mercury Dec 07 '22 at 09:33
  • Nice find. Feel free to post a [self answer](https://stackoverflow.com/help/self-answer) since that can help future visitors. – ggorlen Dec 07 '22 at 15:20
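
The thread does not establish why enforcement was missing inside the canary. One possibility, which is purely an assumption and not confirmed anywhere above, is that a wrapper launches Chromium with a switch such as `--disable-web-security`, which turns off same-origin enforcement. A minimal sketch for checking launch arguments follows; `findRelaxingSwitches` and `RELAXING_SWITCHES` are hypothetical names introduced only for illustration.

```javascript
// Hypothetical helper: report Chromium switches that relax
// same-origin enforcement. The list is an assumption and is
// deliberately short; extend it as needed.
const RELAXING_SWITCHES = ["--disable-web-security"];

function findRelaxingSwitches(args) {
  return args.filter((arg) =>
    RELAXING_SWITCHES.some((flag) => arg === flag || arg.startsWith(flag + "="))
  );
}

// Usage with Puppeteer (assumes the browser was launched locally, so
// browser.process() returns a child process rather than null):
//   const args = browser.process()?.spawnargs ?? [];
//   console.log(findRelaxingSwitches(args));
```

An empty result does not prove enforcement is on, since a wrapper could alter behavior in other ways; it only rules out this one explanation.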