
A referrer policy restricts when a Referer header is set on requests and, if the header is allowed, what parts of the referring URL are available. This is a privacy consideration for users. If I'm on one web site, and I follow a link out to another web site, I might not want the other web site to know what particular page referred me (or even what site referred me).
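To make the trimming concrete, here is a minimal sketch of how a few policy values shape the Referer header. It is a simplification rather than the full specification algorithm: `refererFor` is a hypothetical helper, only some policy values are covered, and a "downgrade" is approximated as an https page requesting a non-https URL.

```javascript
// Simplified sketch of Referrer-Policy trimming; not the full spec algorithm.
// refererFor is a hypothetical helper name used only for illustration.
function refererFor(policy, fromUrl, toUrl) {
  const from = new URL(fromUrl);
  const to = new URL(toUrl);

  // "Full" referrer: the page URL without credentials or fragment.
  const full = new URL(from.href);
  full.username = "";
  full.password = "";
  full.hash = "";

  const originOnly = from.origin + "/";
  const sameOrigin = from.origin === to.origin;
  // Assumption: a downgrade is approximated as https -> non-https.
  const downgrade = from.protocol === "https:" && to.protocol !== "https:";

  switch (policy) {
    case "no-referrer":
      return null; // header omitted entirely
    case "origin":
      return originOnly;
    case "strict-origin":
      return downgrade ? null : originOnly;
    case "same-origin":
      return sameOrigin ? full.href : null;
    case "strict-origin-when-cross-origin":
      if (sameOrigin) return full.href;
      return downgrade ? null : originOnly;
    case "unsafe-url":
      return full.href;
    default:
      throw new Error(`policy not covered by this sketch: ${policy}`);
  }
}

console.log(
  refererFor(
    "strict-origin",
    "https://www.penguinrandomhouse.com/books/reference",
    "https://analytics.tiktok.com/api/v2/pixel"
  )
); // https://www.penguinrandomhouse.com/
```

With `strict-origin`, the path is dropped and only the origin survives, which matches the header observed in the example further down.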

Pages often have third-party JavaScript on them, and that code itself may make requests. For example, there are quite a few tracking pixels for advertising networks. Publishers place these bits of code on virtually all the pages of their sites. The tracking pixels frequently are GET requests for 1x1 GIFs; sometimes they are POSTs with extensive bodies.

Other policies, such as a Content Security Policy (CSP), can control what JavaScript is allowed to execute. I would have thought that the Referrer-Policy would also restrict the data that JavaScript could access. In particular, I would've thought that a page with a restrictive Referrer-Policy would prevent non-origin JavaScript from accessing (or accessing in detail) data from document.location.

But it doesn't.

As an example, visit https://www.penguinrandomhouse.com/books/reference (archive link). It loads a tracking pixel from TikTok. (I'm not trying to pick on TikTok here; there are quite a few others on that page.) You'll see the pixel JavaScript make a POST to https://analytics.tiktok.com/api/v2/pixel. As expected, the request carries the header referer: https://www.penguinrandomhouse.com/ because of the Referrer-Policy: strict-origin on the parent page; the /books/reference path is missing. However, the body of the POST is:

{
  "event":"Pageview",
  "message_id":"messageId-1679954283495-7969080450692-C4SGAO96H18A0MH1EN5G",
  "event_id":"",
  "is_onsite":false,
  "timestamp":"2023-03-27T21:58:03.495Z",
  "context":{
    "ad":{"sdk_env":"external","jsb_status":2},
    "user":{"anonymous_id":"CXAXbvf3x_YpEzw4PhoPlJ6MGBI"},
    "pixel":{"code":"C4SGAO96H18A0MH1EN5G"},
    "page":{
      "url":"https://www.penguinrandomhouse.com/books/reference",
      "referrer":"https://www.penguinrandomhouse.com/"
    },
    "library":{"name":"pixel.js","version":"2.1.33"},
    "device":{"platform":"pc"},
    "session_id":"45a166a9-ccea-11ed-9ebf-08c0eb4a4ce6::NaPDjdJMkNgtY5tnReSS-C4SGAO96H18A0MH1EN5G",
    "pageview_id":"pageId-1679954283473-2599747993441-C4SGAO96H18A0MH1EN5G",
    "variation_id":"test_3",
    "userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36"
  },
  "_inspection":{},
  "properties":{}
}

That clearly has the entire URL of the referring page. My question is: why don't browsers block this? This seems like a clear loophole in the Referrer-Policy.
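The mechanism is easy to sketch. The helper below is hypothetical and is not TikTok's actual code; it only assumes that a script running in the page reads `location.href` and `document.referrer` and places them in the request body, which the Referrer-Policy never inspects. It takes a window-like object so it can run outside a browser.

```javascript
// Hypothetical helper, not TikTok's code: builds a page-view body from a
// window-like object. In a browser you would pass the real `window`.
function buildPageviewBody(win) {
  return JSON.stringify({
    event: "Pageview",
    context: {
      page: {
        // Full URL, path included; Referrer-Policy does not touch this.
        url: win.location.href,
        // Whatever the browser exposes to scripts as document.referrer.
        referrer: win.document.referrer,
      },
    },
  });
}

// Stand-in for `window` so the sketch runs outside a browser.
const fakeWindow = {
  location: { href: "https://www.penguinrandomhouse.com/books/reference" },
  document: { referrer: "https://www.penguinrandomhouse.com/" },
};

const body = buildPageviewBody(fakeWindow);
console.log(body);
```

The Referer header on such a POST would still be trimmed by the policy, but the body is ordinary application data that the script is free to fill in.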

Would this be a better question for https://security.stackexchange.com/?

Armadillo Jim
  • This is indeed a legitimate concern. Google Analytics configuration options let you control and restrict what the script collects, but for 99% of marketing tags that's not possible. One option is to migrate to server-side tagging, where the hit payload is rewritten and can be cleaned before it is forwarded. – Open SEO Aug 26 '23 at 20:38
  • When using a Tag Template in client-side GTM, it is also possible to restrict the data accessible to each tag, for example allowing the TikTok template to access only parameters specific to TikTok and nothing else: https://developers.google.com/tag-platform/tag-manager/templates/permissions#get_url – Open SEO Aug 26 '23 at 20:47

1 Answer


Pages often have third-party JavaScript on them

I would've thought that a page with a restrictive Referrer-Policy would prevent non-origin JavaScript from accessing (or accessing in detail) data from document.location.

You might be mixing this up with the cases covered by the same-origin policy, which prevents separate origins from accessing each other. JavaScript loaded with <script src="…"> runs in the context of the origin of the document containing the <script> element, not one that corresponds to the URL it was loaded from. (The reverse would be a much bigger security problem, actually.) There’s no point in preventing this kind of JavaScript from accessing specific properties like document.location given that it has full permissions to make same-origin requests and access the content of the page, and there’s no real way of separating JavaScript based on where it was loaded from in the first place (if you load jQuery from cdnjs.com, are you allowed to do $('<a href>').prop('href') or not?).

In short: there’s no “third-party JavaScript” distinction once it loads, and blocking all third-party JavaScript breaks the web.
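As a rough illustration of that point, the sketch below models which requests count as same-origin for a script loaded via <script src>. It is a simplified model with hypothetical names and URLs, not a browser API: the only input that matters is the page's URL, and the script's own source URL is ignored.

```javascript
// Simplified model, hypothetical names and URLs: a request made by a script
// is same-origin if its target matches the origin of the embedding page.
function requestIsSameOrigin(pageUrl, scriptSrc, requestUrl) {
  // scriptSrc is intentionally ignored: where the script file was fetched
  // from does not change the origin it runs in.
  void scriptSrc;
  const pageOrigin = new URL(pageUrl).origin;
  // Relative request URLs resolve against the page, as they would in a browser.
  const targetOrigin = new URL(requestUrl, pageUrl).origin;
  return targetOrigin === pageOrigin;
}

const page = "https://shop.example/cart";
const script = "https://cdn.example/lib.js";

console.log(requestIsSameOrigin(page, script, "/api/orders")); // true
console.log(requestIsSameOrigin(page, script, "https://cdn.example/data.json")); // false
```

The second call is the counterintuitive one: a request back to the host that served the script is cross-origin, because the script runs as part of the page.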

Ry-
  • Thanks for the answer! Yep, I know about same-origin policies and CORS. I s'pose what weirds me out is that a site is saying with its Referrer-Policy, "hey, we don't want you to have the (full) referrer". But the site is also saying with its CSP, "it's OK if you run your JavaScript here". I see some tension between those two assertions. – Armadillo Jim Mar 28 '23 at 23:51
  • I'm puzzled that browsers' resolution is to allow the access. I woulda thunk this violates the principle of least surprise, that sites are unwittingly allowing access that they didn't realize they were allowing. How do browser makers square that circle? Maybe this isn't a good SO question since it's more philosophical in a way, but it does have some basis in what security model a browser chooses. – Armadillo Jim Mar 28 '23 at 23:51
  • @ArmadilloJim: Sites are *explicitly* opting in to letting third parties do whatever they want by embedding their scripts. It might be unwitting if the site creator is just copying and pasting unthinkingly, but the browser behaviour makes perfect sense here. Referrer-Policy applies to all outgoing links, so it’s not as if it has no effect in the presence of third-party scripts. – Ry- Mar 29 '23 at 02:28