I have collected all the requests made by websites with the aim to identify the third-parties through the requests which are made by a website. I used selenium and WebDriver to do that.
These requests can be made by the JavaScript present in the source code of the website or can be dynamically called by the web-page from the advertisements or can be initiated by Google or DoubleClick or Facebook. These requests help to track the data that is being shared by these websites with or without the user consent.
You can see an example of the requests when the browser wants to load this website: www.focuscamera.com/ in this excel file:
https://drive.google.com/file/d/16wNA0dFUehrjPww31TAIj8GZUZ05LsIU/view?usp=sharing
My questions are:
1- which kind of HTTP header field can be used for my analysis if I tend to gather some info about third parties? my goal is to distinguish and differentiate the third party behavior!
For example, the field content-length in the requests indicates the size of the entity-body. So a request with higher content-length means that the third party received and collect more data/information?
2- What does exactly content-length indicates? what does exactly "HTTP request body data" contain?
3- Are there any other HTTP header fields that I can use if I aim to distinguish and differentiate the third party behavior? ( a list of field I collect can be found in sheet1 of the excel file I shared before)
4- Are there any other information on the internet that I can use if I aim to distinguish and differentiate the third party behavior? For example, I use cookiepedia.co.uk
in order to know what kind of services third parties provide? is it functionality, performance, or Targeting/advertising?