18

I am building my own email tracking system for email marketing. I have been able to determine which email client each person is using from the HTTP referrer, but for some reason Gmail does not send an HTTP_REFERER at all!

So I am trying to find another way of identifying when Gmail requests a transparent image from my server. This is what `print_r($_SERVER);` gives me for such a request:

DOCUMENT_ROOT  =  /usr/local/apache/htdocs

GATEWAY_INTERFACE  =  CGI/1.1

HTTP_ACCEPT  =  */*

HTTP_ACCEPT_CHARSET  =  ISO-8859-1,utf-8;q=0.7,*;q=0.3

HTTP_ACCEPT_ENCODING  =  gzip,deflate,sdch

HTTP_ACCEPT_LANGUAGE  =  en-GB,en-US;q=0.8,en;q=0.6

HTTP_CONNECTION  =  keep-alive

HTTP_COOKIE  =  __utmz=156230011.1290976484.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=156230011.422791272.1290976484.1293034866.1293050468.7

HTTP_HOST  =  xx.xxx.xx.xxx

HTTP_USER_AGENT  =  Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10

PATH  =  /bin:/usr/bin

QUERY_STRING  =  i=MTA=

REDIRECT_STATUS  =  200

REMOTE_ADDR  =  xx.xxx.xx.xxx

REMOTE_PORT  =  61296

REQUEST_METHOD  =  GET

Is there anything of use in that list? Or is there something else I can do to actually get the HTTP referrer? If not, how are other ESPs managing to find out whether Gmail was used to view an email?

By the way, I'd appreciate it if we could hold back on whether this is ethical or not. Many ESPs do this already; I just don't want to pay for their service and I want to do it internally.

Thanks all for any implementation advice.

Update

Just thought I would update this question and make it clearer in light of the bounty.

I would like to find out when a user opens my email when it is sent to a Gmail inbox. Assume I have the usual transparent image tracking and the user does not block images.

I would like to do this with the single request and the header details I get when the transparent image is requested.
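For reference, here is a minimal sketch of the kind of tracking-pixel endpoint being described. The function names (`transparentPixel`, `describeRequest`, `servePixel`) are hypothetical and not from my actual code; the sketch only assumes the request details arrive in `$_SERVER` as in the dump above.

```php
<?php
// Hypothetical helper names; a sketch, not a definitive implementation.

/** Returns the bytes of a 1x1 transparent GIF (a commonly used 42-byte image). */
function transparentPixel(): string
{
    return base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
}

/** Picks out the request details that are useful for open tracking. */
function describeRequest(array $server): array
{
    return [
        'query'      => $server['QUERY_STRING'] ?? '',
        'referer'    => $server['HTTP_REFERER'] ?? null, // null when the client withholds it
        'user_agent' => $server['HTTP_USER_AGENT'] ?? '',
        'ip'         => $server['REMOTE_ADDR'] ?? '',
    ];
}

/** Logs the request and sends the pixel; meant to be called from the image endpoint. */
function servePixel(array $server): void
{
    error_log(json_encode(describeRequest($server)));
    header('Content-Type: image/gif');
    header('Cache-Control: no-store, no-cache, must-revalidate');
    echo transparentPixel();
}
```

The endpoint itself would then just call `servePixel($_SERVER);`. Keeping the header extraction in its own function makes it easy to check what a given client actually sends.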

Abs
  • GMail are intentionally blocking this, so there may be no way around it: http://stackoverflow.com/questions/4264846/how-to-get-the-http-referer-from-a-yahoo-or-gmail – Pekka Feb 19 '11 at 19:11
  • @Pekka - hmm I see. Any idea how other email tracking services identify if GMail was used by a subscriber? Maybe the remote port stays the same for GMail?? If you look at this image, you can see campaign monitor is able to identify GMail! http://i3.campaignmonitor.com/uploads/images/email-clients-big.jpg – Abs Feb 19 '11 at 19:14
  • interesting. I'd have said they just do a cheap check on `@gmail.com` addresses but they claim they can really, actually find out. No idea how they do that – Pekka Feb 19 '11 at 19:16
  • @Pekka - yes, that is why I am surprised and I would like to do the same to be honest! – Abs Feb 19 '11 at 19:20
  • @Pekka while that might be the case for clicking links, it's not the case for images. What's happening here is that HTTPS->HTTP does not leak referrer information. The solution is to host the images on HTTPS. – Yahel Feb 22 '11 at 03:55

2 Answers

19

Are your images requested over plain HTTP rather than HTTPS?

If so, that's the problem.

HTTPS->HTTP referrals do not leak a Referer Header (HTTP_REFERER).

If you embed an HTTP-hosted image in an email that is viewed on an HTTPS page, the browser won't send a Referer. (HTTP pages requesting HTTPS resources, however, do send one.)

The solution is to embed the image over HTTPS. I've tested it, and sure enough, images served over HTTPS do indeed receive the Referer.

One way Gmail could block the referrer information on loaded images by default is if they used a referrer policy, which is supported on most modern browsers. (As of 2011, they did not implement such a policy.)

See the screenshot below of an embedded image that is generated dynamically from the HTTP Referer of the request: [screenshot]
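Once the image is served over HTTPS and the Referer arrives, mapping it to a client could look roughly like the sketch below. The function name `clientFromReferer` and the host list are assumptions for illustration; check the hosts against what actually shows up in your own logs.

```php
<?php
/**
 * Guesses the webmail client from the Referer header.
 * The host-to-client map is illustrative only, not an authoritative list.
 */
function clientFromReferer(?string $referer): string
{
    if ($referer === null || $referer === '') {
        return 'unknown'; // no Referer, e.g. an HTTPS page loading an HTTP image
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return 'unknown'; // not a parseable URL
    }
    $host = strtolower($host);
    $clients = [
        'mail.google.com' => 'gmail',
        'mail.yahoo.com'  => 'yahoo',
        'mail.live.com'   => 'hotmail',
    ];
    foreach ($clients as $suffix => $name) {
        // Match the host itself or any subdomain of it.
        if ($host === $suffix || substr($host, -strlen('.' . $suffix)) === '.' . $suffix) {
            return $name;
        }
    }
    return 'unknown';
}
```

In the image script this would be called as `clientFromReferer($_SERVER['HTTP_REFERER'] ?? null)`.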

Yahel
  • So I would have to have the email address in the tracking code? I was hoping not to go to those lengths as I am able to determine other email clients without doing this and it seems a bit of a pain doing it for just GMail users! – Abs Feb 21 '11 at 23:45
  • @Abs preferably not, as that's less secure. Instead, some sort of hash or token that you could match to a particular email address on the backend. But, yes, this does create a bit of an analytics nuisance if you're not already doing it. – Yahel Feb 21 '11 at 23:47
  • @Abs the requesting IP address is of the end user/client, not of the Gmail server. – Yahel Feb 21 '11 at 23:48
  • Just realised and then removed my comment! Didn't see your comment though. – Abs Feb 21 '11 at 23:49
  • +1 very nice. I would drop the "test the E-Mail address" bit entirely and always go with the image method, though - GMail users could be fetching their mail through POP. – Pekka Feb 22 '11 at 09:27
  • You are right, my image is requested on plain HTTP. Very interesting, I am going to try HTTPS...after work! – Abs Feb 22 '11 at 10:10
  • Actually, I got to try it, didn't know one of my old servers had SSL. And it works beautifully! Thank you very much yc! I have to wait 12 more hours to award you bounty. :) – Abs Feb 22 '11 at 10:26
  • You could use a protocol relative url here so it will work on http or https. Just link to "//domain.com/image.jpg" instead of "https://domain.com/image.jpg". Read more here: http://paulirish.com/2010/the-protocol-relative-url/ – Lance Fisher Jun 24 '11 at 06:18
  • @LanceFisher it'd be worth trying out; I'd be afraid that email clients, which have a habit of doing evil things to markup, might balk or mangle it. – Yahel Jun 24 '11 at 22:47
  • +1, I wish I could give like +5. This is exactly what I need. My only concern is though, will that leak actually be plugged? because it will completely ruin this whole operation if it does. – Etienne Marais Jan 22 '13 at 10:08
  • Nope, referrer will always be preserved with images embedded HTTPS->HTTPS; to do otherwise would break with the spec. – Yahel Jan 22 '13 at 16:40
  • @LanceFisher using protocol relative URLs is not a good idea in email. It can cause Outlook to crash or load very very slowly, since Outlook looks for the image on the local file system rather than on the web. – Yahel Jan 28 '13 at 15:25
  • In case someone finds this thread, Gmail now proxies all images in emails to prevent per-user image-based tracking. http://gmailblog.blogspot.com/2013/12/images-now-showing.html You can get around this by creating unique image filenames for each user. More on that here: http://www.redant.com.au/how-we-do/cache-busting-gmail-new-image-caching/ – Brady Emerson Aug 11 '14 at 18:01
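
The unique-filename idea from the last comment can be sketched as follows. All names here (`newTrackingToken`, `trackingImageUrl`, `isGmailImageProxy`) and the `/open/<token>.gif` URL layout are hypothetical. The `GoogleImageProxy` marker is an assumption based on the User-Agent Gmail's proxy has commonly been observed to send; verify it against your own logs before relying on it.

```php
<?php
/** Creates a random token to store alongside the recipient when sending. */
function newTrackingToken(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

/** Builds a per-recipient image URL; the path layout is an assumption. */
function trackingImageUrl(string $baseUrl, string $token): string
{
    return rtrim($baseUrl, '/') . '/open/' . rawurlencode($token) . '.gif';
}

/** True when the User-Agent looks like Gmail's image proxy (assumed marker string). */
function isGmailImageProxy(string $userAgent): bool
{
    return stripos($userAgent, 'GoogleImageProxy') !== false;
}
```

Because every recipient gets a different URL, a cached copy of one recipient's image cannot hide another recipient's open.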
0

Make the link something like http://www.example.com/image.jpg?h=8dh38dj

image.jpg is a PHP script and 8dh38dj is a hash identifying the email you included the link in. When the user requests the file, your PHP script gets '8dh38dj', looks it up in your database, and finds the matching email address. Parse the domain out of the address (gmail.com from example@gmail.com) and you know it is from Gmail. To make .jpg files execute as PHP, use an AddHandler directive in your Apache configuration (for example in .htaccess).
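
A rough sketch of that lookup, with hypothetical function names and a plain array standing in for the database table:

```php
<?php
/** Returns the lower-cased domain of an email address, or null if there is none. */
function emailDomain(string $email): ?string
{
    $at = strrpos($email, '@');
    if ($at === false || $at === strlen($email) - 1) {
        return null; // no "@" or nothing after it
    }
    return strtolower(substr($email, $at + 1));
}

/** Looks up the recipient for a token; the array is a stand-in for a database query. */
function recipientForToken(array $recipientsByToken, string $token): ?string
{
    return $recipientsByToken[$token] ?? null;
}

/** True for addresses on Gmail's own domains (custom hosted domains are not covered). */
function isGmailAddress(string $email): bool
{
    return in_array(emailDomain($email), ['gmail.com', 'googlemail.com'], true);
}
```

In the image script you would read the hash with `$_GET['h']`, pass it to `recipientForToken()`, and then check the result with `isGmailAddress()`.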

Nick
  • This doesn't tell him which E-Mail client the user is using, though. – Pekka Feb 19 '11 at 19:17
  • That will work for GMail I guess, but it will require some code change whereas if I can do it with just the http headers like I have done for hotmail, yahoo, outlook etc it would make my job so much easier! – Abs Feb 19 '11 at 19:19
  • GMail provide IMAP access. GMail users may not be using the GMail client. – Quentin Feb 19 '11 at 19:22
  • @David - very good point. But I am aware of this and many other things that will skew my data but we all know email tracking isn't an exact science!! – Abs Feb 19 '11 at 19:33
  • @Abs GMail also provide hosted mail on custom domains... It's likely to skew the results massively. Other than that, this method is fine to find out whether an E-Mail has been read (although it's not waterproof, many mail clients block external resources) – Pekka Feb 19 '11 at 21:21
  • @Pekka @David - this approach is actually pretty good when you combine it with the user agent. For IMAP access, the email client (Thunderbird, Outlook etc.) will send the user agent, and that is enough to figure out the client. For browser access, the approach Nick suggested is great. For custom domains, you just need to do an MX lookup - the MX records will point to Google's servers. – Sripathi Krishnan Feb 26 '11 at 20:03
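
The MX check suggested in the last comment could be sketched like this. The function names and the list of Google host suffixes are assumptions for illustration; `getmxrr()` is PHP's built-in MX lookup and needs working DNS.

```php
<?php
/** True when any MX host belongs to Google (the suffix list is an assumption). */
function mxHostsPointToGoogle(array $mxHosts): bool
{
    foreach ($mxHosts as $host) {
        $host = rtrim(strtolower($host), '.'); // normalise case and trailing dot
        foreach (['google.com', 'googlemail.com'] as $suffix) {
            if ($host === $suffix || substr($host, -strlen($suffix) - 1) === '.' . $suffix) {
                return true;
            }
        }
    }
    return false;
}

/** Resolves the MX records for a domain and checks them; requires DNS access. */
function domainUsesGoogleMail(string $domain): bool
{
    $hosts = [];
    if (!getmxrr($domain, $hosts)) {
        return false; // lookup failed or no MX records
    }
    return mxHostsPointToGoogle($hosts);
}
```

Splitting the suffix check from the DNS call keeps the matching logic testable without network access, and the lookup result is worth caching per domain rather than repeating on every open.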