We're trying to create a trackback system where an outside web publisher can put some HTML on a page on their website that links back to a specific product page on our site. Let's call that snippet a 'badge' for purposes of this question.

Once they've inserted the badge, we want to detect that, then grab the page's <h1> and first <p> as a teaser for a link from our site back to theirs, and write all of this to our database. Our users can then see the title and first bit of the publisher's page and decide whether they want to see more.

Here's what we've done (not much I'm afraid):

<a href="http://www.mysite.com/abc.html">
<img alt="abc" src="http://www.mysite.com/logo.gif" style="width:200px;height:100px" />       
</a>

We're planning to build an admin page for the last part (grabbing the <h1> and <p> and posting them to the live database); we'll figure that out later.

However, we're at a loss on the middle step: identifying that this piece of HTML has been used.

Is this something we should be doing through a log file? I have no clue how to even begin thinking about it.

A little direction on where to begin working on this problem would be very helpful.

Thanks in advance!!

Kevin
  • Where are the `<h1>` and `<p>` of which you speak? – glomad Jan 10 '13 at 22:23
  • These would be in the outside website that a publisher put our 'badge' on. Disregard this, I suppose; I confused the issue. The main question is: how can we know that someone has pasted that HTML onto their site? – Kevin Jan 10 '13 at 22:51
  • To just get notified that a user pasted the code into their website, you can use a hidden `img` tag with its `src` set to a server-side script, but it would be one-sided. – The Alpha Jan 10 '13 at 23:07
  • Dumb question maybe, but what do you mean by one sided? Whose side? – Kevin Jan 10 '13 at 23:11
  • You can't get any data back to the client side. – The Alpha Jan 10 '13 at 23:12
  • Got it, but we could know on our side that the script has been run and then take action? However, we'd need to know the URL of the page on their website where they put the code. It seems like this would be possible with what you are describing. If so, that would be perfect. Thanks for the suggestion. – Kevin Jan 10 '13 at 23:14

1 Answer

This is one approach.

You give them HTML which looks something like:

<a href="http://www.mysite.com/abc.html">
    <img alt="abc" src="http://www.mysite.com/logo.php" style="width:200px;height:100px" />       
</a>

Notice that says logo.php, not logo.gif.

logo.php will live on your server. Its purpose is twofold:

  1. Gather information about the page holding the <img> tag
  2. Load and output logo.gif so the users see the image as expected.

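If you embed that HTML on a webpage somewhere, logo.php will have information about where the request for the image originated. Specifically, $_SERVER['HTTP_REFERER'] will normally hold the URL of the page where the img tag resides. That value comes from the browser's Referer header, which is optional: it can be missing, or shortened to just the site's origin, depending on the browser and the page's referrer policy. It is then up to you to decide how to process and store that information.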

I don't know exactly what you want to do, but a very simplified logo.php would look something like this:

<?php
// The Referer header is optional and client-supplied, so it may be missing.
$url = $_SERVER['HTTP_REFERER'] ?? null;

if ($url !== null) {
    // do something with $url...
    // it will be something like "http://theirsite.com/wherever/they/pasted/the.html"
}

// now output the logo image...
header("Content-Type: image/gif");
readfile("/path/to/logo.gif");
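
Since the Referer value is supplied by the visitor's browser, it is probably worth sanity-checking it before doing anything with it. The following is only a sketch of one way to do that: `badge_page_url()` is a hypothetical helper name, and the choice to ignore requests coming from your own host (`www.mysite.com`, as in the badge above) is an assumption about what you want.

```php
<?php
// Hypothetical helper: returns the embedding page's URL, or null when the
// Referer is missing, malformed, not http(s), or points at our own site.
function badge_page_url(?string $referer, string $ownHost): ?string
{
    if ($referer === null || $referer === '') {
        return null;
    }

    $parts = parse_url($referer);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }

    $scheme = strtolower($parts['scheme']);
    $host   = strtolower($parts['host']);

    if (!in_array($scheme, ['http', 'https'], true)) {
        return null;
    }
    if ($host === strtolower($ownHost)) {
        return null; // assumption: our own pages do not count as trackbacks
    }

    // Rebuild the URL from its parts so scheme and host are lower-cased.
    $url = $scheme . '://' . $host;
    if (isset($parts['port'])) {
        $url .= ':' . $parts['port'];
    }
    $url .= $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $url .= '?' . $parts['query'];
    }
    return $url;
}

// In logo.php this would replace the direct read of $_SERVER['HTTP_REFERER'].
$url = badge_page_url($_SERVER['HTTP_REFERER'] ?? null, 'www.mysite.com');
```

A `null` result simply means "nothing usable to record"; the image should still be sent either way.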

Keep in mind that every time anyone hits their page with the image tag, logo.php will be run. So don't accidentally create 10000 links back to their site on your site :)
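
One way to avoid those duplicate links is to let the database enforce uniqueness, so that repeat hits become no-ops. This is a rough sketch, assuming PDO with SQLite; the table name `badge_sightings` and both function names are made up for illustration. `INSERT OR IGNORE` is SQLite syntax (MySQL's rough equivalent is `INSERT IGNORE`).

```php
<?php
// Hypothetical schema: one row per external page that embeds the badge.
function open_sightings_db(string $dsn): PDO
{
    $db = new PDO($dsn);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec(
        'CREATE TABLE IF NOT EXISTS badge_sightings (
            page_url   TEXT PRIMARY KEY,
            first_seen TEXT NOT NULL
        )'
    );
    return $db;
}

// Records the URL; returns true only the first time a given URL is seen.
function record_sighting(PDO $db, string $pageUrl): bool
{
    // The PRIMARY KEY makes a repeat insert a silent no-op.
    $stmt = $db->prepare(
        'INSERT OR IGNORE INTO badge_sightings (page_url, first_seen) VALUES (?, ?)'
    );
    $stmt->execute([$pageUrl, gmdate('c')]);
    return $stmt->rowCount() === 1;
}
```

In logo.php you would call `record_sighting($db, $url)` before sending the image, and only do follow-up work (such as queueing the page for review) when it returns `true`.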

glomad
  • Thanks, interesting idea. Right, that php page would have to check and make sure the URL wasn't already in the database before writing anything new. – Kevin Jan 10 '13 at 23:18
  • This is a very simplified overview, but this is the pattern that is always used to do this kind of thing - there is really no other way aside from embedding a Flash or Java object. Now you could take it further and use cURL to send a scraper over to $url to retrieve the HTML near the img tag on their page, but you will probably find that to be overkill. – glomad Jan 10 '13 at 23:23
  • Slight amendment - You could also give them a chunk of JavaScript to embed, instead of an img tag - this is how things like Google Analytics work - but if you're just starting out, this method (img tag + PHP script) will help you understand the model. – glomad Jan 10 '13 at 23:28
  • What are your thoughts about this killing our server? If a blog post was slammed with traffic one day, would this php page continually running be a huge load? Or is it not-too-bad? – Kevin Jan 10 '13 at 23:38
  • Assuming you don't go crazy with whatever you do with `$url`, I doubt you've got anything to worry about. A great man once said "[Premature optimization is the root of all evil.](http://c2.com/cgi/wiki?PrematureOptimization)" :) – glomad Jan 10 '13 at 23:41
  • Ha. Good point! Sorry, 1 last question: What are your thoughts on your method versus hidden img tag/src idea from above? Is this a case of 2 ways to skin a cat? – Kevin Jan 10 '13 at 23:48
  • I think we're saying exactly the same thing... In my example I set the src attribute to `http://mysite/logo.php`. //edit... I missed the 'hidden' part. Yes, 2 ways to skin a cat. The end result is identical, your webserver will end up doing nearly the same amount of work in both cases – glomad Jan 10 '13 at 23:50
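
For the teaser step mentioned in the question (the `<h1>` and first `<p>`), here is a minimal sketch using PHP's built-in DOMDocument. `extract_teaser()` is a hypothetical name, and it only parses an HTML string; fetching the page itself (for example with cURL, as suggested in the comments) is left out.

```php
<?php
// Hypothetical helper: pulls the text of the first <h1> and first <p>
// out of an HTML string. Either value is null when the tag is absent.
function extract_teaser(string $html): array
{
    if (trim($html) === '') {
        return ['title' => null, 'teaser' => null];
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; keep libxml's warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $firstText = function (string $tag) use ($doc): ?string {
        $node = $doc->getElementsByTagName($tag)->item(0);
        if ($node === null) {
            return null;
        }
        // Collapse runs of whitespace into single spaces.
        return trim(preg_replace('/\s+/', ' ', $node->textContent));
    };

    return ['title' => $firstText('h1'), 'teaser' => $firstText('p')];
}
```

The returned array could then be shown on the admin page for approval before anything is written to the live database.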