Getting images with HTTP Request in C

Question

I am writing a program in C that acts like a proxy server on a Linux system: the client asks it for a web page, it sends an HTTP GET request to a remote server, and it saves the server's response (the web page) in an .html file.

Here is my problem: most web sites contain references to images, so when I try to view the .html file the proxy created, the images don't appear.

I have searched a lot but found nothing. Is there a way to write some code to GET the images too?

Thank you in advance

score 1 · Accepted Answer · answered Nov 28 '11 at 01:25

You're going to have to write code that parses the HTML file you get back and looks for image references (img tags), then queries the server for those image files. This is what web browsers are doing under the hood.
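As a rough illustration of those two steps, here is a minimal sketch in C. The function names next_img_src and build_get_request are made up for this example, and the parsing is deliberately naive: it assumes lowercase tag and attribute names, quoted src values, and absolute http:// URLs. A real proxy would want a proper HTML parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find the next <img ...> tag at or after `html` and
 * copy its quoted src value into `out` (truncated to outsz-1 bytes).
 * Returns a pointer just past that tag, or NULL when no further <img> tag
 * with a quoted src is found.
 * Known limits: case-sensitive, would also match "data-src=", and is
 * confused by a '>' inside an attribute value.
 */
const char *next_img_src(const char *html, char *out, size_t outsz)
{
    assert(html != NULL && out != NULL && outsz > 0);
    const char *tag = html;
    while ((tag = strstr(tag, "<img")) != NULL) {
        const char *end = strchr(tag, '>');
        if (end == NULL)
            return NULL;                      /* unterminated tag */
        const char *src = strstr(tag, "src=");
        if (src != NULL && src < end) {
            src += 4;                         /* skip past src= */
            char quote = *src;
            if (quote == '"' || quote == '\'') {
                const char *start = src + 1;
                const char *close = strchr(start, quote);
                if (close != NULL && close < end) {
                    size_t len = (size_t)(close - start);
                    if (len >= outsz)
                        len = outsz - 1;
                    memcpy(out, start, len);
                    out[len] = '\0';
                    return end + 1;
                }
            }
        }
        tag = end + 1;                        /* no usable src, keep looking */
    }
    return NULL;
}

/*
 * Hypothetical helper: format an HTTP/1.0 GET request for an absolute
 * http:// URL. Returns 0 on success, -1 if the URL does not start with
 * http:// or the request does not fit in `out`.
 */
int build_get_request(const char *url, char *out, size_t outsz)
{
    const char *prefix = "http://";
    assert(url != NULL && out != NULL);
    if (strncmp(url, prefix, strlen(prefix)) != 0)
        return -1;
    const char *host = url + strlen(prefix);
    size_t hostlen = strcspn(host, "/");      /* host ends at the first '/' */
    if (hostlen == 0)
        return -1;
    const char *path = host[hostlen] != '\0' ? host + hostlen : "/";
    int n = snprintf(out, outsz, "GET %s HTTP/1.0\r\nHost: %.*s\r\n\r\n",
                     path, (int)hostlen, host);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

You would call next_img_src in a loop, feeding the returned pointer back in until it returns NULL, and send each request over a socket the same way you already fetch the page. Relative src values would first have to be resolved against the page's URL. Keep in mind that an image body is binary data, so write it out by byte count (for example with fwrite) rather than with string functions that stop at a zero byte.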

You have an additional problem, though: the image references in the HTML file point to the original server. I'm assuming that, since they don't load for you, the server that returned the original HTML isn't available. In that case, after you get each image file you will need to give it a name on the local filesystem and then alter the reference in the HTML (programmatically) to point to your new local image name.

So for example:

<img src='http://example.com/image1.png'>

would become

<img src='localImage1.png'>
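A rewrite like the one above could be sketched in C roughly as follows. Both function names are hypothetical, and the naming scheme (reuse the last path segment of the URL) is only an assumption; it differs from the localImage1.png name used above, and two images with the same file name on different paths would collide, so a real proxy may prefer a counter or a hash.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: derive a local file name from an image URL by
 * dropping any query string or fragment and keeping what follows the
 * last '/'. Falls back to "image" when nothing is left.
 */
void local_name_for_url(const char *url, char *out, size_t outsz)
{
    assert(url != NULL && out != NULL && outsz > 0);
    size_t end = strcspn(url, "?#");          /* cut at query or fragment */
    size_t start = end;
    while (start > 0 && url[start - 1] != '/')
        start--;                              /* back up to the last '/' */
    const char *base = url + start;
    size_t len = end - start;
    if (len == 0) {
        base = "image";
        len = strlen(base);
    }
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, base, len);
    out[len] = '\0';
}

/*
 * Hypothetical helper: copy `html` into `out`, replacing every occurrence
 * of `url` with `local`. Returns 0 on success, -1 if `out` is too small.
 * Note: this replaces the URL text wherever it appears, not only inside
 * img tags.
 */
int rewrite_refs(const char *html, const char *url, const char *local,
                 char *out, size_t outsz)
{
    assert(html != NULL && url != NULL && local != NULL && out != NULL);
    size_t ulen = strlen(url), llen = strlen(local), used = 0;
    assert(ulen > 0);
    while (*html != '\0') {
        const char *hit = strstr(html, url);
        size_t chunk = hit ? (size_t)(hit - html) : strlen(html);
        size_t extra = hit ? llen : 0;
        if (used + chunk + extra + 1 > outsz)
            return -1;                        /* not enough room */
        memcpy(out + used, html, chunk);      /* text before the match */
        used += chunk;
        if (hit) {
            memcpy(out + used, local, llen);  /* the local name instead */
            used += llen;
        }
        html += chunk + (hit ? ulen : 0);
    }
    if (used + 1 > outsz)
        return -1;
    out[used] = '\0';
    return 0;
}
```

The idea is to call local_name_for_url once per downloaded image, save the image bytes under that name next to the .html file, and then run rewrite_refs over the page before writing it to disk.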

If you're querying arbitrary websites, you'll also find various other files that need the same treatment, such as CSS and JavaScript files. In general, it's hard to mirror arbitrary web pages accurately. Browsers have complex object models for interpreting web pages because they have to deal with things like CSS and JavaScript, and you may need to be able to 'run' all that dynamic code just to be sure which files to download from the server (e.g. JavaScript that includes other JavaScript).
