You can speak HTTP over a raw TCP socket. Since you didn't provide code, I can't give code tailored to your program. If you already know how to connect, send data to, and receive data from a server, it should be easy. Just follow the steps below.
Let's assume you want to connect to www.cnn.com.
1. Convert the domain name of the website to an IP address.
2. Connect to that IP address with port 80.
3. Send the string GET / HTTP/1.1\r\nHost: www.cnn.com\r\nConnection: close\r\n\r\n
4. Read from the socket. If the server is available, it will respond with the response headers followed by the body, normally the HTML of that page.
5. Close the socket connection.
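As a rough sketch of the five steps above, assuming Python and its standard socket module (the answer itself is language-agnostic, and the names build_request and fetch are made up for illustration):

```python
import socket


def build_request(host, path="/"):
    # Step 3: the request text. Each line ends with \r\n and an empty
    # line (a second \r\n) marks the end of the headers.
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


def fetch(host, path="/", port=80, timeout=10.0):
    # Step 1: convert the domain name to an IP address.
    ip = socket.gethostbyname(host)
    # Step 2: connect to that IP address on port 80.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((ip, port))
        # Step 3: send the request string as bytes.
        sock.sendall(build_request(host, path).encode("ascii"))
        # Step 4: read until the server closes the connection,
        # which it does because of "Connection: close".
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    finally:
        # Step 5: close the socket connection.
        sock.close()
    return b"".join(chunks)
```

Calling fetch("www.cnn.com") would return the raw response bytes, status line and headers included.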
Note that some websites will not respond, or will even block you, if you don't provide a User-Agent header (the name of the web browser you are using). To fix this, in step 3, add a User-Agent: MyBrowserName\r\n header to the string. You can fake browsers. You must put \r\n after each header.
For example, the Chrome browser I am using identifies itself as Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36.
Your new string to send in step 3 should look something like this:
GET / HTTP/1.1\r\nHost: www.cnn.com\r\nConnection: close\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36\r\n\r\n
Notice that there is \r\n after each header. The last header ends with \r\n\r\n instead of \r\n, because the extra empty line tells the server that the headers are finished.
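One way to keep the \r\n rule straight is to build the string from a list of headers. A minimal Python sketch, where build_request and USER_AGENT are hypothetical names and not part of any library:

```python
def build_request(host, path="/", extra_headers=None):
    # Host and Connection come first, then any extra headers in the order given.
    headers = [("Host", host), ("Connection", "close")]
    headers.extend(extra_headers or [])
    lines = [f"GET {path} HTTP/1.1"]
    lines.extend(f"{name}: {value}" for name, value in headers)
    # Joining puts \r\n between lines; the final \r\n\r\n ends the last
    # header and adds the empty line that closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36"
)

request = build_request("www.cnn.com", extra_headers=[("User-Agent", USER_AGENT)])
```

Here request holds the same text as the string shown above, ready to be encoded to bytes and sent on the socket.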
Other useful headers are Connection: Keep-Alive\r\n (in place of Connection: close), Accept-Language: en-us\r\n, and Accept-Encoding: gzip, deflate\r\n.
Replace port 80 with 443 if the website is https instead of http. Things get complicated from here because the request has to travel over the SSL/TLS protocol, which you either implement yourself or get from a library.
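In many languages you don't have to write the SSL/TLS handshake yourself, because a library can wrap the socket for you. As one illustration, assuming Python and its standard ssl module (fetch_https is a made-up name, and this is a sketch, not a hardened client):

```python
import socket
import ssl


def fetch_https(host, path="/", port=443, timeout=10.0):
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    # The default context verifies the server certificate and host name.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the TLS handshake; server_hostname is used
        # for SNI and for checking the certificate.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            # From here on the HTTP text is the same as on port 80.
            tls_sock.sendall(request)
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

The HTTP request itself does not change; only the transport underneath it does.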
Suppose you want to access a page in another directory instead of the home page, and the URL is http://www.cnn.com/2016/05/13/health/healthy-eating-quiz/index.html
The string to send should look like this:
GET /2016/05/13/health/healthy-eating-quiz/index.html HTTP/1.1\r\nHost: www.cnn.com\r\nConnection: close\r\n\r\n
If you are using a proxy, you have to put the whole URL after the GET command:
GET http://www.cnn.com/2016/05/13/health/healthy-eating-quiz/index.html HTTP/1.1\r\nHost: www.cnn.com\r\nConnection: close\r\n\r\n
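The direct and proxy forms differ only in what follows GET. A small Python sketch that derives both from a URL, using urlsplit from the standard urllib.parse module (build_request and the via_proxy flag are hypothetical names chosen for this example):

```python
from urllib.parse import urlsplit


def build_request(url, via_proxy=False):
    parts = urlsplit(url)
    # An empty path means the home page; keep the query string if there is one.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Through a proxy the whole URL goes after GET; otherwise only the path.
    target = url if via_proxy else path
    # netloc is used as the Host value, which is fine for simple URLs
    # without user names or passwords in them.
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
```

With a proxy, you would connect the socket to the proxy's address and port (not to www.cnn.com) and send the string built with via_proxy=True.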