I have a forking HTTP proxy implemented on my Ubuntu 14.04 x86_64 with the following scheme (I'm reporting the essential code and pseudocode just to show the concept):
1. socketClient = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
2. bind(socketClient, (struct sockaddr*)&addr, sizeof(addr));
3. listen(socketClient, 50);
4. newSocket = accept(socketClient, (struct sockaddr*)&cliAddr, &cliAddrLen); (cliAddrLen is a socklen_t initialized to sizeof(cliAddr), since accept() expects a pointer there)
5. get the request from the client and parse it to resolve the requested hostname to an IP address;
6. fork(), open a connection to the remote server and handle the request;
7. child process: if it is a GET request, send the original request to the server and, while the server is sending data, forward that data to the client;
8. child process: else if it is a CONNECT request, send the string 200 ok to the client and poll both the client socket descriptor and the server socket descriptor with select(); if I read data from the server socket, send it to the client; else if I read data from the client socket, send it to the server.
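To make the CONNECT branch described above concrete, here is a minimal sketch of such a select() relay loop. The names relay_tunnel, send_all, to_server and to_client are hypothetical and not taken from my real code; the sketch also ignores EINTR and stops as soon as either side closes, which a real proxy may want to handle more gracefully.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send all len bytes on fd; returns 0 on success, -1 on error. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Relay bytes between client_fd and server_fd until one side closes.
 * Returns 0 on an orderly close, -1 on error. The byte totals per
 * direction are stored in the (hypothetical) out-parameters.
 */
int relay_tunnel(int client_fd, int server_fd,
                 unsigned long *to_server, unsigned long *to_client)
{
    char buf[4096];
    int maxfd = (client_fd > server_fd ? client_fd : server_fd) + 1;

    *to_server = 0;
    *to_client = 0;
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(client_fd, &rfds);
        FD_SET(server_fd, &rfds);
        if (select(maxfd, &rfds, NULL, NULL, NULL) < 0)
            return -1;

        if (FD_ISSET(server_fd, &rfds)) {
            ssize_t n = recv(server_fd, buf, sizeof buf, 0);
            if (n < 0) return -1;
            if (n == 0) return 0;            /* server closed */
            if (send_all(client_fd, buf, (size_t)n) < 0) return -1;
            *to_client += (unsigned long)n;
        }
        if (FD_ISSET(client_fd, &rfds)) {
            ssize_t n = recv(client_fd, buf, sizeof buf, 0);
            if (n < 0) return -1;
            if (n == 0) return 0;            /* client closed */
            if (send_all(server_fd, buf, (size_t)n) < 0) return -1;
            *to_server += (unsigned long)n;
        }
    }
}
```

At this level the only thing that can be counted reliably is bytes per direction, which is why the two counters are byte totals rather than packet counts.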
The good thing is that this proxy works; the bad thing is that I now have to collect statistics, and I'm working at a level where I can't get the data I'm interested in. I don't care about the payload, I just need to inspect a few flags in the IP and TCP headers.
For example, I'm interested in:
- connection tracking;
- number of packets sent and received.
As for the first, I would check the TCP header for the SYN flag, then the SYN/ACK and then the final ACK; as for the second, I would just increment a counter of mine every time a char buffer[1500] is filled with data when I send() or recv() a full packet.
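The connection-tracking idea (SYN, then SYN/ACK, then a final ACK) can be sketched as a tiny state machine fed with the flags byte of each observed segment. This is only an illustration under simplifying assumptions: the names hs_state and hs_step are hypothetical, and the sketch ignores segment direction, retransmissions and RST.

```c
#include <stdint.h>

/* TCP flag bits as they appear in byte 13 of the TCP header. */
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_ACK 0x10

enum hs_state { HS_IDLE, HS_SYN_SEEN, HS_SYNACK_SEEN, HS_ESTABLISHED };

/* Advance the handshake state given the flags byte of one segment. */
enum hs_state hs_step(enum hs_state st, uint8_t flags)
{
    int syn = (flags & TCP_FLAG_SYN) != 0;
    int ack = (flags & TCP_FLAG_ACK) != 0;

    switch (st) {
    case HS_IDLE:        /* waiting for a bare SYN */
        return (syn && !ack) ? HS_SYN_SEEN : HS_IDLE;
    case HS_SYN_SEEN:    /* waiting for SYN/ACK */
        return (syn && ack) ? HS_SYNACK_SEEN : HS_SYN_SEEN;
    case HS_SYNACK_SEEN: /* waiting for the final ACK */
        return (!syn && ack) ? HS_ESTABLISHED : HS_SYNACK_SEEN;
    default:             /* already established: nothing to do */
        return st;
    }
}
```

Feeding it the flag bytes 0x02, 0x12 and 0x10 in that order takes a connection from HS_IDLE to HS_ESTABLISHED; the hard part is getting those flag bytes in the first place.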
I realized that this is not correct: SOCK_STREAM doesn't have the concept of a packet, it is just a continuous stream of bytes! The char buffer[1500] I use at points 7 and 8 holds no useful statistics: I could set its capacity to 4096 bytes and I still couldn't keep track of the TCP packets sent or received, because how often the buffer fills says nothing about how many TCP segments actually crossed the wire.
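A small sketch of why counting recv() calls is meaningless on a stream socket: the function below (drain_stream is a hypothetical name) reads until EOF and reports both the byte total and the number of recv() calls that returned data. Only the byte total is determined by what the peer sent; the call count depends on buffering and timing.

```c
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Read from a stream socket until EOF.
 * Returns the total number of bytes read, or -1 on error.
 * *calls receives how many recv() calls returned data.
 */
long drain_stream(int fd, int *calls)
{
    char buf[4096];
    long total = 0;

    *calls = 0;
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n < 0)
            return -1;
        if (n == 0)          /* peer closed: end of stream */
            return total;
        total += n;
        (*calls)++;
    }
}
```

If the peer does three send() calls of 100 bytes each, the total is always 300, but the reader may see those bytes in one, two or three recv() calls.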
I couldn't parse the char buffer[] looking for the SYN flag in the TCP header either, because the IP and TCP headers are stripped before the data reaches me (because of the level I'm working on, specified with the IPPROTO_TCP flag) and, if I understood correctly, the char buffer[] contains only the payload, which is useless to me.
So, if I'm working at too high a level, I should go lower: I once saw a simple raw socket sniffer where an unsigned char buffer[65535] was cast to struct ethhdr, iphdr, tcphdr and it could see all the flags of all the headers, all the stats I'm interested in!
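As a rough illustration of what such a sniffer extracts, the sketch below parses a buffer that starts at the IPv4 header and pulls out the TCP ports and the flags byte. Instead of casting to struct iphdr and struct tcphdr it reads the fields by byte offset, which sidesteps alignment concerns; struct tcp_info and parse_ipv4_tcp are hypothetical names. For a frame captured with an Ethernet header in front, the first 14 bytes (untagged Ethernet) would have to be skipped before calling it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct tcp_info {
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  flags;   /* FIN=0x01 SYN=0x02 RST=0x04 PSH=0x08 ACK=0x10 */
};

/*
 * Parse an IPv4 packet (buffer starts at the IP header) carrying TCP.
 * Returns 0 on success, -1 if the buffer is not a well-formed
 * IPv4/TCP packet.
 */
int parse_ipv4_tcp(const unsigned char *pkt, size_t len,
                   struct tcp_info *out)
{
    if (len < 20 || (pkt[0] >> 4) != 4)        /* too short or not IPv4 */
        return -1;

    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;  /* IP header length, bytes */
    if (ihl < 20 || len < ihl + 20)            /* need a full TCP header */
        return -1;
    if (pkt[9] != 6)                           /* protocol 6 = TCP */
        return -1;

    const unsigned char *tcp = pkt + ihl;
    out->src_port = (uint16_t)((tcp[0] << 8) | tcp[1]);
    out->dst_port = (uint16_t)((tcp[2] << 8) | tcp[3]);
    out->flags    = tcp[13];                   /* flags byte */
    return 0;
}
```

A SYN/ACK segment shows up with flags equal to 0x12, a bare SYN with 0x02, and so on.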
After the joy, the disappointment: since raw sockets work at a low level, they lack some concepts vital to my proxy; raw sockets can't bind, listen and accept; my proxy is listening on a fixed port, but raw sockets don't know what a port is, since it belongs to the TCP level, and they bind to a specified interface with setsockopt.
So, if I used socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL)) I should be able to parse the buffer where I recv() and send() at points 7 and 8, but I would have to use recvfrom() and sendto()... but all this sounds quite messy, and it involves a fair amount of refactoring of my code.
How can I keep the structure of my proxy intact (bind, listen, accept on a fixed port and interface) and still get visibility into the IP and TCP headers?