
I'm trying to read the data of an HTTP stream over TCP inside the Linux kernel. I can get most of the data from the `sk_buff` here. However, if the server pushes data to the client without a request, the data is not copied to user space, so I can no longer find it.

Using Wireshark, I can normally see the additional data as a single packet. So I think this data must go into the kernel somewhere, even if user space has not requested it. Is it possible to find all the data as it arrives from the network interface, the way Wireshark does? If so, where should I look?

Thanks! Any ideas are appreciated.

EDIT: This is different from another similar question. I couldn't even get the skb instance containing the data I need, because the client didn't request it, so that data is never copied into user space. Thanks for pointing me to that question, but I still need to find the correct skb instance first. I suspect I should catch the data at the point where it is received from the network interface.

  • Wireshark uses the `pcap` driver. See http://stackoverflow.com/questions/23189078/how-libpcap-receive-a-packet-from-the-driver – Barmar May 27 '16 at 19:52
  • @Barmar Thanks! And do you have any idea where the driver is inserted into the protocol stack? I guess perhaps that's the correct location I'm looking for. – zzy May 28 '16 at 14:31
  • @SamProtsenko Please find my further explanation at EDIT section. – zzy May 28 '16 at 14:32
  • @zzy Ok, I retracted my close vote. Please provide some actual code where things don't work for you. And I still don't understand where exactly your code operates -- in kernel-space or user-space? – Sam Protsenko May 28 '16 at 15:52
  • @SamProtsenko Thanks! I intend to work in the kernel, and my code simply prints out data here: http://lxr.free-electrons.com/source/net/core/datagram.c?v=3.11#L319. In more detail, I print the content that is copied to user space by `memcpy_toiovec()`. However, when the server sends data without a request, the data doesn't go through here, and I'm looking for where I can find such data inside the kernel. – zzy May 29 '16 at 15:11
  • @zzy I can't answer your question directly, but I can tell you this. Adding your code in `skb_copy_datagram_iovec()` is not only an intrusive (and hence bad) way to do things, it's also a rather unstable solution. Take a look at [this](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d3a9632f09153bc46a8077844e05e179f1c10c3f) commit. It completely removes `skb_copy_datagram_iovec()` from the kernel code base. This means that at some point (a newer kernel version) that function will no longer be there. So why not use something like a netfilter hook to do the same work? Is it not sufficient in your case? – Sam Protsenko May 29 '16 at 15:29
  • @SamProtsenko Thanks for the suggestion! I'm not familiar with netfilter hook and not sure whether it is sufficient for my case at this moment. But it seems like it's a fantastic direction to explore! – zzy May 30 '16 at 19:38
  • @zzy It seems like [XY question](http://xyproblem.info/). Maybe we can point you more suitable way if you tell us what you are trying to achieve in the end. – Sam Protsenko May 30 '16 at 19:51
  • @SamProtsenko Yes that makes sense. My ultimate goal is to find certain packet data in the kernel, and use netlink to send out these data to a certain application so that such data won't be touched by other applications. Therefore I chose to grab the content before it reached user space, or at the function `skb_copy_datagram_iovec()`. But I find I may not be able to grab all data there. Hope this is more clear. Thanks! – zzy May 30 '16 at 20:14
  • @zzy Ok, now it's more clear. Seems like netfilter hook with `NF_INET_PRE_ROUTING` should be used. You mentioned that you are unable to get some data using [this code](http://stackoverflow.com/a/29584449/3866447), and that is interesting, because you should get all incoming data by that hook. Please create another question on SO, but ask about how to achieve your ultimate goal, without any suggestions. And also go into more details about which packets you are unable to get and why (as it is not completely clear). – Sam Protsenko May 30 '16 at 21:15
  • @SamProtsenko Thanks! I tried to print all the packets in that code (http://stackoverflow.com/questions/29553990/print-tcp-packet-data/29584449#29584449) without any conditions (eg. TCP, HTTP) and I finally got the data I want. It's kind of strange because in wireshark, I can get the packet as TCP packet but in this code, I got it as UDP packet... But anyway, I can get the data now. Thanks so much for your patient help!!! – zzy May 31 '16 at 00:49
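
The TCP versus UDP confusion in the last comment can be checked directly against the IPv4 header: the Protocol field is the byte at offset 9, with 6 meaning TCP and 17 meaning UDP. The following is only a minimal user-space sketch of that check on a raw packet buffer. The helper names `ipv4_protocol` and `ipv4_protocol_name` are hypothetical and not part of any kernel API. Inside a netfilter hook, the same field should be readable as `ip_hdr(skb)->protocol`, assuming the skb carries an IPv4 packet.

```c
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers carried in the IPv4 header's Protocol field. */
#define PROTO_TCP 6
#define PROTO_UDP 17

/*
 * Hypothetical helper (not a kernel function): return the Protocol field
 * of a raw IPv4 packet, or -1 if the buffer is too short or not IPv4.
 */
int ipv4_protocol(const uint8_t *pkt, size_t len)
{
    size_t ihl;

    if (pkt == NULL || len < 20)        /* minimum IPv4 header is 20 bytes */
        return -1;
    if ((pkt[0] >> 4) != 4)             /* high nibble of byte 0 is the version */
        return -1;

    ihl = (size_t)(pkt[0] & 0x0f) * 4;  /* header length in bytes */
    if (ihl < 20 || len < ihl)
        return -1;

    return pkt[9];                      /* Protocol field at byte offset 9 */
}

/* Hypothetical helper: map the protocol number to a printable name. */
const char *ipv4_protocol_name(const uint8_t *pkt, size_t len)
{
    switch (ipv4_protocol(pkt, len)) {
    case PROTO_TCP:
        return "TCP";
    case PROTO_UDP:
        return "UDP";
    case -1:
        return "invalid";
    default:
        return "other";
    }
}
```

Printing this name next to each packet's payload would show whether a packet reported as UDP really has protocol 17 in its header, or whether the header is being read at the wrong offset.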
