
Hi, sorry if this is a stupid question (I just started learning network programming), but I've been looking all over Google for how files/data are divided into packets. I've read everywhere that files are somehow broken up into packets, have headers/footers applied as they go down through the layers of the OSI model, and are sent over the wire, where the recipient basically does the reverse and removes the headers.

My question is how exactly are files/data broken up into packets and how are they reassembled at the other end?

How does whatever doing the reassembling know when the last packet of the data has arrived and etc?

Is it possible to reassemble packets captured from another machine? And if so how?

(Also, if it means anything, I'm mostly interested in how this works for TCP packets.)

I also have some packets captured from an application on my computer through Wireshark. They're labeled as TCP protocol, and what I want to do is reassemble them back into the original data, but how can you tell which packets belong to which set of data?

Any pointers towards resources are much appreciated, thank you!

Edgepad

3 Answers

5

My question is how exactly are files/data broken up into packets

What's being sent over a network isn't necessarily a file. In the cases where it is a file, there are several different protocols that can send files, and the answer to the question depends on the protocol.

For FTP and HTTP, the entire contents of the file are probably sent as a single data stream over TCP (preceded by headers in the case of HTTP, and just raw, over the data connection, in the case of FTP).

For TCP, there's a "maximum segment size" negotiated by the client and server when the connection is set up, based on factors such as the maximum packet size on the various networks between the server and client. The file data is sent, sequentially, in chunks no larger than that, i.e. roughly the maximum packet size minus the size of the IP and TCP headers.
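As a rough illustration of that chunking, here is a minimal sketch. It is not how a real TCP stack is implemented; the `segment` function is a made-up name, and the 1460-byte MSS is only the value that commonly results from a 1500-byte Ethernet MTU minus 20 bytes each of IP and TCP header.

```python
def segment(data: bytes, mss: int = 1460, initial_seq: int = 0):
    """Split data into (sequence_number, payload) pairs of at most mss bytes.

    The sequence number of each pair is the position of its first byte in
    the stream, which is how TCP numbers data (real TCP starts from a
    random initial sequence number; 0 is used here for readability).
    """
    segments = []
    for offset in range(0, len(data), mss):
        segments.append((initial_seq + offset, data[offset:offset + mss]))
    return segments


if __name__ == "__main__":
    file_data = bytes(4000)  # stand-in for 4000 bytes of file contents
    for seq, payload in segment(file_data):
        print(seq, len(payload))
```

With 4000 bytes of data this yields three segments of 1460, 1460 and 1080 bytes, starting at offsets 0, 1460 and 2920.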

For remote file access protocols such as SMB, NFS, and AFP, what goes over the wire are "file read" and "file write" requests; the reply to a "file read" request includes some reply headers and, if the read is successful, the chunk of file data that the read request asked for, and a "file write" request includes some request headers and the chunk of file data being written. Those are not guaranteed to be an entire file, in order, but if the program reading or writing the file is reading or writing the entire file in sequential order, the entire file's data will be available. The packet sizes will depend on the size of the read reply/write request headers and on the read or write size being used; those packets might be broken into multiple TCP segments, based on the TCP "maximum segment size" and the size of the IP and TCP headers.

How does whatever doing the reassembling know when the last packet of the data has arrived and etc?

For FTP, the recipient of the data knows that there is no more data when the side of the TCP connection over which the data is being transmitted is closed.

For HTTP, the recipient of the data knows that there is no more data when the side of the TCP connection over which the data is being transmitted is closed or, if the connection is "persistent" (i.e., it remains open for more requests and replies), when the amount of data specified by the "Content-Length:" header, sent before the data, has been transmitted (or by other similar mechanisms, such as the "last chunk" indication for chunked encoding).
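To make the HTTP case concrete, here is a simplified sketch of how a receiver could decide whether it has the whole body yet. The `extract_body` function is hypothetical and ignores many details of real HTTP parsing (trailers, repeated headers, responses that never carry a body); it only shows the two end-of-data mechanisms mentioned above.

```python
def extract_body(raw: bytes):
    """Return the complete body of a raw HTTP/1.1 response, or None if the
    bytes received so far do not yet contain the whole body."""
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        return None  # the header block itself is not complete yet

    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()

    # Mechanism 1: the sender announced the body length up front.
    if b"content-length" in headers:
        length = int(headers[b"content-length"])
        return rest[:length] if len(rest) >= length else None

    # Mechanism 2: chunked encoding, terminated by a zero-length "last chunk".
    if headers.get(b"transfer-encoding", b"").lower() == b"chunked":
        body = b""
        while True:
            size_line, sep, rest = rest.partition(b"\r\n")
            if not sep:
                return None
            size = int(size_line.split(b";")[0], 16)  # chunk size is hexadecimal
            if size == 0:
                return body
            if len(rest) < size + 2:
                return None
            body += rest[:size]
            rest = rest[size + 2:]  # skip the chunk data and its trailing CRLF

    # Otherwise the body ends when the connection closes, which cannot be
    # determined from the bytes alone.
    return None
```

For example, `extract_body(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello")` returns `b"hello"`, while the same response cut off after `hel` returns `None`.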

For file access protocols, there's no real "we're at the end of data" indication; the closest approximation, for SMB, AFP, and NFSv4, is a "file close" operation.

Is it possible to reassemble packets captured from another machine? And if so how?

It depends on the protocol, but, for HTTP and SMB, if the capture has been read into Wireshark (and all the file data is in the capture!), you can use the "Export Objects" menu, and, for some protocols, you might also be able to use tcpflow.
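Before any reassembly, the captured packets have to be sorted into connections. A TCP connection is identified by its source address, source port, destination address and destination port, so grouping on those four values separates the streams. The sketch below assumes the packets have already been parsed into dictionaries with the hypothetical keys `src`, `sport`, `dst`, `dport` and `payload`; reading the capture file itself is left to a real parsing library or tool.

```python
from collections import defaultdict


def group_by_connection(packets):
    """Group parsed packets by (src, sport, dst, dport).

    Each direction of a connection gets its own key, so the client-to-server
    and server-to-client byte streams stay separate.
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
        flows[key].append(pkt)
    return dict(flows)


if __name__ == "__main__":
    # Made-up sample data standing in for a parsed capture.
    capture = [
        {"src": "10.0.0.2", "sport": 50000, "dst": "10.0.0.9", "dport": 80, "payload": b"GET / "},
        {"src": "10.0.0.9", "sport": 80, "dst": "10.0.0.2", "dport": 50000, "payload": b"HTTP/1.1 "},
        {"src": "10.0.0.2", "sport": 50000, "dst": "10.0.0.9", "dport": 80, "payload": b"HTTP/1.1\r\n"},
    ]
    for key, pkts in group_by_connection(capture).items():
        print(key, len(pkts))
```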

  • I also have some packets captured from an application on my computer through WireShark, they're labeled as TCP protocol, what I want to do is reassemble them back into the original data, but how can you tell which packets belong to which set of data? (Edited above question) – Edgepad Aug 21 '13 at 17:30
  • "I also have some packets captured from an application on my computer through WireShark, they're labeled as TCP protocol" That means that Wireshark doesn't understand the protocol. Without understanding the protocol, you can't tell what any of the contents of the packets mean, or whether they're transferring a file *at all*, much less, if they're transferring a file or another such chunk of data, which data in the packets belongs to which set of data. –  Aug 21 '13 at 17:49
  • If that is the case, then why is the data semi readable in Wireshark (in ASCII) and what exactly is the "follow TCP stream" option doing when you right click a packet in Wireshark? – Edgepad Aug 21 '13 at 20:09
  • "then why is the data semi readable in Wireshark (in ASCII)" Because that doesn't require any understanding - and note that you said "*semi*-readable"; the hex/ASCII dump pane has a whole bunch of other stuff in it, which is *NOT* part of the data. –  Aug 21 '13 at 21:58
  • "and what exactly is the "follow TCP stream" option doing when you right click a packet in Wireshark?" Putting raw bytes, as transferred over TCP, into the window, whether those bytes are data or message headers. –  Aug 21 '13 at 22:00
  • what about udp packets? – Yi Lin Liu Apr 22 '19 at 19:52
  • Is this data division like the one made in HTML by tags? – Jhon Oliver Aug 23 '20 at 21:58
1

My question is how exactly are files/data broken up into packets and how are they reassembled at the other end?

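They are basically just chopped up. Each internet packet (with header info added) can only hold a limited amount of actual data, typically up to about 1,460 bytes of TCP payload on Ethernet with its 1,500-byte maximum packet size.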

How does whatever doing the reassembling know when the last packet of the data has arrived and etc?

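For a transfer, the data is numbered (each TCP segment carries a sequence number giving the position of its first byte in the stream), so the receiving side knows how to put the pieces together in order. If a packet is lost, the receiver never acknowledges it, and the sender retransmits it.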
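A minimal sketch of that reordering, assuming each captured segment has been reduced to a `(sequence_number, payload)` pair for one direction of one connection. The `reassemble` name is made up, and the sketch ignores sequence-number wraparound, which real TCP has to handle.

```python
def reassemble(segments, initial_seq=0):
    """Rebuild the byte stream from (seq, payload) pairs.

    Segments may arrive out of order or more than once (retransmissions).
    Reassembly stops at the first gap, since the bytes after a missing
    segment cannot be placed yet.
    """
    stream = bytearray()
    next_seq = initial_seq  # sequence number of the next byte we expect
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if seq > next_seq:
            break  # a segment is missing before this one
        overlap = next_seq - seq  # bytes of this payload we already have
        if overlap < len(payload):
            stream += payload[overlap:]
            next_seq = seq + len(payload)
    return bytes(stream)


if __name__ == "__main__":
    # Out of order, with one duplicate.
    captured = [(3, b"def"), (0, b"abc"), (3, b"def"), (6, b"gh")]
    print(reassemble(captured))
```

With the sample data this reconstructs `b"abcdefgh"`; if the segment at offset 3 were missing from the capture, only `b"abc"` could be recovered.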

Is it possible to reassemble packets captured from another machine? And if so how?

I don't understand the question. How would you get these packets unless you were a man-in-the-middle?

These answers are true for TCP packets.

Jiminion
  • For the last question I meant if I captured the packets using something like libpcap would it be possible to reassemble them into the original file? Is there a uniform process for all TCP packets to do this? – Edgepad Aug 20 '13 at 21:31
  • Yes, that would be possible. As long as you captured them all. – Jiminion Aug 21 '13 at 03:02
0

First, determine the packet size you want to transmit.

Then put a header, the data, and a footer in each transmission.

Make sure the buffer length (the size of the data array) divides evenly by the number of packets, with no remainder.

Here the header should contain the frame number, timestamp, and packet number,

followed by the payload data,

and then the footer (for example, your company information).

Prepare the data fragments before sending.
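The scheme described in this answer is an application-level framing format, not something TCP does for you. A rough sketch of it could look like the following; the field sizes, the `FOOTER` contents and the function names are all invented for illustration.

```python
import struct
import time

# Hypothetical header layout: frame number (uint32), timestamp (float64),
# packet number (uint32), all in network byte order.
HEADER = struct.Struct("!IdI")
FOOTER = b"EXAMPLE-CO"  # placeholder for the "company information" footer


def make_packets(data, chunk_size, frame_number=0, timestamp=None):
    """Prepare all fragments of data before sending, each wrapped in a
    header and a footer."""
    if timestamp is None:
        timestamp = time.time()
    packets = []
    for number, offset in enumerate(range(0, len(data), chunk_size)):
        payload = data[offset:offset + chunk_size]
        packets.append(HEADER.pack(frame_number, timestamp, number) + payload + FOOTER)
    return packets


def parse_packet(packet):
    """Split one packet back into its header fields and payload."""
    frame_number, timestamp, number = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:len(packet) - len(FOOTER)]
    return frame_number, timestamp, number, payload
```

The receiver can sort the parsed packets by packet number and concatenate the payloads to recover the original buffer.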