I'm debugging my application (sort of a follow up to an earlier question), which is essentially a toy peer to peer client. It works as follows:

  • Peer 1 requests a block (or several blocks) from Peer 2
  • Peer 2 receives the request, and sends the blocks back

And the cycle more or less repeats. This works great for smaller files, but with any file that has to be split into a larger number of chunks (say 250 chunks of 512 bytes) it dies.

Running strace on Peer 2 (the one that receives the requests) looks like so:

....    
[pid 11731] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f03d0ac9000
[pid 11731] lseek(400, 200704, SEEK_SET) = 200704
[pid 11731] read(400, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024
[pid 11731] read(400, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
[pid 11731] sendto(5, "SF\0\0\1\212\0\0\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 522, 0, NULL, 0) = 522
[pid 11731] select(6, [4 5], NULL, NULL, NULL) = 1 (in [5])
[pid 11731] recvfrom(5, "BB\0\0\0\t\0\0\1\213\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, NULL) = 300
[pid 11731] open("test.dat", O_RDONLY)  = 401
[pid 11731] fstat(401, {st_dev=makedev(8, 4), st_ino=9187328, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=20480, st_size=10485760, st_atime=2012/10/03-10:25:29, st_mtime=2012/10/03-10:25:34, st_ctime=2012/10/03-10:25:34}) = 0
[pid 11731] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f03d0ac8000
[pid 11731] lseek(401, 200704, SEEK_SET) = 200704
[pid 11731] read(401, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1536) = 1536
[pid 11731] read(401, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
[pid 11731] sendto(5, "SF\0\0\1\213\0\0\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 522, 0, NULL, 

And the results of strace on Peer 1 (the one that sends the requests) looks like so:

....
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\213\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0) = 300
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\214\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0) = 300
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\215\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0) = 300
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\216\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0) = 300
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\217\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0) = 300
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\220\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0) = 300
[pid 11741] sendto(5, "BB\0\0\0\t\0\0\5\221\0\0\2\0test.dat\0\0\0\0\0\0\0\0\0\0"..., 300, 0, NULL, 0

Both die when doing sends. I'm not entirely sure why. If anyone can shed some light on this I'd really appreciate it!

the_man_slim

1 Answer

Your sends are blocking because the peer isn't reading them. That causes the peer's receive buffer to fill, which causes the sender's send buffer to fill, which causes send() to block.
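
The buffer-filling effect is easy to reproduce in isolation. The following is a minimal sketch, not the asker's code: it assumes a local `socket.socketpair()` as a stand-in for the real peer connection, and the helper name `bytes_until_send_would_block` is made up for illustration. One end keeps sending 522-byte chunks (the reply size seen in the traces) while the other end never reads. The sender is put in non-blocking mode so that a full buffer raises an error instead of hanging the way the traced `sendto()` calls do.

```python
import socket


def bytes_until_send_would_block(chunk_size=522):
    """Return how many bytes one end accepts while the other end never reads."""
    # socketpair() yields two connected stream sockets in a single process.
    sender, idle_peer = socket.socketpair()
    try:
        # Non-blocking mode turns "send() would block" into BlockingIOError.
        sender.setblocking(False)
        chunk = b"\0" * chunk_size
        sent = 0
        while True:
            try:
                sent += sender.send(chunk)
            except BlockingIOError:
                # Both the peer's receive buffer and our send buffer are full.
                return sent
    finally:
        sender.close()
        idle_peer.close()


if __name__ == "__main__":
    print(bytes_until_send_would_block())
```

The exact number depends on the operating system and its buffer settings, but it is always finite. With a blocking socket, the next `send()` after that point simply does not return until the peer reads something.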

user207421
  • That makes sense... is there an easy way to flush the send buffer or am I better off just restructuring things? – the_man_slim Oct 04 '12 at 01:06
  • @the_man_slim You are better off reading what the peer is sending to you, or else not sending it at all, and indeed not requesting it at all if you aren't going to read it. There is only one `recvfrom()` in all those traces. There should be about the same number as there are `sendto()`s. – user207421 Oct 04 '12 at 01:43
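
The advice in the comments (read roughly one reply for every request sent) could be sketched along these lines. This is only an illustration under stated assumptions, not the original client: it assumes a connected stream socket, fixed-size 300-byte requests and 522-byte replies as the traces suggest, and the function name `fetch_blocks` is hypothetical. It uses `select()` so that replies are drained whenever they are available, while requests are only written when the socket can accept them.

```python
import select
import socket

REQUEST_SIZE = 300  # request size seen in the traces (assumed fixed)
REPLY_SIZE = 522    # reply size seen in the traces (assumed fixed)


def fetch_blocks(sock, requests):
    """Send all requests on sock while reading replies as they arrive."""
    sock.setblocking(False)
    outgoing = b"".join(requests)
    expected = len(requests) * REPLY_SIZE
    received = bytearray()
    while len(received) < expected:
        # Only ask about writability while there is still something to send.
        want_write = [sock] if outgoing else []
        readable, writable, _ = select.select([sock], want_write, [])
        if readable:
            try:
                data = sock.recv(65536)
            except BlockingIOError:
                data = None  # spurious wakeup, try again on the next pass
            if data == b"":
                raise ConnectionError("peer closed before all replies arrived")
            if data:
                received += data
        if writable:
            try:
                n = sock.send(outgoing)
                outgoing = outgoing[n:]
            except BlockingIOError:
                pass  # buffer filled up again, wait for the next select()
    # Split the byte stream back into fixed-size replies.
    return [bytes(received[i:i + REPLY_SIZE])
            for i in range(0, expected, REPLY_SIZE)]
```

Because the requester never stops reading, the serving peer's replies keep moving, its receive buffer keeps draining, and neither side ends up stuck in a send. A simpler alternative is to cap the number of outstanding requests and read one reply before sending the next request.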