Right now, I read one character at a time in a loop, until I reach the \0
character. Is there a better way to do this?

5 Answers
Set your line ending to \x{00} (\0), be sure to localise it, and call getline on the handle, like so:
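{
    local $/ = "\x{00}";    # input record separator is NUL inside this block only
    while (defined(my $line = $sock->getline)) {
        chomp $line;        # chomp removes the trailing $/, here the "\0"
        print "$line\n";    # do whatever with your data here
    }
}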
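To try the idea without a live connection, here is a minimal sketch that reads NUL-terminated records from an in-memory filehandle standing in for the socket. The helper name read_nul_records and the sample data are made up for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: return every NUL-terminated record from a handle.
sub read_nul_records {
    my ($fh) = @_;
    local $/ = "\0";                # NUL is the record separator within this sub only
    my @records;
    while (defined(my $rec = $fh->getline)) {
        chomp $rec;                 # chomp strips a trailing $/, i.e. the "\0"
        push @records, $rec;
    }
    return @records;
}

# Assumption: an in-memory handle replaces the socket so this runs offline.
my $wire = "first\0second\0";
open my $fh, '<', \$wire or die "open: $!";
my @records = read_nul_records($fh);
print join(",", @records), "\n";    # prints "first,second"
```

Because $/ is localised inside the helper, other code that reads newline-terminated lines is unaffected.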

You could use FIONREAD with ioctl. The program below connects to the SSH server on localhost and waits on its greeting:
#! /usr/bin/perl
use warnings;
use strict;
use subs 'FIONREAD';
require "sys/ioctl.ph";    # generated by h2ph; supplies FIONREAD
use Socket;
socket my $s, PF_INET, SOCK_STREAM, getprotobyname "tcp"
    or die "$0: socket: $!";
connect $s, sockaddr_in 22, inet_aton "localhost"
    or die "$0: connect: $!";
# Block in select until the socket becomes readable.
my $rin = "";
vec($rin, fileno($s), 1) = 1;
my $nfound = select my $rout = $rin, "", "", undef;
die "$0: select: $!" if $nfound < 0;
if ($nfound) {
    # FIONREAD reports how many bytes are waiting to be read.
    my $size = pack "L", 0;
    ioctl $s, FIONREAD, $size
        or die "$0: ioctl: $!";
    print unpack("L", $size), "\n";
    sysread $s, my $buf, unpack "L", $size
        or die "$0: sysread: $!";
    my $length = length $buf;
    $buf =~ s/\r/\\r/g;
    $buf =~ s/\n/\\n/g;
    print "got: [$buf], length=$length\n";
}
Sample run:
$ ./howmuch
39
got: [SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu4\r\n], length=39
But you'll probably prefer using the IO::Socket::INET and IO::Select modules as in the code below that talks to Google:
#! /usr/bin/perl
use warnings;
use strict;
use subs "FIONREAD";
require "sys/ioctl.ph";    # generated by h2ph; supplies FIONREAD
use IO::Select;
use IO::Socket::INET;
my $s = IO::Socket::INET->new(PeerAddr => "google.com:80")
    or die "$0: can't connect: $@";
my $CRLF = "\015\012";
print $s "HEAD / HTTP/1.0$CRLF$CRLF" or warn "$0: print: $!";
# Wait until the socket is readable, then ask how many bytes are pending.
my @ready = IO::Select->new($s)->can_read;
die "$0: umm..." unless $s == $ready[0];
my $size = pack "L", 0;
ioctl $s, FIONREAD, $size
    or die "$0: ioctl: $!";
print unpack("L", $size), "\n";
sysread $s, my $buf, unpack "L", $size
    or die "$0: sysread: $!";
my $length = length $buf;
$buf =~ s/\r/\\r/g;
$buf =~ s/\n/\\n/g;
print "got: [$buf], length=$length\n";
Output:
573
got: [HTTP/1.0 200 OK\r\nDate: Sun, 18 Jul 2010 12:03:48 GMT\r\nExpires: -1\r\nCache-Control: private, max-age=0\r\nContent-Type: text/html; charset=ISO-8859-1\r\nSet-Cookie: PREF=ID=6742ab80dd810a95:TM=1279454628:LM=1279454628:S=ewNg64020FbnGzHR; expires=Tue, 17-Jul-2012 12:03:48 GMT; path=/; domain=.google.com\r\nSet-Cookie: NID=36=kn2wtTD4UJ3MYYQ5uvA4iAsrS2wcrb_W781pZ1hrVUhUDHrIJTMg_kOgVKhjQnO5SM6MdC_jrRdxFRyXwyyv5N3Xja1ydhVLWWaYqpMHQOmGVi2K5qRWAKwDhCVRd8WS; expires=Mon, 17-Jan-2011 12:03:48 GMT; path=/; domain=.google.com; HttpOnly\r\nServer: gws\r\nX-XSS-Protection: 1; mode=block\r\n\r\n], length=573

`sysread` returns as soon as there is any data available, so we can skip `FIONREAD` and just call `sysread` with a large size. – Sam Watkins Sep 22 '16 at 07:32
What is the best way to receive data from a socket in Perl, when the data length is unknown?
A sound solution to this is impossible, in any language. If you don't know the length of the data, then you can't possibly know when you've finished receiving all of it from the socket.
Your only hope is to use some kind of metric to decide whether it has been "long enough" since data last arrived, and to treat that as the end of the data flow. But it won't be perfect.
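One way to express such a "long enough" heuristic is an idle timeout: keep reading while data arrives, and stop once nothing shows up for a chosen interval. The sketch below is only an illustration. The helper name read_until_idle, the 0.25 second threshold, and the local socket pair that stands in for a real connection are all assumptions, and socketpair with AF_UNIX assumes a Unix-like system.

```perl
use strict;
use warnings;
use IO::Select;
use Socket;

# Hypothetical helper: collect data until the peer closes the connection
# or nothing arrives for $idle seconds (an arbitrary, assumed threshold).
sub read_until_idle {
    my ($fh, $idle) = @_;
    my $sel  = IO::Select->new($fh);
    my $data = '';
    while (my @ready = $sel->can_read($idle)) {
        my $n = sysread $fh, my $chunk, 65536;
        die "sysread: $!" unless defined $n;
        last if $n == 0;            # EOF: the peer closed the connection
        $data .= $chunk;
    }
    return $data;
}

# Assumption: a local socket pair replaces the network peer.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite $writer, "partial payload" or die "syswrite: $!";
my $got = read_until_idle($reader, 0.25);
print "got: [$got]\n";              # prints "got: [partial payload]"
```

As the answer says, this is imperfect: a slow sender can pause longer than the threshold, and a larger threshold delays every response by that much.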

As long as you know *for sure* that character can't be sent as part of your data stream, then it functions like an End-Of-Data marker, in which case you don't need to know the data length. In which case your solution is valid. – Shaggy Frog Jul 18 '10 at 09:20
The answer depends on the protocol. Since your protocol uses '\0' as a separator, you're doing the right thing. I'm pretty sure Perl handles buffering for you, so reading one character at a time is not inefficient.
Many network oriented protocols precede strings with a length. To read a protocol like this, you read the length (usually one or two bytes, depending on the protocol spec), then read that many bytes into a string.
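A length-prefixed read can be sketched as follows. The framing here (a 2-byte big-endian length followed by the payload) is an assumption chosen for illustration, as are the helper names read_exact and read_message. A real protocol's spec dictates the actual prefix size and byte order. An in-memory handle stands in for the socket.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $len bytes, or die if the stream ends early.
sub read_exact {
    my ($fh, $len) = @_;
    my $buf = '';
    while (length($buf) < $len) {
        # Append at the current end of $buf; read may return fewer bytes than asked.
        my $n = read $fh, $buf, $len - length($buf), length($buf);
        die "read: $!" unless defined $n;
        die "unexpected EOF" if $n == 0;
    }
    return $buf;
}

# Assumed framing: 2-byte big-endian length, then that many payload bytes.
sub read_message {
    my ($fh) = @_;
    my $len = unpack "n", read_exact($fh, 2);
    return read_exact($fh, $len);
}

# Two frames built with pack's length-prefix template, read back in order.
my $wire = pack("n/a*", "hello") . pack("n/a*", "world!");
open my $fh, '<:raw', \$wire or die "open: $!";
print read_message($fh), "\n";      # prints "hello"
print read_message($fh), "\n";      # prints "world!"
```

The loop in read_exact matters on a real socket, where a single read can return before the whole frame has arrived.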

PerlIO certainly does handle buffering, so 1-char reads don't incur a *syscall* overhead, but they still waste time in the Perl op loop (not to mention the number of string concatenations that might be happening, depending on the code). Not to micro-optimize, but the `$/` + `getline` approach is far more efficient and abundantly clear, so it wins :) – hobbs Jul 18 '10 at 10:34
You can use sysread to read whatever data is available:
my $data;
my $max_length = 1000000;
sysread $sock, $data, $max_length;
Perl's read function waits for the full number of bytes that you requested, or EOF. This is similar to libc stdio fread(3).
Perl's sysread function returns as soon as it receives any data. This is similar to UNIX read(2).
Note that sysread bypasses buffered IO, so don't mix it with the buffered read.
Check perldoc -f read and perldoc -f sysread for more info.
For this specific question, it would be better to follow the top answer, and use getline with a line-ending of \0, but we can use sysread if there is no terminating character.
Here's a little example. It requests a web page, and prints the first chunk of data received.
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;
my $host = $ARGV[0] || 'google.com';
my $port = $ARGV[1] || 80;
# IO::Socket::INET reports constructor failures in $@
my $sock = IO::Socket::INET->new(Proto => 'tcp', PeerAddr => $host, PeerPort => $port)
    or die "connect failed: $@";
$sock->autoflush(1);
# use HTTP/1.1, which keeps the socket open by default
$sock->print("GET / HTTP/1.1\r\nHost: $host\r\n\r\n");
my $reply;
my $max_length = 1000000;
# $sock->read($reply, $max_length); # read would hang waiting for 1000000 bytes
my $count = $sock->sysread($reply, $max_length);
if (!defined $count) {
    die "read failed: $!";
}
print $reply;
