
I'm trying to use Perl to parse some pseudo-files from the Linux /proc and /sys pseudo-filesystems (procfs and sysfs). Such files are unlike regular files: they are implemented by custom file-operation handlers. Most of them report zero size to stat, some can't be opened for reading, and others can't be written. Some are implemented incorrectly (which is a bug, but one that is already in the kernel), and I still want to read them directly from Perl without starting helper tools.

Here is a quick example of reading /proc/loadavg with Perl; this file is implemented correctly:

perl -e 'open F,"</proc/loadavg"; $_=<F>; print '

By running the command under strace I can check how Perl implements the open function:

$ strace perl -e 'open F,"</proc/loadavg"; $_=<F>; print ' 2>&1 | egrep -A5 ^open.*loadavg

open("/proc/loadavg", O_RDONLY)         = 3
ioctl(...something strange...)    = -1 ENOTTY
lseek(3, 0, SEEK_CUR)                   = 0
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0

Perl's open function issues an lseek system call.

An strace of cat /proc/loadavg shows no extra seek-type system calls:

$ strace cat /proc/loadavg 2>&1 | egrep -A2 ^open.*loadavg
open("/proc/loadavg", O_RDONLY)         = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0

The special file I want to read (or write) misimplements the seek file operation and will not give any useful data to a read (or write) syscall after a seek.

Is there a way to open files for reading in Perl 5 (no external modules) without the extra lseek call (and without using system("cat < /proc/loadavg"))?

Is there a way to open files for writing in Perl 5 without the extra lseek call?

There is sysopen, but it does the extra lseek too: perl -e 'use Fcntl;sysopen(F,"/proc/loadavg",O_RDONLY);sysread(F,$_,2048); print '

osgx
  • On the linux box on which I tried it, `open F, "<:unix", "/proc/loadavg"` does just `fstat` and set close on exec – ikegami Nov 02 '16 at 06:03
  • @ikegami, What kind of black magic (`<:unix`) is used in your answer (doc or source link may be useful)? It didn't seek on my test linux too. – osgx Nov 02 '16 at 06:25
  • As mentioned in the documentation for [`open`](http://perldoc.perl.org/functions/open.html), PerlIO layers are documented in [PerlIO](http://perldoc.perl.org/PerlIO.html). – ikegami Nov 02 '16 at 07:06
  • Check out `perl -MPerlIO -E'open F, "<", $ARGV[0] or die $!; say for PerlIO::get_layers(\*F);' ~/.bashrc`, then try again specifying `:unix`. – ikegami Nov 02 '16 at 07:09
  • @ikegami, there is still `fstat` with `my $SET,">:unix",".../driver_command; print $SET $COMMAND."\n"; close $SET`; can I disable it? My script is still not able to communicate with the driver, but plain `echo SAME_COMMAND > .../driver_command` works; I see only the extra fstat here. – osgx Nov 14 '16 at 02:25

2 Answers


As you've noticed, Perl's builtin open hides quite a bit of magic. If that magic gets in your way, there are sysopen and POSIX::open(), which offer decreasing degrees of magic. POSIX::open() is sufficiently non-magical that it returns file descriptors rather than Perl filehandles, and you have to use POSIX::read() instead of the normal Perl operators to get data from it. If that's not raw enough for your circumstances, you may be out of luck.

The POSIX module has been part of the core Perl distribution since the very first release of Perl 5, so if you don't have it, your Perl installation is crippled.
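A minimal sketch of the POSIX route, covering both reading and writing. It assumes Linux and only the core POSIX module; `slurp_fd` and `spew_fd` are hypothetical helper names chosen for illustration, not functions provided by POSIX. Because POSIX::read() hands back a raw chunk rather than lines, the sketch collects the data and splits it into lines afterwards.

```perl
use strict;
use warnings;
use POSIX ();

# Read everything from $path through a raw file descriptor.
sub slurp_fd {
    my ($path) = @_;
    my $fd = POSIX::open($path, POSIX::O_RDONLY());
    defined $fd or die "open $path: $!";
    my ($data, $chunk) = ('', '');
    while (1) {
        my $n = POSIX::read($fd, $chunk, 4096);
        defined $n or die "read $path: $!";
        last if $n == 0;                    # zero bytes read means EOF
        $data .= substr($chunk, 0, $n);     # keep only the bytes actually read
    }
    POSIX::close($fd);
    return $data;
}

# Write $string to $path with one POSIX::write() call.
# $flags defaults to O_WRONLY, which suits an existing pseudo-file.
sub spew_fd {
    my ($path, $string, $flags) = @_;
    $flags = POSIX::O_WRONLY() unless defined $flags;
    my $fd = POSIX::open($path, $flags, 0644);
    defined $fd or die "open $path: $!";
    my $n = POSIX::write($fd, $string, length $string);
    defined $n or die "write $path: $!";
    POSIX::close($fd);
    return $n;                              # number of bytes written
}

# Usage: print each line of /proc/loadavg, if it exists on this system.
if (-e '/proc/loadavg') {
    print for split /^/, slurp_fd('/proc/loadavg');
}
```

The `split /^/` keeps the trailing newline on each line, so a `for` loop over the result behaves much like `while (<FILE>)` would.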

Calle Dybedahl
  • Calle, please, add some links to documentation. ikegami recommended `<:unix` black magic option for open, can you add it as other alternative solution to your answer? – osgx Nov 03 '16 at 10:39
  • Calle, thanks, POSIX::open + read works without any extra syscalls: `perl -e 'use POSIX (); $fd=POSIX::open("/proc/loadavg");$bytes = POSIX::read( $fd, $buf, 30 ); POSIX::close($fd);print $bytes." read: ".$buf;'` with result `27 read: 0.16 0.09 0.02 ....` – osgx Nov 24 '16 at 20:40
  • There is '\0' special char in end of the read. How can I parse multiline from POSIX::read, there is no `while()` now. – osgx Nov 24 '16 at 20:57

If you want to go really low-level and avoid even the mmap from POSIX::open() (and also avoid loading the huge POSIX module), perform the syscall()s yourself. You will probably want to require syscall.ph to get the values of SYS_open and SYS_read if you don't know them (for me, read, write, and open are 0, 1, and 2 respectively; these are the numbers passed to the syscall function below).

The following code:

strace perl -mPOSIX -e'$fd=POSIX::open("/proc/loadavg");POSIX::read($fd, $_, 9999);' 2>&1 | egrep -A2 '^open.*loadavg'

gives something like this (for me open() is openat()):

openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 3
mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2389f0d000
read(3, "1.22 2.51 1.54 3/206 18145\n", 9999) = 27

Try something like this:

strace perl -MFcntl -E'$p="/proc/loadavg"; $fd=syscall 2, $p, O_RDONLY; $bf = "\0"x50; syscall 0, $fd, $bf, 50' 2>&1 | egrep -A1 '^open.*loadavg'

and get:

open("/proc/loadavg", O_RDONLY) = 3
read(3, "0.45 0.18 0.20 2/241 12349\n", 50) = 27

EDIT:
Regarding the comment from @osgx,

There is \0 special char in end of the read. How can I parse multiline from POSIX::read, there is no while(<FILE>) now.

Note that 'while(<FILE>)' is itself just a series of read() calls plus a scan for the '\n' char, or whatever your $/ (input record separator) is set to. With Perl's default buffered layer the reads are done a block at a time rather than one byte at a time (you can confirm this via strace).

Thus, a simple loop that checks for $/ could suffice. (Note that read() returns, on success, the number of bytes read; 0 indicates EOF.) Here is a crude example of a single "readline":

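require 'syscall.ph';        # defines SYS_open, SYS_read, SYS_close (generated by h2ph)
use Fcntl;                   # imports O_RDONLY; a bare `require Fcntl` would not
my($path, $fd, $buf, $res);
$path = '/proc/meminfo';
$fd = syscall(SYS_open(), $path, O_RDONLY);
die "open $path: $!" if $fd < 0;
$buf = ' ';                  # must be pre-sized: the kernel writes into this buffer
$res = '';
# read one byte at a time until $/, EOF (0) or an error (-1)
$res .= $buf while syscall(SYS_read(), $fd, $buf, 1) > 0 and $buf ne $/;
syscall(SYS_close(), $fd);   # optional in this case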

Just be aware that if you are going for portability of any sort, syscall-ing is probably one of the worst choices, but that's the price of specificity. (POSIX::open/read/close() isn't much better in this sense either.) To maintain portability, you might be better off using @ikegami's suggestion (<:unix) and ignoring the extra calls to fstat and fcntl.
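For the portable route, here is a minimal sketch that combines the `:unix` layer from @ikegami's comment with sysread, which bypasses Perl's buffering. It assumes, as that comment reports, that opening with `:unix` does not issue an lseek on your build (check with strace); `read_lines_unix` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the lines of $path, read through an unbuffered :unix handle.
sub read_lines_unix {
    my ($path) = @_;
    open(my $fh, '<:unix', $path) or die "open $path: $!";
    my $data = '';
    while (1) {
        # Append up to 4096 bytes at the current end of $data.
        my $n = sysread($fh, $data, 4096, length $data);
        defined $n or die "read $path: $!";
        last unless $n;                 # 0 means EOF
    }
    close $fh;
    return split /^/, $data;            # lines keep their trailing newline
}

# Usage: print each line of /proc/loadavg, if it exists on this system.
if (-e '/proc/loadavg') {
    print for read_lines_unix('/proc/loadavg');
}
```

Unlike the syscall version, this still gives you an ordinary filehandle, so error reporting through `$!` and `close` work as usual.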

YenForYang
  • "read, write, and open are 0,1,2" Are they always 0, 1, and 2 on any Linux platform? – osgx May 15 '18 at 13:49
  • Nope; using syscalls directly isn't the most portable. Try `require 'syscall.ph'` to find out what the `SYS_read`, `SYS_write`, `SYS_open` constants are for you. Sidenote: I didn't mention anything about closing the file descriptor. You _can_ use `POSIX::close()`, but you could also use `syscall(SYS_close,$fd)` (which is `syscall(3,$fd)` for me). – YenForYang May 16 '18 at 02:19