
I wrote a small program in Perl that parses authorization files containing public keys created by OpenSSH's ssh-keygen.

  • Each valid line in the file contains one public key.
  • Empty lines and lines starting with a ‘#’ are skipped.
  • Lines can be extremely long, up to 8kB.

Public keys consist of the following space-separated fields, in order:

  • options [optional] any number of comma-separated, case-insensitive option specifications. No spaces are permitted, except within double quotes.
  • keytype [required] keytype is an alphanumeric string that can contain dashes or underscores.
  • base64-encoded key [required] a string containing no spaces
  • comment [optional] a string which may contain spaces

Note, I validate the keytype and base64-encoded key outside the regex in the actual program, for easier future maintenance as key algorithms are added or removed from OpenSSH. The regex just parses the lines.
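
A minimal sketch of what such out-of-regex validation might look like (this is not the actual program's code). `%known_type` and `validate_key` are hypothetical names, the list of key types is an assumption that will vary with the OpenSSH version, and the base64 character check is deliberately loose. It uses the core MIME::Base64 module and relies on the decoded key blob starting with a length-prefixed copy of the key type, per the RFC 4253 string encoding.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64 encode_base64);

# Hypothetical whitelist; adjust to the key algorithms your OpenSSH accepts.
my %known_type = map { $_ => 1 } qw(
    ssh-rsa ssh-dss ssh-ed25519
    ecdsa-sha2-nistp256 ecdsa-sha2-nistp384 ecdsa-sha2-nistp521
);

# Returns an empty string if the pair looks sane, otherwise a reason.
sub validate_key {
    my ($type, $b64) = @_;
    return "unknown keytype '$type'" unless $known_type{$type};

    # Standard base64 alphabet, with '=' padding only at the end.
    return 'malformed base64' unless $b64 =~ m{\A[A-Za-z0-9+/]+={0,2}\z};

    # The blob begins with a 32-bit big-endian length and the keytype string.
    my $blob = decode_base64($b64);
    return 'key blob too short' if length($blob) < 4;
    my $len = unpack 'N', $blob;
    return 'key blob too short' if length($blob) < 4 + $len;
    my $embedded = substr $blob, 4, $len;
    return "blob says '$embedded', line says '$type'" unless $embedded eq $type;
    return '';
}

# Build a syntactically plausible (not cryptographically valid) ed25519 blob
# so the example does not depend on a real key.
my $blob = pack('N/a*', 'ssh-ed25519') . pack('N/a*', "\0" x 32);
my $b64  = encode_base64($blob, '');

print validate_key('ssh-ed25519', $b64) || 'ok', "\n";
print validate_key('ssh-rsa',     $b64) || 'ok', "\n";
print validate_key('ssh-foo',     $b64) || 'ok', "\n";
```

Keeping the whitelist in a hash means that adding or retiring an algorithm is a one-word change, and the regex that splits the line never has to be touched.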

Medievalist
  • Are the fields required to be in the shown order? How does one distinguish a single option from a keytype, as they are described in the question? Must the encoded key have `=` in it? – zdim Jan 16 '18 at 22:13
  • @zdim, yes, that is the required order. Keytypes and options vary depending on what version of OpenSSH you are running; for example older versions may not support the `ed25519` keytype or the `restrict` keyword. Base64 uses the `=` sign as padding, so it can only occur at the end of an encoded key, and does not always occur. – Medievalist Jan 17 '18 at 14:39
  • OK. Then I don't see how a program can tell whether the first word is a single option (no comma) or a keytype, or how many keytypes there are before an encoded key. By what you are saying they are no different; they may all be just alphanumeric strings. – zdim Jan 17 '18 at 18:12
  • @zdim, there can only be one keytype per line. But yes, you've found the conundrum; making the first field optional was a very strange design choice. (I'm kind of surprised at the downvotes, given how important OpenSSH is in the computing industry. I think it is an interesting problem!) – Medievalist Jan 18 '18 at 23:16
  • The fact that there can only be one keytype per line, and one out of a given set -- which I got from [PerlDuck's answer](https://stackoverflow.com/a/48309005/4653379) -- solves the parsing, as seen in @PerlDuck's answer. Without that constraint it would not be possible to parse it, which is why I kept asking. – zdim Jan 18 '18 at 23:38
  • I'd guess the downvotes (I didn't do it) are because the question merely asks, with no code shown, so it is off-topic by the site's letter and spirit. Your code posted as an answer is something else, while if you had included it in the question it would be more suitable for code review. It's a conundrum: this is an important topic but you do have working code. – zdim Jan 18 '18 at 23:42

2 Answers


The docs for OpenSSH say

The format of authorized_keys is described in the sshd(8) manual page.

The sshd(8) manual page says

AuthorizedKeysFile specifies the files containing public keys […] Each line of the file contains one key (empty lines and lines starting with a ‘#’ are ignored as comments). Public keys consist of the following space-separated fields: options, keytype, base64-encoded key, comment. The options field is optional. The keytype is “ecdsa-sha2-nistp256”, “ecdsa-sha2-nistp384”, “ecdsa-sha2-nistp521”, “ssh-ed25519”, “ssh-dss” or “ssh-rsa”; the comment field is not used for anything (but may be convenient for the user to identify the key). […] The options (if present) consist of comma-separated option specifications. No spaces are permitted, except within double quotes.

If I rely on this documentation and take the possible keytypes literally, then I get the following solution:

#!/usr/bin/env perl

use strict;
use warnings;

my $possible_types = join('|', qw(ecdsa-sha2-nistp256 
                                  ecdsa-sha2-nistp384 
                                  ecdsa-sha2-nistp521 
                                  ssh-ed25519 
                                  ssh-dss 
                                  ssh-rsa));

my $pattern = qr/^(?:(.*)\s+)?                 # optional options
                  ($possible_types) \s+ (\S+)  # mandatory type and key
                  (?:\s+(.*))?$/x;             # optional comment

while( <DATA> ) {
    if (/$pattern/) {
        my ($options, $type, $key, $comment) = ($1 // 'NONE', $2, $3, $4);
        print "options: '$options'\n";
        print "type:    '$type'\n";
        print "key:     '$key'\n";
        print "comment: '$comment'\n";
    } else {
        print "unrecognized line: $_";
    }
    print '-' x 30, "\n";
}

__DATA__
from="*.sales.example.net,!pc.sales.example.net" ssh-rsa AAAAB2...19Q== john@example.net
ssh-ed25519 AAA3Nzsdfsfsd...fdsXhsdfsfWqfw this is a comment
command="dump /home",no-pty,no-port-forwarding ssh-dss AAAAC3...51R== example.net
permitopen="192.0.2.1:80",permitopen="192.0.2.2:25" ssh-dss AAAAB5...21S==
tunnel="0",command="sh /etc/netstart tun0" ssh-rsa AAAA...== jane@example.net
restrict,command="uptime" ecdsa-sha2-nistp521 AAAA1C8...32Tv==
restrict,pty,command="nethack" ssh-rsa AAAA1f8...IrrC5== user@example.net

This works because it takes the six possible keytypes for granted and then searches around: The options (if any) must be the string before the keytype; the keytype is followed by the key (always), and optionally a comment follows.

I don't like this approach, though, because it has the possible keytypes hardcoded. For your input it prints:

options: 'from="*.sales.example.net,!pc.sales.example.net"'
type:    'ssh-rsa'
key:     'AAAAB2...19Q=='
comment: 'john@example.net'
------------------------------
options: 'NONE'
type:    'ssh-ed25519'
key:     'AAA3Nzsdfsfsd...fdsXhsdfsfWqfw'
comment: 'this is a comment'
------------------------------
options: 'command="dump /home",no-pty,no-port-forwarding'
type:    'ssh-dss'
key:     'AAAAC3...51R=='
comment: 'example.net'
------------------------------
options: 'permitopen="192.0.2.1:80",permitopen="192.0.2.2:25"'
type:    'ssh-dss'
key:     'AAAAB5...21S=='
comment: ''
------------------------------
options: 'tunnel="0",command="sh /etc/netstart tun0"'
type:    'ssh-rsa'
key:     'AAAA...=='
comment: 'jane@example.net'
------------------------------
options: 'restrict,command="uptime"'
type:    'ecdsa-sha2-nistp521'
key:     'AAAA1C8...32Tv=='
comment: ''
------------------------------
options: 'restrict,pty,command="nethack"'
type:    'ssh-rsa'
key:     'AAAA1f8...IrrC5=='
comment: 'user@example.net'
------------------------------
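
One caveat: the options group `(.*)` is greedy and not quote-aware, so a quoted option value that itself contains a keytype, on a line whose comment also contains one, can be split in the wrong place. A possible refinement, offered only as a sketch: match the options as a run of non-space, non-quote characters or complete double-quoted strings. The names `$options_re` and `parse_line` are hypothetical, and treating a backslash as an escape inside quotes is my assumption; the man page excerpt above does not spell it out.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $possible_types = join '|', qw(ecdsa-sha2-nistp256 ecdsa-sha2-nistp384
                                  ecdsa-sha2-nistp521 ssh-ed25519
                                  ssh-dss ssh-rsa);

# Options: unquoted non-space characters, or complete "..." strings in which
# a backslash escapes the next character (an assumption about quoting).
my $options_re = qr/(?: [^\s"] | " (?: [^"\\] | \\. )* " )+/x;

my $pattern = qr/^ (?: ($options_re) \s+ )?     # optional options
                   ($possible_types) \s+ (\S+)  # mandatory type and key
                   (?: \s+ (.*) )? $/x;         # optional comment

# Returns (options, type, key, comment), or an empty list if nothing matched.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ $pattern;
    return ($1, $2, $3, $4);
}

my $tricky = 'command="echo ssh-rsa aaa= a@b.com" ssh-dss bbb= my ssh-ed25519 ccc= x@y';
my ($options, $type, $key, $comment) = parse_line($tricky);
print "options: '", $options // 'NONE', "'\n";
print "type:    '$type'\n";
print "key:     '$key'\n";
print "comment: '", $comment // '', "'\n";
```

With this pattern the options field can no longer run past the closing quote, so the first keytype outside the quotes (`ssh-dss` here) is the one that gets picked.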
PerlDuck
  • The man page is mostly right, although it's version specific. It does fail to mention that the comment can have unquoted spaces in it, or be missing entirely, but your code handles that just fine. I like the way you built the regex up by joining an array, that's a really good maintainable idiom. A junior programmer (or a unix greybeard in a hurry) would find this very easy to update when the permissible encryption types inevitably change in the future. – Medievalist Jan 17 '18 at 22:33
  • Hum, I wonder what happens if someone tries `command="echo ssh-rsa aaa= a@b.com" ssh-dss bbb= my ssh-ed25519 ccc= x@y` – MestreLion Jun 03 '23 at 02:59

This is what I've come up with, and it works on my test data.

  # Note: the binding operator must be =~ so the captures are returned in list context.
  if (my ($koptions, $ktype, $kbase64, $kcomment) = $_ =~
       /^(?:((?:[!-~]|\s(?=.*"))+)\s+)? # optional key options
         ([a-z0-9_-]+)\s+               # key type followed by a space
         ([\.=a-z0-9\/+_-]+)            # RFC4253 base64 encoded key
         (?:\s+(.*))?$                  # optional comment
       /xxia) {             # ASCII, Case Insensitive, and exploded

    $koptions //= "NONE";
    $kcomment //= "NONE";
    print "\nkoptions are $koptions\n";
    print "ktype is $ktype\n";
    print "kbase64 is $kbase64\n";
    print "kcomment is $kcomment\n";
  } else {
    print "Incomprehensible line in ~/.ssh/authorized_keys!\n"
  }

Example file:

ssh-rsa AAAAB3Nza...LiPk== user@example.net
from="*.sales.example.net,!pc.sales.example.net" ssh-rsa AAAAB2...19Q== john@example.net
ssh-ed25519 AAA3Nzsdfsfsd...fdsXhsdfsfWqfw this is a comment
command="dump /home",no-pty,no-port-forwarding ssh-dss AAAAC3...51R== example.net
permitopen="192.0.2.1:80",permitopen="192.0.2.2:25" ssh-dss AAAAB5...21S==
tunnel="0",command="sh /etc/netstart tun0" ssh-rsa AAAA...== jane@example.net
restrict,command="uptime" ecdsa-sha2-nistp521 AAAA1C8...32Tv==
restrict,pty,command="nethack" ssh-rsa AAAA1f8...IrrC5== user@example.net
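
Once the options field has been captured, the individual option specifications can be pulled apart with a second, quote-aware pass. The following is only a sketch under stated assumptions: `split_options` and `$spec` are hypothetical names, a backslash inside quotes is assumed to escape the next character, and escapes are left in the value untouched.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# One option specification: a name, optionally followed by =value, where the
# value is either a complete "quoted string" or a run of unquoted characters.
my $spec = qr/ ([^\s",=]+) (?: = ( "(?:[^"\\]|\\.)*" | [^\s",]+ ) )? /x;

# Returns a list of [name, value] pairs (value is undef for bare flags),
# or an empty list if the field is not a well-formed comma-separated list.
sub split_options {
    my ($field) = @_;
    return unless $field =~ /\A $spec (?: , $spec )* \z/x;
    my @opts;
    while ($field =~ /$spec/g) {
        my ($name, $value) = (lc($1), $2);              # names are case-insensitive
        $value =~ s/\A"(.*)"\z/$1/s if defined $value;  # drop surrounding quotes only
        push @opts, [$name, $value];
    }
    return @opts;
}

my $koptions = 'from="*.sales.example.net,!pc.sales.example.net",no-pty,COMMAND="dump /home"';
for my $opt (split_options($koptions)) {
    my ($name, $value) = @$opt;
    print defined $value ? "$name = $value\n" : "$name (flag)\n";
}
```

Validating the whole field first and extracting afterwards keeps the extraction loop simple, at the cost of scanning the field twice.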

My apologies for the style, or lack thereof.

Does anyone have anything more elegant?

Medievalist
  • Thanks, I wasn't familiar with stackexchange codereview until you mentioned it! I did it here so that anyone searching for such a regex would easily find it. – Medievalist Jan 17 '18 at 14:42