Alphanumeric sort using perl

Question

@arr = qw(test1 test3 tes5 test2 test4 test8 test6 test7 test10 test9);

How can I sort this array to get output like:

test1 test2 test3 test4 test5 test6 test7 test8 test9 test10

instead of

test1 test10 test2 test3 test4 test5 test6 test7 test8 test9

Possible duplicate of [How to do alpha numeric sort perl?](http://stackoverflow.com/questions/11102518/how-to-do-alpha-numeric-sort-perl) — CDahn, Mar 19 '16 at 20:18

Zaid · Answer 1 · 2016-03-18T13:42:36.420

9

Why reinvent the wheel?

Just use Sort::Naturally:

use strict;
use warnings;
use feature 'say';
use Sort::Naturally 'nsort';

my @test = map 'test'.$_, reverse 1..10;
say for nsort @test;

prints

test1
test2
test3
test4
test5
test6
test7
test8
test9
test10
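
If installing a CPAN module is not an option, the general idea behind a natural sort can be approximated with core Perl only. The sketch below is not Sort::Naturally's algorithm; `natural_cmp` is a hypothetical helper that splits each string into digit and non-digit chunks and compares them pairwise, numerically when both chunks are digits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): compare two strings chunk
# by chunk, where a chunk is a run of digits or a run of non-digits.
sub natural_cmp {
    my ($x, $y) = @_;
    my @xs = $x =~ /(\d+|\D+)/g;
    my @ys = $y =~ /(\d+|\D+)/g;

    while (@xs && @ys) {
        my ($p, $q) = (shift @xs, shift @ys);
        my $r = ($p =~ /^\d/ && $q =~ /^\d/)
            ? ($p <=> $q || $p cmp $q)   # both numeric: by value, then by text
            : ($p cmp $q);               # otherwise compare as text
        return $r if $r;
    }
    return @xs <=> @ys;   # the string with chunks left over sorts last
}

my @arr    = qw(test1 test3 tes5 test2 test4 test8 test6 test7 test10 test9);
my @sorted = sort { natural_cmp($a, $b) } @arr;
print "$_\n" for @sorted;
```

With the question's data this puts `tes5` first (because `tes` sorts before `test`) and `test10` last.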

edited Mar 18 '16 at 13:42

answered Mar 18 '16 at 13:28

Zaid


This would have been a better answer if it didn't include "Why invent the wheel?" and replaced the word "Just" with "You can" – Mark Jul 13 '22 at 16:18

Zbynek Vyskovsky - kvr000 · Answer 2 · 2016-03-18T14:24:24.207

1

Split each string inside the sort comparator and compare the parts one after another. $ta stands for the text from $a, $na is the number, and $ra is the rest of the string (if there is any):

@result = sort({
        my ( $ta, $na, $ra ) = $a =~ m/^(.*?)(\d+)(.*)$/;
        my ( $tb, $nb, $rb ) = $b =~ m/^(.*?)(\d+)(.*)$/;
        return $ta cmp $tb || $na <=> $nb || $ra cmp $rb;
    }
    @arr
);
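
One thing to watch for: if an element contains no digits at all, the match fails and all three variables are undefined, which triggers warnings under `use warnings`. A minimal sketch of one way to guard against that, assuming digit-free strings should sort ahead of their numbered siblings (`split_parts` and the `-1` placeholder are illustrative choices, not part of the answer):

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into (text, number, rest).
# Strings without digits get the number -1, so "test" sorts before "test0".
sub split_parts {
    my ($s) = @_;
    my @parts = $s =~ m/^(.*?)(\d+)(.*)$/s;
    return @parts ? @parts : ($s, -1, '');
}

my @arr = qw(test10 test test2 test1);
my @result = sort {
    my ($ta, $na, $ra) = split_parts($a);
    my ($tb, $nb, $rb) = split_parts($b);
    $ta cmp $tb || $na <=> $nb || $ra cmp $rb;
} @arr;
print "@result\n";
```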

edited Mar 18 '16 at 14:24

answered Mar 18 '16 at 13:25

Zbynek Vyskovsky - kvr000


1

Your postfix if-statements are not needed if you use the idiomatic way to apply sorting comparisons: `$ta cmp $tb || $na <=> $nb || $ra cmp $rb` – TLP Mar 18 '16 at 14:16

fugu · Accepted Answer · 2016-03-18T14:08:11.170

1

Use a custom sort subroutine:

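use strict;
use warnings;
use feature 'say';

my @arr = qw(test1 test3 tes5 test2 test4 test8 test6 test7 test10 test9);

# Sort by the first number found in each element; the text prefix is ignored.
foreach ( sort { number_strip($a) <=> number_strip($b) } @arr ){
    say;
}

# Return the first run of digits in the given string.
sub number_strip {
    my $line = shift;
    my ($num) = $line =~ /(\d+)/;
    return $num;
}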

test1
test2
test3
test4
tes5
test6
test7
test8
test9
test10

edited Mar 18 '16 at 14:08

answered Mar 18 '16 at 13:29

fugu


1

`.*?` is completely redundant. – TLP Mar 18 '16 at 14:01
You can also do it like this: `say for sort { $a =~ s/\D//gr <=> $b =~ s/\D//rg } @arr;` – Sobrique Mar 18 '16 at 14:48
@Sobrique Except that this will not work for all strings, e.g. `test5foo1` would erroneously become `51`. – TLP Mar 18 '16 at 15:06
This will do a strange thing with a string like `necrophilia5`. Doesn't the content of the values suggest that they are, well, *test* data? – Borodin Mar 18 '16 at 16:48
Also a lot neater to `use List::UtilsBy qw( nsort_by );` so you can just call `nsort_by { number_strip($_) } @arr` – LeoNerd Mar 18 '16 at 17:03
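
The difference between the two sort keys discussed in these comments can be seen in a small sketch. `all_digits` and `first_number` are hypothetical names for the key from the one-liner and the key from `number_strip` respectively; the `/r` flag needs Perl 5.14 or later.

```perl
use strict;
use warnings;

# Key from the one-liner in the comments: delete every non-digit.
sub all_digits   { my ($s) = @_; return $s =~ s/\D//gr; }

# Key from number_strip in the answer: the first run of digits only.
sub first_number { my ($s) = @_; my ($n) = $s =~ /(\d+)/; return $n; }

for my $s (qw(test5foo1 test10)) {
    printf "%-10s all_digits=%s first_number=%s\n",
        $s, all_digits($s), first_number($s);
}
```

With the `all_digits` key, `test5foo1` is treated as 51 and lands after `test10`; with `first_number` it is treated as 5.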

sotona · Answer 4 · 2016-03-21T07:08:35.963

0

my @sorted = map { s/(^|\D)0+(\d)/$1$2/g; $_           } sort
             map { s/(\d+)/sprintf("%06.6d",$1)/ge; $_ } @arr;

will sort your array naturally. Also, note that there is no test5 element in your initial data, only tes5.

To explain, this can be divided into parts. Because the chained list operators are evaluated right to left, the first step is

my @sorted = map { s/(\d+)/sprintf("%06.6d",$1)/ge; $_ } @arr;

Here we pad each number with leading zeroes to 6 digits (the width is arbitrary; any width that covers the largest number in the data will do).

then we sort the array lexically

@sorted = sort @sorted;

and then remove the inserted zeroes:

@sorted = map { s/(^|\D)0+(\d)/$1$2/g; $_ } @sorted;

The main caveat: the last step strips leading zeroes whether or not the padding step added them, so an element like 'test04' comes back as 'test4'.
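
The Schwartzian transform suggested in the comments avoids that caveat, because the padded string is used only as a sort key and the original element is never modified. A minimal sketch, assuming Perl 5.14 or later for the `/r` flag and keeping the same arbitrary width of 6:

```perl
use strict;
use warnings;

my @arr = qw(test1 test3 tes5 test2 test04 test10 test9);

my @sorted =
    map  { $_->[0] }                                      # 3. unwrap the original string
    sort { $a->[1] cmp $b->[1] or $a->[0] cmp $b->[0] }   # 2. compare the padded keys
    map  { [ $_, s/(\d+)/sprintf("%06d", $1)/ger ] }      # 1. pair each string with a padded copy
    @arr;

print "$_\n" for @sorted;
```

Here `test04` keeps its leading zero in the output, since only the key copy is padded.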

edited Mar 21 '16 at 07:08

answered Mar 18 '16 at 13:50

sotona


how will it shorten the notation? – sotona Mar 18 '16 at 14:08
It's not about making it shorter, it's about using the right tools. – TLP Mar 18 '16 at 14:08
But why is `map` *righter* than `grep`? Isn't `grep` a part of Perl core? Where is the abuse of `grep`? – sotona Mar 18 '16 at 14:10
1

You can use a shovel to hammer in a nail, that does not mean that using a shovel is the right way to do it. `grep` is made to remove items from a list, or to verify some quality. You are using it to alter values, and you are explicitly adding a `1` to not make it filter out items. – TLP Mar 18 '16 at 14:11
The first part will remove zeroes regardless of whether you added them or not. You might consider using a Schwartzian transform instead. – TLP Mar 18 '16 at 14:13
Since this is a learning environment, teaching people the wrong thing, and not explaining what you are doing, is a bad thing to do. As such, if you do not update your answer, I will feel obligated to downvote it. – TLP Mar 18 '16 at 14:20
this will never work as it is after you edited my answer – sotona Mar 18 '16 at 16:23
It was either that or downvote you, and I opted to do this. You're right, it doesn't work, because `map` needs a return value, so you need to add `; $_`. That's a pitfall of correcting someone else's flawed logic. If you want, I can rollback the changes and downvote instead. – TLP Mar 18 '16 at 18:32
Well, if we are talking about the learning process, downvoting me and posting a more righteous variant would be more consistent than making edits with such 'pitfalls', because we're not the only participants in this thread, as you can see. – sotona Mar 21 '16 at 07:07

Borodin · Answer 5 · 2016-07-07T11:08:24.070

If you'd prefer to avoid the module, you can write a sort block that splits each string into an alpha part and a numeric part and compares them separately.

But this relies on the format always being AAA999, and I'd prefer Sort::Naturally anyway in case the format changes in the future.

Note that your sample data has a spurious tes5, which I assume was a typo but still qualifies as test data. It is sorted before all the other values because tes sorts before test lexically.

use strict;
use warnings 'all';
use feature 'say';

my @arr = qw(test1 test3 tes5 test2 test4 test8 test6 test7 test10 test9 );

say for sort {
    my ($aa, $bb) = map [ /([a-z]+)(\d+)/i ], $a, $b;
    $aa->[0] cmp $bb->[0] or $aa->[1] <=> $bb->[1];
} @arr;

output

tes5
test1
test2
test3
test4
test6
test7
test8
test9
test10
