Sort a file based on key value pair at a different position in each line

Question

I have a text file that has a bunch of key value pairs. The key value pairs are not in the same order in each line, and only my sequence key is guaranteed to be in each line.

How can I sort the file in Linux based on the value of one key? For example:

key1=blah key2=something key4=else sequence=3
sequence=1 key2=xlde key7=eldl
blahkey=xxx sequence=2 keyx=adada

I need to sort the file based on the 'sequence' key. I.e.

sequence=1 key2=xlde key7=eldl
blahkey=xxx sequence=2 keyx=adada
key1=blah key2=something key4=else sequence=3

Thanks

score 1 · Accepted Answer · answered Jul 02 '13 at 07:57

If the sequence key is guaranteed not only to be in each line, but also to be unique, with a value that never exceeds the line count (as in your example), you could do the following:

1. Allocate an array with as many cells as there are lines.
2. For every line:

2.1 Retrieve the sequence number as text by slicing the line between "sequence=" and the next space.

2.2 Turn that text into a zero-based index.

2.3 Put the line into the corresponding cell of the new array.

In Python it would be like this:

lines = [
"key1=blah key2=something key4=else sequence=3",
"sequence=1 key2=xlde key7=eldl",
"blahkey=xxx sequence=2 keyx=adada"
]

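# One slot per input line; assumes the sequence values are exactly 1..len(lines).
new_lines = [""] * len(lines)

for line in lines:
    # Text after "sequence=", up to the next space (or the end of the line).
    after_sequence = line.split("sequence=")[1]
    and_before_space = after_sequence.split(" ")[0]
    # Sequence numbers start at 1, list indices at 0.
    n = int(and_before_space) - 1
    new_lines[n] = line

print(new_lines)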
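
The bucket approach above relies on the sequence values being unique and running from 1 to the number of lines. If that cannot be guaranteed, a minimal sketch that sorts with a key function instead might look like this. The helper name sequence_of and the regular expression are my own choices for illustration, and the sketch assumes the sequence values are non-negative integers:

```python
import re

# Assumed pattern: "sequence=" followed by digits. The \b keeps a key such as
# "subsequence=" from matching by accident.
SEQUENCE_RE = re.compile(r"\bsequence=(\d+)")

def sequence_of(line):
    """Hypothetical helper: return the integer sequence value of a line."""
    match = SEQUENCE_RE.search(line)
    if match is None:
        raise ValueError("no sequence key in line: %r" % line)
    return int(match.group(1))

lines = [
    "key1=blah key2=something key4=else sequence=3",
    "sequence=1 key2=xlde key7=eldl",
    "blahkey=xxx sequence=2 keyx=adada",
]

# sorted() is stable, so lines sharing a sequence value keep their input order.
sorted_lines = sorted(lines, key=sequence_of)
print("\n".join(sorted_lines))
```

This trades the single pass of the array version for an ordinary comparison sort, but it tolerates gaps and duplicates in the sequence values.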

score 0 · Answer 2 · answered Jul 02 '13 at 07:12

If I were doing this in Perl, I would slurp the entire file in and munge it so that I could sort the original raw lines based on their sequence number. I'm not sure how consistent your file format is, but one Perl approach might be:

#!/usr/bin/perl -w

my @data;

# slurp in each line, and tag it by its sequence number
foreach my $line ( <STDIN> )
{
    if ($line =~ /sequence=(\S+)/)
    {
        push @data, { sequence => $1, line => $line };
    } else
    {
        die "unhandled line: $line";  # only if needed
    }
}

# sort the lines by their sequence number into @sorted
my @sorted = sort { $a->{sequence} <=> $b->{sequence} } @data;

# generate the final result by extracting the original lines
# from the sorted array and concatenating them
my $result = join("", map { $_->{line} } @sorted);

# output the sorted result
print $result;

I tried this on your example above and it did the trick. You might massage the die line if you could have "garbage" lines in the input that the script can safely ignore.
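
For comparison, here is a rough Python sketch of the same tag-then-sort idea, with the "ignore garbage lines" behaviour instead of dying. The function name sort_by_sequence is made up for illustration, and unlike the Perl version it only accepts digit-only sequence values:

```python
import re

def sort_by_sequence(lines):
    """Return the lines ordered by sequence value, skipping lines without one."""
    tagged = []
    for line in lines:
        match = re.search(r"sequence=(\d+)", line)
        if match:
            # Tag each line with its numeric sequence value.
            tagged.append((int(match.group(1)), line))
        # Lines with no sequence key are silently dropped.
    tagged.sort(key=lambda pair: pair[0])
    return [line for _, line in tagged]

sample = [
    "key1=blah key2=something key4=else sequence=3",
    "this line has no sequence key",
    "sequence=1 key2=xlde key7=eldl",
    "blahkey=xxx sequence=2 keyx=adada",
]
print("\n".join(sort_by_sequence(sample)))
```

To use it as a filter like the Perl script, you could pass sys.stdin in and write the result with sys.stdout.writelines.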

Also, if you need to switch between ascending and descending sequence order, you can swap $a and $b in this line:

my @sorted = sort { $a->{sequence} <=> $b->{sequence} } @data;

If the sequence numbers aren't purely numeric, or you want to compare them as strings, replace the <=> operator with cmp:

my @sorted = sort { $a->{sequence} cmp $b->{sequence} } @data;
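
To see why the choice between numeric and string comparison matters, here is a small Python illustration of the same three variants (ascending, descending, string order). The sample lines and the helper seq_text are invented for the demonstration:

```python
import re

def seq_text(line):
    """Hypothetical helper: the raw text after "sequence=", e.g. "10"."""
    return re.search(r"sequence=(\S+)", line).group(1)

lines = ["sequence=10 a=1", "b=2 sequence=9", "sequence=2 c=3"]

# Numeric, ascending (like <=> with $a first): 2, 9, 10
ascending = sorted(lines, key=lambda line: int(seq_text(line)))

# Numeric, descending (like swapping $a and $b): 10, 9, 2
descending = sorted(lines, key=lambda line: int(seq_text(line)), reverse=True)

# String comparison (like cmp): "10" sorts before "2", which sorts before "9"
as_strings = sorted(lines, key=seq_text)

for name, result in [("ascending", ascending),
                     ("descending", descending),
                     ("as_strings", as_strings)]:
    print(name, result)
```

String order is only equivalent to numeric order when all sequence values have the same number of digits.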

Thanks, removed the die line, but did the trick for me. Cheers! — brercia, Jul 02 '13 at 07:48
