I'm somewhat new to Perl programming and I have a hash of hashes that looks like this:
$hash{"snake"}{ACB2} = [70, 120];
$hash{"snake"}{SGJK} = [183, 120];
$hash{"snake"}{KDMFS} = [1213, 120];
$hash{"snake"}{VCS2} = [21, 120];
...
$hash{"bear"}{ACB2} = [12, 87];
$hash{"bear"}{GASF} = [131, 87];
$hash{"bear"}{SDVS} = [53, 87];
...
$hash{"monkey"}{ACB2} = [70, 230];
$hash{"monkey"}{GMSD} = [234, 230];
$hash{"monkey"}{GJAS} = [521, 230];
$hash{"monkey"}{ASDA} = [134, 230];
$hash{"monkey"}{ASMD} = [700, 230];
In summary, the structure of the hash is:
$hash{Organism}{ProteinID} = [protein_length, total_of_proteins_in_that_organism]
I would like to filter and sort this hash according to some conditions. First, I only want to consider organisms whose total number of proteins is greater than 100. Then, for each of those organisms, I want to print its name along with its largest protein and that protein's length.
This is my current approach:
foreach my $org (sort keys %hash) {
    foreach my $prot (keys %{ $hash{$org} }) {
        if ($hash{$org}{$prot}[1] > 100) {
            @sortedarray = sort {$hash{$b}[0]<=>$hash{$a}[0]} keys %hash;
            print $org."\n";
            print @sortedarray[-1]."\n";
            print $hash{$org}{$sortedarray[-1]}[0]."\n";
        }
    }
}
However, this prints the name of the organism as many times as its total number of proteins; for instance, it prints "snake" 120 times. It also does not sort properly, I guess because I should be using the variables $org and $prot in the sort line.
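For reference, here is a minimal sketch of how the loop could be restructured so that each organism is handled exactly once and the sort runs over that organism's own proteins rather than over the top-level keys. It is only one possible approach, not a definitive one. The sub name largest_proteins and the $min_total parameter are hypothetical names introduced for illustration. It also assumes that the organisms should be listed by the length of their largest protein, longest first, since the ordering rule is not stated explicitly.

```perl
use strict;
use warnings;

# Sample data from the question:
# $hash{organism}{protein_id} = [protein_length, total_proteins_in_organism]
my %hash = (
    snake  => { ACB2 => [70, 120], SGJK => [183, 120], KDMFS => [1213, 120], VCS2 => [21, 120] },
    bear   => { ACB2 => [12, 87],  GASF => [131, 87],  SDVS  => [53, 87] },
    monkey => { ACB2 => [70, 230], GMSD => [234, 230], GJAS  => [521, 230],
                ASDA => [134, 230], ASMD => [700, 230] },
);

# Hypothetical helper: returns one [organism, protein_id, length] row per
# organism whose total exceeds $min_total.
sub largest_proteins {
    my ($data, $min_total) = @_;
    my @rows;
    for my $org (keys %$data) {
        my $proteins = $data->{$org};

        # Sort only this organism's protein IDs by length, longest first,
        # and keep the first one.
        my ($largest) = sort { $proteins->{$b}[0] <=> $proteins->{$a}[0] }
                        keys %$proteins;
        next unless defined $largest;

        # The total is repeated in every entry, so one check per organism is enough.
        next unless $proteins->{$largest}[1] > $min_total;

        push @rows, [$org, $largest, $proteins->{$largest}[0]];
    }

    # Assumed ordering: by largest protein length (descending), then by name.
    my @sorted = sort { $b->[2] <=> $a->[2] or $a->[0] cmp $b->[0] } @rows;
    return @sorted;
}

for my $row (largest_proteins(\%hash, 100)) {
    printf qq{%s\n"Largest protein": %s [%d]\n}, @$row;
}
```

Because the filter is applied once per organism instead of once per protein, each name is printed a single time. If only the maximum is needed, a single pass that tracks the longest protein seen so far would avoid the sort altogether.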
Finally, the output should look like this:
snake
"Largest protein": KDMFS [1213]
monkey
"Largest protein": ASMD [700]