
I'm currently working on a script which takes a file as an input. The input file looks like this:

ECHANTILLON GENOTYPE    
CAN1        genotype1   
CAN2        genotype1   
CAN3        genotype1   
EUG1        genotype2   
EUG2        genotype2   
EUG3        genotype2   
EUG4        genotype2

What I want to do is create a hash of hashes with:

  • the GENOTYPE column as the first-level key, and
  • a second-level key called "sample" whose value holds the matching entries from the "ECHANTILLON" column, for example CAN1, CAN2...

Here is what my script looks like:

#!/usr/local/perl-5.24.0/bin/perl
use warnings;
use strict;

use Data::Dumper;
use feature qw{ say };
use Getopt::Long; 



my $list_geno;

GetOptions("g|input=s"=>\$list_geno);
my %hash_geno_group;

open(GENOTYPED,"<$list_geno") or die ("Cannot open $list_geno\n");
while (defined(my $l1= <GENOTYPED>)) 
    {
        my @geno_group_infos = split(m/\t/,$l1); 
        next if ($l1 =~ m/^ECHANTILLON/); 
        chomp($l1);
        my $sample_nm = $geno_group_infos[0]; 
        my $sample_geno_group = $geno_group_infos[1]; 
        push @{ $hash_geno_group{$sample_geno_group}{"sample"} },$sample_nm;
        foreach $sample_geno_group (keys (%hash_geno_group)){
            foreach $sample_nm (values %{$hash_geno_group{%{$sample_geno_group}{"sample"}}}){
            print $sample_geno_group, "\t" ,  $sample_nm, "\n";
}
}
}


close(GENOTYPED);
exit;

I tried printing the $sample_nm variable to check its value, but I get this error:

Can't use string ("genotype1") as a HASH ref while "strict refs" in use at Test.pl line 27, <GENOTYPED> line 2.

Can anybody please explain:

  • Why I get this type of error;
  • How to get the values from the 1st column. I will later need to store them in another variable in order to compare them with the same values from another input file.

Thanks!
Jim Garrison
Amy Ndy

1 Answer


Replace the line

foreach $sample_nm (values %{$hash_geno_group{%{$sample_geno_group}{"sample"}}}){

with

foreach $sample_nm (@{$hash_geno_group{$sample_geno_group}{"sample"}}){

$sample_geno_group is a key of the hash %hash_geno_group, i.e. a string (genotype1 or genotype2 in your example). When you write %{$sample_geno_group}, you dereference it as if it were a hash reference, hence the error Can't use string as HASH ref ....

Furthermore, values %{ xxx } retrieves the values of the hash referenced by xxx. In your case, xxx (i.e. $hash_geno_group{$sample_geno_group}{"sample"}) is a reference to an array, which you filled with the push ... two lines above. So @{ xxx } is what retrieves its elements (as in the fix I suggested).
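To make the structure concrete, here is a minimal sketch that builds the same hash from a few hard-coded lines and then walks it with the array dereference. The sub name group_samples is hypothetical, and the input is assumed to be tab-separated as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: build { genotype => { sample => [names...] } }
# from tab-separated "sample<TAB>genotype" lines.
sub group_samples {
    my @lines = @_;
    my %hash_geno_group;
    for my $line (@lines) {
        chomp $line;                        # strip the newline before splitting
        next if $line =~ /^ECHANTILLON/;    # skip the header row
        my ($sample_nm, $geno) = split /\t/, $line;
        # push autovivifies the inner hash and the array reference
        push @{ $hash_geno_group{$geno}{sample} }, $sample_nm;
    }
    return \%hash_geno_group;
}

my $groups = group_samples(
    "ECHANTILLON\tGENOTYPE\n",
    "CAN1\tgenotype1\n",
    "CAN2\tgenotype1\n",
    "EUG1\tgenotype2\n",
);

# Walk the structure once, after every line has been stored.
for my $geno (sort keys %$groups) {
    for my $sample_nm (@{ $groups->{$geno}{sample} }) {
        print "$geno\t$sample_nm\n";
    }
}
```

Note that the printing loops sit after the reading loop here. In the question's script they are nested inside the while loop, so the whole hash would be printed again for every line read.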


Another suggestion: use a lexical filehandle declared with my instead of your bareword GENOTYPED, together with the three-argument form of open (so for instance, open my $genotype, '<', $list_geno or die "Can't open '<$list_geno': $!").
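As a rough sketch of that suggestion, the following reads a file through a lexical filehandle and collects the first column into an array, which is one possible way to keep those values around for a later comparison. The sub name read_samples and the temporary demo file are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the first-column values of a
# tab-separated file, skipping the ECHANTILLON header row.
sub read_samples {
    my ($path) = @_;
    # Lexical filehandle and three-argument open
    open my $fh, '<', $path or die "Can't open '$path': $!";
    my @samples;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;           # ignore empty lines
        next if $line =~ /^ECHANTILLON/;    # skip the header row
        my ($sample_nm) = split /\t/, $line;
        push @samples, $sample_nm;
    }
    close $fh;
    return @samples;
}

# Demo data written to a temporary file (stands in for the real input).
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "ECHANTILLON\tGENOTYPE\nCAN1\tgenotype1\nEUG1\tgenotype2\n";
close $tmp_fh;

print "$_\n" for read_samples($tmp_path);
```

Calling read_samples on a second file would give a second list that can then be compared against the first.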

Dada