I have the following Perl code:

use strict;
use warnings;

my %hash;

open FILE, $ARGV[0];
while (my $line = <FILE>) {
    if ($line =~ /gene_type "protein_coding";/) {
        $line =~ /gene_id "([A-Za-z0-9.]*)"/;
        my $genename = $1;
        my @chomp = split(/\t/, $line);
        my @coordinates = ($chomp[3], $chomp[4]);
        if (!defined $hash{$genename}) {
            push @{$hash{$genename}}, [@coordinates];
            next;
        }
        for my $coord (@{$hash{$genename}}) {
            print $coord."\n";
        }
    }
}

This code creates a hash that contains arrays, but I am not able to print the arrays. When I print $coord[0] inside the inner loop, I get the following warnings:

Use of uninitialized value $coord[0] in concatenation (.) or string at untitled.pl line 16, <FILE> line 17813.
Use of uninitialized value $coord[0] in concatenation (.) or string at untitled.pl line 16, <FILE> line 17814.
Use of uninitialized value $coord[0] in concatenation (.) or string at untitled.pl line 16, <FILE> line 17815.
Use of uninitialized value $coord[0] in concatenation (.) or string at untitled.pl line 16, <FILE> line 17816.
Use of uninitialized value $coord[0] in concatenation (.) or string at untitled.pl line 16, <FILE> line 17817.

Printing just $coord, without the [0], gives the following:

ARRAY(0xac5b18)
ARRAY(0xac5b18)
ARRAY(0xac5b18)
ARRAY(0xac5b18)

My input file is:

chr1    HAVANA  exon    972861  973010  .   +   .   gene_id "ENSG00000187583.7"; transcript_id "ENST00000379407.4"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "PLEKHN1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "PLEKHN1-004"; exon_number 11; exon_id "ENSE00001386720.1"; level 2; protein_id "ENSP00000368717.2"; tag "basic"; tag "appris_candidate"; tag "CCDS"; ccdsid "CCDS53256.1"; havana_gene "OTTHUMG00000040756.4"; havana_transcript "OTTHUMT00000473255.1";
chr1    HAVANA  CDS 972861  973010  .   +   0   gene_id "ENSG00000187583.7"; transcript_id "ENST00000379407.4"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "PLEKHN1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "PLEKHN1-004"; exon_number 11; exon_id "ENSE00001386720.1"; level 2; protein_id "ENSP00000368717.2"; tag "basic"; tag "appris_candidate"; tag "CCDS"; ccdsid "CCDS53256.1"; havana_gene "OTTHUMG00000040756.4"; havana_transcript "OTTHUMT00000473255.1";
chr1    HAVANA  exon    973500  973640  .   +   .   gene_id "ENSG00000187583.7"; transcript_id "ENST00000379407.4"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "PLEKHN1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "PLEKHN1-004"; exon_number 12; exon_id "ENSE00001371278.1"; level 2; protein_id "ENSP00000368717.2"; tag "basic"; tag "appris_candidate"; tag "CCDS"; ccdsid "CCDS53256.1"; havana_gene "OTTHUMG00000040756.4"; havana_transcript "OTTHUMT00000473255.1";
chr1    HAVANA  CDS 973500  973640  .   +   0   gene_id "ENSG00000187583.7"; transcript_id "ENST00000379407.4"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "PLEKHN1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "PLEKHN1-004"; exon_number 12; exon_id "ENSE00001371278.1"; level 2; protein_id "ENSP00000368717.2"; tag "basic"; tag "appris_candidate"; tag "CCDS"; ccdsid "CCDS53256.1"; havana_gene "OTTHUMG00000040756.4"; havana_transcript "OTTHUMT00000473255.1";

Why is that?

user2979409

2 Answers

$coord is a reference to an array, whereas $coord[0] is the first element of an unrelated array @coord. Dereference the reference using the arrow operator:

print $coord->[0], "\n";

More info in perlreftut.
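
As a minimal sketch of the same idea, the snippet below builds a hash shaped like the one in the question and prints each pair through the arrow operator. The coordinates are copied from the sample input; the helper format_coord is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;

# Hash of arrays of array references, shaped like the one in the question.
# The coordinates are taken from the sample input lines.
my %hash = (
    'ENSG00000187583.7' => [ [ 972861, 973010 ], [ 973500, 973640 ] ],
);

# Hypothetical helper: format one [start, end] pair as "start-end".
sub format_coord {
    my ($coord) = @_;
    # The arrow operator follows the reference to the underlying array.
    return $coord->[0] . '-' . $coord->[1];
}

for my $gene (sort keys %hash) {
    # @{ ... } dereferences the outer array stored under each gene ID.
    for my $coord (@{ $hash{$gene} }) {
        print "$gene\t", format_coord($coord), "\n";
    }
}
```

Note that $coord->[0] and $coord[0] look similar but are unrelated: the first follows the reference held in the scalar $coord, the second indexes an array named @coord.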

RobEarl

$coord is an array reference. Dereference it to get the actual array:

print "@$coord\n";
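
A short sketch of a few ways to use the dereferenced array, assuming $coord holds one [start, end] pair as in the question. The helper coord_line is a hypothetical name introduced here, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: render an array reference as a tab-separated string.
sub coord_line {
    my ($coord) = @_;
    return join("\t", @$coord);    # @$coord is the whole array behind the reference
}

my $coord = [ 973500, 973640 ];    # values from the sample input

print "@$coord\n";                 # interpolation joins the elements with $" (a space by default)
print coord_line($coord), "\n";    # same elements with an explicit separator
print scalar(@$coord), "\n";       # number of elements in the referenced array
```

@$coord is shorthand for @{$coord}; the braces are only required when the expression holding the reference is more complex than a simple scalar, as in @{ $hash{$genename} }.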

Also, when storing the array, you are copying the lexical array @coordinates into an anonymous array. That's not needed: you can store a reference to the array directly, because a new array is created in each iteration of the loop:

push @{ $hash{$genename} }, \@coordinates;
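
This only works because @coordinates is declared with my inside the loop. The sketch below contrasts that with an array declared once outside the loop; collect_fresh and collect_shared are hypothetical names used for illustration, and the input pairs come from the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: the array is declared inside the loop body.
sub collect_fresh {
    my @pairs = @_;
    my @stored;
    for my $pair (@pairs) {
        my @coordinates = @$pair;      # a new lexical array on every iteration
        push @stored, \@coordinates;   # safe: each reference points to its own array
    }
    return \@stored;
}

# Counter-example: the array is declared once, outside the loop.
sub collect_shared {
    my @pairs = @_;
    my (@stored, @coordinates);
    for my $pair (@pairs) {
        @coordinates = @$pair;         # overwrites the same array each time
        push @stored, \@coordinates;   # every reference points to that one array
    }
    return \@stored;
}

my @input = ([972861, 973010], [973500, 973640]);
print "fresh:  ", join(' ', map { "@$_" } @{ collect_fresh(@input) }), "\n";
print "shared: ", join(' ', map { "@$_" } @{ collect_shared(@input) }), "\n";
```

In the shared version every stored reference ends up showing the last pair, which is the situation where copying into an anonymous array with [@coordinates] would actually be needed.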

choroba