I'm a perl novice attempting to perform the following:
1) Take a user input
2) Match the input with instances of that value from column1 of file 1 and store the corresponding value from the column 2 in a hash, hash of array or hash of hash. (below code stores in hash of array but I'm not sure if this is optimal to accomplish 3 below)
3) I need to find all instances (if they exist) of the first column in file 2 = column 2 in file 1.
For simplicity I've provided sample file below.
I'm attempting to take a user input of 'AAA' in column 1 of the input file into a hash or array, as the key for all corresponding values in column 2.
My input file has multiple instances of 'AAA' in column 1 with different values for column 2, also there are multiple instances of 'AAA' and 'BBB' in columns 1 & 2. I believe in order to output this properly I need to use a hash of hash but I'm not sure syntactically how to approach it.
I've tried searching this site and found some examples but I'm afraid I'm only confusing myself more.
Example of input file.
AAA BBB
AAA CCC
AAA BBB
BBB DDD
CCC AAA
Example of my code
#!/usr/bin/perl
use warnings;
use strict;
use diagnostics;
use Data::Dumper;
#declare values
my %hash = ();
#Get protein name from user
print "Get column 1 value: ";
my $value = <STDIN>;
chomp $value;
#open input file
open FILE, "file" or die("unable to open file\n");
while(my $line = <FILE>) {
chomp($line);
my($column1, $column2) = split("\t", $line);
if ($column1 eq $value) {
push @{ $hash{$column1} }, $column2;
}
}
close FILE;
print Dumper(\%hash);
Code output
$VAR1 = {
'AAA' => [
'BBB',
'CCC'
]
};
My question is will my current hash of array setup work best for reading column 1 in file 2 and comparing it with column 2 of file 1? Or should I approach it differently?