I'm using a BioPerl module to obtain a string from a set of parameters. I followed the HOWTO:Beginners page. The module apparently returns a hash object. How do I get the actual string out of the hash object?

use Bio::DB::GenBank;
use Data::Dumper;

my $gb = Bio::DB::GenBank->new(-format     => 'Fasta',
                             -seq_start  => 1,
                             -seq_stop   => 251,
                             -strand     => 1
                             -complexity => 1);
my $seq = $gb->get_Seq_by_acc('NG_016346');
my $sequence_string = lc($seq->seq());
my $seq_obj = Bio::Seq->new(-seq => $sequence_string,
                          -alphabet => 'dna' );
my $prot_obj = $seq_obj->translate;
print Dumper($prot_obj);

The data dumper prints the following:

$VAR1 = bless( {
             'primary_seq' => bless( {
                                       'length' => 83,
                                       '_root_verbose' => 0,
                                       '_nowarnonempty' => undef,
                                       'seq' => 'RLCVKEGPWPAVEGTWSWG*HRPGSRACPRWGAPNSVQATSYTPSPTHAPFSVSPIPIC*MSLLEASCWPGSREDGARMSAGM',
                                       'alphabet' => 'protein'
                                     }, 'Bio::PrimarySeq' ),
             '_root_verbose' => 0
           }, 'Bio::Seq' );

How do I obtain 'seq' that is stored in $prot_obj?

I tried

print $prot_obj{'primary_seq'}{'seq'};

but it doesn't print anything. Data::Dumper printed the word bless, so maybe seq is a field of an object.

cooldood3490

3 Answers

The correct format for accessing object properties uses ->:

print $prot_obj->{'primary_seq'}->{'seq'};
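
As a minimal sketch of why the arrow matters, the snippet below uses a plain nested hash reference as a hypothetical stand-in for the dumped object (it is not a real Bio::Seq). Without the arrow, Perl looks for a separate hash named %obj; with "use strict" enabled that is a compile error, and without it the expression is simply undef, which would explain why nothing was printed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dumped structure: a hash reference
# that holds another hash reference.
my $obj = { primary_seq => { seq => 'RLCVKEG' } };

# $obj is a scalar holding a reference, so the first arrow is required.
my $with_both_arrows = $obj->{primary_seq}->{seq};

# Between adjacent subscripts the arrow is optional.
my $with_one_arrow = $obj->{primary_seq}{seq};

print "$with_both_arrows\n$with_one_arrow\n";

# $obj{primary_seq} refers to a hash named %obj, which was never declared.
# Compiling that expression under strict fails, so the string eval returns
# undef and $compiled stays false.
my $compiled = eval q{ my $v = $obj{primary_seq}{seq}; 1 };
print "arrow-less form compiled: ", ($compiled ? "yes" : "no"), "\n";
```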
doublesharp
  • Only the first `->` is required. Writing `->` after each {} (hashref) or [] (arrayref) is the old syntax. – Eugen Konkov Oct 27 '15 at 21:20
  • It's not necessarily old syntax, it's just that perl knows that anything nested within the ref has to be another ref, and because there can never be ambiguity, the latter dereference ops aren't required – stevieb Oct 27 '15 at 21:59
  • I'm afraid I'm not convinced this is the right thing to do. The point of OO is to encapsulate, and so poking directly at nested object attributes is bad form. – Sobrique Oct 27 '15 at 22:06
  • I definitely agree with @Sobrique. I did look briefly at the module, but didn't find an accessor with the time I had. – stevieb Oct 27 '15 at 22:32
  • FWIW, this seems to be the docs for the Sequence Object: http://www.bioperl.org/wiki/HOWTO:Beginners#The_Sequence_Object. "For example, to get or retrieve a value `$sequence_as_string = $seq_obj->seq;`" – doublesharp Oct 27 '15 at 22:39

I'm going to dispute the other answer and say that the correct way to access object properties is not to do so at all, and to use a method instead.

The reason is the whole point of OO: to encapsulate chunks of your program so that multiple developers can work on it concurrently, and so that the code scales because you can find where things are going wrong more easily.

This only works if you use the published methods (the specified way of driving the object), because then you don't have to know what's going on behind the scenes. It also means the implementor is free to change what is going on: maybe simply validating, but maybe overloading or giving different responses depending on another property within the object.

All this is subverted by direct access to object properties.

You shouldn't do it, even if perl will "let" you. Let's face it, perl will let you do many bad things.
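
To make the encapsulation argument concrete, here is a small sketch with a made-up class (My::Peptide is hypothetical and not part of BioPerl). The sequence lives under an internal key that callers are not supposed to know, so the accessor keeps working while a guess at the hash layout quietly returns nothing.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only; it is not a BioPerl module.
package My::Peptide;

sub new {
    my ($class, %args) = @_;
    # Internal storage detail: the key name is private to this class.
    my $self = { _residues => uc $args{seq} };
    return bless $self, $class;
}

# Published accessor: the supported way to read the sequence.
sub seq {
    my ($self) = @_;
    return $self->{_residues};
}

package main;

my $pep = My::Peptide->new(seq => 'rlcvkeg');

my $via_method = $pep->seq;     # works whatever the internal layout is
my $via_guess  = $pep->{seq};   # guesses at a key name; undef here

print "method: $via_method\n";
print "direct: ", (defined $via_guess ? $via_guess : 'undef'), "\n";
```

If the class author later renames or restructures the internal storage, only code that reached into the hash has to change.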

Bio::PrimarySeq has a seq() method that retrieves the seq attribute, and Bio::Seq has an accessor for the primary sequence.

So:

$prot_obj -> seq(); 

I think that would probably do it (although the docs aren't exactly easy reading).

Sobrique

There is an accepted answer, but I would also advise against poking around in the internals of objects like that; the only exception is to see what kind of object is returned (or just use ref). Here is how I would approach the problem:

use 5.010;
use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Seq;

my $gb = Bio::DB::GenBank->new(
    -format     => 'Fasta',
    -seq_start  => 1,
    -seq_stop   => 251,
    -strand     => 1,
    -complexity => 1
);

my $seq = $gb->get_Seq_by_acc('NG_016346');
my $seq_obj = Bio::Seq->new(
    -id       => $seq->id,
    -seq      => $seq->seq,
    -alphabet => 'dna' 
);

say join "\n", ">".$seq_obj->id, $seq_obj->translate->seq;

Running this gives you the translated FASTA record:

>gi|283837914:1-251
RLCVKEGPWPAVEGTWSWG*HRPGSRACPRWGAPNSVQATSYTPSPTHAPFSVSPIPIC*MSLLEASCWPGSREDGARMSAGM

The real benefit of using BioPerl is in combining the different classes to solve problems with minimal (but also readable and reusable) code. There was also a minor typo in your code that would have been caught with the strict and warnings pragmas enabled (that is my best advice).
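
As a rough illustration of that last point, the sketch below reproduces the missing comma after "-strand => 1" in a plain list, with no BioPerl involved. The @caught array and the warning handler are assumptions added for the demonstration. In this reduced form it is the warnings pragma that should complain, because Perl reads the line as the subtraction 1 - "complexity".

```perl
use strict;
use warnings;

# Collect warnings in a package variable so they can be inspected afterwards.
our @caught;
BEGIN { $SIG{__WARN__} = sub { push @caught, $_[0] } }

# Same slip as in the question: no comma after "-strand => 1".
# Perl parses this as ('-strand', 1 - 'complexity', 1).
my @args = (-strand => 1 -complexity => 1);

print scalar(@args), " elements instead of 4\n";
print "warning: $_" for @caught;
```

The list ends up with an odd number of elements, so the -complexity option never reaches the constructor as a named argument.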

SES