
Hello everyone, I have a problem with a Perl module. I am using it to retrieve specific lines from a flat file that contains multiple sets of information, as shown in the code below (this is example code for Bio::Parse::SwissProt). Whenever I run this code, the refs statement fails with the error "Modification of a read-only value attempted at c:/wamp/bin/perl/site/lib/bio/parse/swissprot.pm line 345". The input file looks like this:

Input file (flat file)

ID   P72354_STAAU            Unreviewed;       575 AA.
AC   P72354;
DT   01-FEB-1997, integrated into UniProtKB/TrEMBL.
DT   01-FEB-1997, sequence version 1.
DT   29-MAY-2013, entry version 79.
DE   SubName: Full=ATP-binding cassette transporter A;
GN   Name=abcA;
OS   Staphylococcus aureus.
OC   Bacteria; Firmicutes; Bacilli; Bacillales; Staphylococcus.
OX   NCBI_TaxID=1280;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RC   STRAIN=NCTC 8325;
RX   PubMed=8878592;
RA   Henze U.U., Berger-Bachi B.;
RT   "Penicillin-binding protein 4 overproduction increases beta-lactam
RT   resistance in Staphylococcus aureus.";
RL   Antimicrob. Agents Chemother. 40:2121-2125(1996).
RN   [2]
RP   NUCLEOTIDE SEQUENCE.
RC   STRAIN=NCTC 8325;
RX   PubMed=9158759;
RA   Henze U.U., Roos M., Berger-Bachi B.;
RT   "Effects of penicillin-binding protein 4 overproduction in
RT   Staphylococcus aureus.";
RL   Microb. Drug Resist. 2:193-199(1996).
CC   -!- SIMILARITY: Belongs to the ABC transporter superfamily.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; X91786; CAA62898.1; -; Genomic_DNA.
DR   ProteinModelPortal; P72354; -.
DR   SMR; P72354; 335-571.
DR   GO; GO:0016021; C:integral to membrane; IEA:InterPro.
DR   GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW.
DR   GO; GO:0042626; F:ATPase activity
DR   GO; GO:0006200; P:ATP catabolic process; IEA:GOC.
DR   InterPro; IPR003593; AAA+_ATPase.
DR   InterPro; IPR003439; ABC_transporter-like.
DR   InterPro; IPR017871; ABC_transporter_CS.
DR   InterPro; IPR017940; ABC_transporter_type1.
DR   InterPro; IPR001140; ABC_transptr_TM_dom.
DR   InterPro; IPR011527; ABC_transptrTM_dom_typ1.
DR   InterPro; IPR027417; P-loop_NTPase.
DR   Pfam; PF00664; ABC_membrane; 1.
DR   Pfam; PF00005; ABC_tran; 1.
DR   SMART; SM00382; AAA; 1.
DR   SUPFAM; SSF90123; ABC_TM_1; 1.
DR   SUPFAM; SSF52540; SSF52540; 1.
DR   PROSITE; PS50929; ABC_TM1F; 1.
DR   PROSITE; PS00211; ABC_TRANSPORTER_1; 1.
DR   PROSITE; PS50893; ABC_TRANSPORTER_2; 1.
PE   3: Inferred from homology;
KW   ATP-binding; Nucleotide-binding.
SQ   SEQUENCE   575 AA;  64028 MW;  F7E30A85971719B9 CRC64;
     MKRENPLFFL FKKLSWPVGL IVAAITISSL GSLSGLLVPL FTGRIVDKFS VSHINWNLIA
     LFGGIFVINA LLSGLGLYLL SKIGEKIIYA IRSVLWEHII QLKMPFFDKN ESGQLMSRLT
     DDTKVINEFI SQKLPNLLPS IVTLVGSLIM LFILDWKMTL LTFITIPIFV LIMIPLGRIM
     QKISTSTQSE IANFSGLLGR VLTEMRLVKI SNTERLELDN AHKNLNEIYK LGLKQAKIAA
     VVQPISGIVM LLTIAIILGF GALEIATGAI TAGTLIAMIF YVIQLSMPLI NLSTLVTDYK
     KAVGASSRIY EIMQEPIEPT EALEDSENVL IDDGVLSFEH VDFKYDVKKI LDDVSFQIPQ
     GQVSAFVGPS GSGKSTIFNL IERMYEIESG DIKYGLESVY DIPLSKWRRK IGYVMQSNSM
     MSGTIRDNIL YGINRHVSDE ELINYAKLAN CHDFIMQFDE GYDTLVGERG LKLSGGQRQR
     IDIARSFVKN PDILLLDEAT ANLDSESELK IQEALETLME GRTTIVIANR LSTIKKAGQI
     IFLDKGQVTG KGTHSELMAS HAKYKNFVVS QKLTD
//

Script (run with C:/wamp/bin/perl/bin/perl.exe)

use strict;
use warnings;
use Data::Dumper;
use SWISS::Entry;
use Bio::Parse::SwissProt;
my $sp = Bio::Parse::SwissProt->new(FILE => "me.txt") or die $!;

# Print the entry name and the sequence length
my $entry_name = $sp->entry_name( );
print "$entry_name\n";
my $seq_len = $sp->seq_len( );
print "$seq_len\n";

# Two ways of retrieving the reference lines
my $refs = $sp->refs();
$refs = $sp->refs(TITLE => 1, AUTH => 1);
for my $i (0..$#{$refs}) {
    print "@{$refs->[$i]}\n";
}

The output should look like this:

[1]
  NUCLEOTIDE SEQUENCE.
  STRAIN=NCTC 8325;
  PubMed=8878592;
  Henze U.U., Berger-Bachi B.;
  "Penicillin-binding protein 4 overproduction increases beta-lactam
  resistance in Staphylococcus aureus.";
  Antimicrob. Agents Chemother. 40:2121-2125(1996).
[2]
  NUCLEOTIDE SEQUENCE.
  STRAIN=NCTC 8325;
  PubMed=9158759;
  Henze U.U., Roos M., Berger-Bachi B.;
  "Effects of penicillin-binding protein 4 overproduction in
  Staphylococcus aureus.";
  Microb. Drug Resist. 2:193-199(1996).
Toto
meghavarshney
  • You say the error is on line 345 of the Perl module but you only show 16 lines of it. Have you shown line 345 and, if so, which is it? What is the full text of the error message? The code you show has two assignments to `$refs` with nothing between; that looks suspicious. – AdrianHHH Jul 05 '13 at 12:03
  • @AdrianHHH Line 365 refers to the Perl module /bio/parse/swissprot.pm, not to my code. I pasted both refs lines to show how I am using this code (I know they are not related to each other, but there are two ways to retrieve the reference lines from the file, so I pasted both). Finally, the code otherwise works fine, since I get results for the lines '"$entry_name\n"; print "$seq_len\n";', so I think I am missing some call needed to retrieve the reference values. – meghavarshney Jul 09 '13 at 12:23

1 Answer


After some searching on the internet, it appears that you are using SWISS::Entry from the Swissknife package, and that you (or someone) downloaded Bio::Parse::SwissProt as an independent project (not part of BioPerl) from SourceForge. I am not familiar with either of these projects, but you can get the information you want by simply using Bio::SeqIO from BioPerl. Here is an example to get the refs:

#!/usr/bin/env perl

use strict;
use warnings;
use Bio::SeqIO;

my $usage = "perl $0 swiss-file\n";
my $infile = shift or die $usage;

my $io = Bio::SeqIO->new(-file => $infile, -format => 'swiss');
my $seq = $io->next_seq;                  # first entry in the file
my $anno_collection = $seq->annotation;

for my $key ( $anno_collection->get_all_annotation_keys ) {
    my @annotations = $anno_collection->get_Annotations($key);
    for my $value ( @annotations ) {
        if ($value->tagname eq "reference") {
            my $hash_ref = $value->hash_tree;
            for my $key (keys %{$hash_ref}) {
                print $key,": ",$hash_ref->{$key},"\n" if defined $hash_ref->{$key};
            }
        }
    }
}

Running this gives the information you wanted:

authors: Henze U.U., Berger-Bachi B.
location: Antimicrob. Agents Chemother. 40:2121-2125(1996).
title: "Penicillin-binding protein 4 overproduction increases beta-lactam resistance in Staphylococcus aureus."
pubmed: 8878592
authors: Henze U.U., Roos M., Berger-Bachi B.
location: Microb. Drug Resist. 2:193-199(1996).
title: "Effects of penicillin-binding protein 4 overproduction in Staphylococcus aureus."
pubmed: 9158759

The BioPerl Feature Annotation HOWTO is a helpful page for parsing these types of files. If you want to fetch the entries and then parse them, you can use Bio::DB::SwissProt and add just a couple of lines of code to the above example. I know that is not an answer to your specific problem, but it is a solution, and you'll find that many people can help you with BioPerl.
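
If the goal is the exact layout shown in the question (the reference number followed by its indented lines), the reference block can also be pulled out with core Perl alone. The following is only a minimal sketch, not how either module does it: `parse_refs` and `format_refs` are hypothetical helper names, and it assumes each reference starts at an `RN` line and is made up of lines whose code is one of RN, RP, RC, RX, RG, RA, RT or RL. The inline sample is a shortened copy of the entry from the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): group the reference
# lines of one Swiss-Prot entry into a list of hashes.
sub parse_refs {
    my (@lines) = @_;
    my @refs;
    for my $line (@lines) {
        # Keep only reference line codes; capture the code and the payload.
        next unless $line =~ /^(R[NPCXGATL])\s+(.*?)\s*$/;
        my ($code, $text) = ($1, $2);
        if ($code eq 'RN') {
            # An RN line opens a new reference, e.g. "[1]".
            push @refs, { number => $text, lines => [] };
        }
        elsif (@refs) {
            # Any other R* line belongs to the most recent reference.
            push @{ $refs[-1]{lines} }, $text;
        }
    }
    return @refs;
}

# Hypothetical helper: render the references as the number on its own
# line, followed by the remaining lines indented by two spaces.
sub format_refs {
    my (@refs) = @_;
    my $out = '';
    for my $ref (@refs) {
        $out .= "$ref->{number}\n";
        $out .= "  $_\n" for @{ $ref->{lines} };
    }
    return $out;
}

# Shortened sample taken from the entry in the question.
my @sample = split /^/, <<'END';
ID   P72354_STAAU            Unreviewed;       575 AA.
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RX   PubMed=8878592;
RA   Henze U.U., Berger-Bachi B.;
RL   Antimicrob. Agents Chemother. 40:2121-2125(1996).
DR   EMBL; X91786; CAA62898.1; -; Genomic_DNA.
END

print format_refs(parse_refs(@sample));
```

To run it on a real file, the sample can be replaced by the lines read from a file handle. Multi-line titles stay on separate lines because every `RT` line is kept as is, which matches the layout requested in the question but differs from the joined titles that BioPerl returns.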

SES
  • Thanks a lot for your prompt reply and your valuable time. This is exactly what I want with the Bio::Parse::SwissProt module. Thanks a lot. – meghavarshney Jul 06 '13 at 11:53
  • @meghavarshney, I don't fully understand your comment. Are you saying this answer does exactly what you want, or are you saying that this is still what you were trying to do with the other module? My point was that it may be easier to use BioPerl because I posted a working solution above and more people will be able to help. – SES Jul 06 '13 at 22:01
  • Yes, I mean that I got my answer in the form of your code (this is exactly what I wanted to extract with the Bio::DB::Swissprot module). – meghavarshney Jul 08 '13 at 09:39
  • @meghavarshney, I'm glad it solved your problem. You can mark the question "solved" unless you are looking for another type of solution. If you are, please explain so I can expand my answer. – SES Jul 08 '13 at 19:47
  • Actually, my problem is solved, but I want to know where I went wrong when using the Bio::DB::Swissprot module, because if I am getting the two other fields, it means that somewhere I am mistakenly using the wrong call to fetch the reference lines. – meghavarshney Jul 09 '13 at 12:25
  • @meghavarshney, for that question I recommend you create a new post and supply all the details so we can help. Otherwise, it is not possible to help based on your comments alone. – SES Jul 09 '13 at 17:21