-1

The problem with the following code is only in one function of the code. The problem function is with a comment head and close. This is my first post to StackOverflow so bear with me. The following script has some modules and other functions that I know work by testing them with the problem function commented out but I just cannot seem to get that one function to work. When ran, the script runs until the enviroment kills the execution.

Basically what this program does is takes a PDB file, copies everything out of the PDB file and creates a new one and pastes all of the original input file content into the new file and appends the cavities(coordinates of center of the cavity and the specified probe radius) that the program is supposed to find.

The problem function within the code is supposed to distinguish between a void space within a bound box of the structure and a cavity. Cavities are considered to be a closed space somewhere within the structure. A void space is any space or coordinate within the bounding box of max and min coorindates where there isn't an atom.The cavity must be large enough to fit into a specified probe radius. There is also a specified resolution when searching through the 3D hashtable of coordinates.

Can anyone tell me why my code isn't working. Anything immediate. I have tested and tested and cannot seem to find the error.

Thank you.

    #!/usr/bin/perl
# Example 11-6   Extract atomic coordinates from PDB file

use strict;
use warnings;
use BeginPerlBioinfo;     # see Chapter 6 about this module

#open file for printing
open(FH,">results.pdb");

open(PDB,"oneAtom.pdb");
while(<PDB>) { print FH $_; }
close(PDB);

# Read in PDB file
my @file = get_file_data('oneAtom.pdb');

# Parse the record types of the PDB file
my %recordtypes = parsePDBrecordtypes(@file);

# Extract the atoms of all chains in the protein
my %atoms = parseATOM ( $recordtypes{'ATOM'} );

#define some variables and get the atom indices stored in atom_numbers array
my @atom_numbers = sort {$a <=> $b} keys %atoms;
my $resolution = 4.;
my $lo         = 1000;
my $hi         = -1000;
my $p_rad = 1;
my %pass;

#set the grid boundaries
foreach my $l ( @atom_numbers ) { 
                for my $i (0..2) { 
                    if ( $atoms{$l}[$i] < $lo ) { $lo = $atoms{$l}[$i]; } 
                    if ( $atoms{$l}[$i] > $hi ) { $hi = $atoms{$l}[$i]; } 
                    } 
    }
$lo = $lo - 2* $resolution;
$hi = $hi + 2* $resolution;

#compute min distance to the pdb structure from each grid point
for ( my $i = $lo ; $i <= $hi ; $i = $i + $resolution ) {
    for ( my $j = $lo ; $j <= $hi ; $j = $j + $resolution ) {
        for ( my $k = $lo ; $k <= $hi ; $k = $k + $resolution ) {
            my $min_dist = 1000000;
            foreach my $l ( @atom_numbers ) {
                my $distance = sqrt((($atoms{$l}[0]-($i))*($atoms{$l}[0]-($i))) + (($atoms{$l}[1]-($j))*($atoms{$l}[1]-($j))) + (($atoms{$l}[2]-($k))*($atoms{$l}[2]-($k))));
                $distance = $distance - ( $p_rad + $atoms{$l}[3] );
                if ( $distance < $min_dist ) {
                    $min_dist = $distance;  
                    }
            }
            $pass{$i}{$j}{$k} = $min_dist;
            if ( $pass{$i}{$j}{$k} > 0 ) {
                         $pass{$i}{$j}{$k} = 1;
                } else { $pass{$i}{$j}{$k} = 0;
                }   
        }
    }
}

#define a starting point on the outside of the grid and place first on list of points
#my   @point = ();
my   $num_cavities = 0;

#define some offsets used to compute neighbors
my %offset =  (
     1 => [-1*$resolution,0,0],
     2 => [1*$resolution,0,0],
     3 => [0,-1*$resolution,0],
     4 => [0,1*$resolution,0],
     5 => [0,0,-1*$resolution],
     6 => [0,0,1*$resolution],
              );
##########################################################
#function below with problem
##########################################################
my   @point = ();
push @point,[$hi,$hi,$hi];
=pod
#do the following while there are points on the list

            while ( @point ) {
                foreach my $vector ( keys %offset ) {                                                       #for each offset vector
                    my @neighbor = (($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2]));              #compute neighbor point
                    if ( exists  $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} ) {                                    #see if neighbor is in the grid
                        if ( $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} == 1 ) {                                                                           #if it is see if its further than the probe radius
                            push @point,[($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2])];         #if it is push it onto the list of points
                            }
                        }
                    }
                $pass{$point[0][0]}{$point[0][1]}{$point[0][2]} = 0;                                            #eliminate the point just tested from the pass array
                shift @point;                                                                   #move to the next point in the list
                }
=cut
##############################################################
# end of problem function
##############################################################

my $grid_ind = $atom_numbers[$#atom_numbers];
for ( my $i = $lo ; $i <= $hi ; $i = $i + $resolution ) {
    for ( my $j = $lo ; $j <= $hi ; $j = $j + $resolution ) {
        for ( my $k = $lo ; $k <= $hi ; $k = $k + $resolution ) {
            if ( $pass{$i}{$j}{$k} == 1 ) {
            $grid_ind = $grid_ind + 1;
            my $n = sprintf("%5d",$grid_ind);
            my $x = sprintf("%7.3f",$i);
            my $y = sprintf("%7.3f",$j);
            my $z = sprintf("%7.3f",$k);
            my $w = sprintf("%6.3f",1);
            my $p = sprintf("%6.3f",$p_rad);
            print FH "ATOM  $n MC   CAV $n     $x $y $z $w  $p \n";
            }
        }
    }
}

close(FH);

exit;

#do the following while there are points on the list
for ( my $i = $lo ; $i <= $hi ; $i = $i + $resolution ) {
    for ( my $j = $lo ; $j <= $hi ; $j = $j + $resolution ) {
        for ( my $k = $lo ; $k <= $hi ; $k = $k + $resolution ) {
            if ( $pass{$i}{$j}{$k} == 1 ) {
            push @point,[$i,$j,$k];
            $num_cavities++;
            while ( @point ) {
                foreach my $vector ( keys %offset ) {                                                       #for each offset vector
                    my @neighbor = (($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2]));              #compute neighbor point
                    if ( exists  $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} ) {                                    #see if neighbor is in the grid
                        if ( $pass{$neighbor[0]}{$neighbor[1]}{$neighbor[2]} == 1 ) {                                                                           #if it is see if its further than the probe radius
                            push @point,[($point[0][0]+$offset{$vector}[0]),($point[0][1]+$offset{$vector}[1]),($point[0][2]+$offset{$vector}[2])];         #if it is push it onto the list of points
                            }
                        }
                    }
                $pass{$point[0][0]}{$point[0][1]}{$point[0][2]} = 0;                                            #eliminate the point just tested from the pass array
                shift @point;                                                                   #move to the next point in the list
                }
            }
        }
    }
}

#print the results
print "\nthe structure has " . $num_cavities . " cavities.\n\n";

#print the point that are left over (these correspond to the cavities)
#for ( my $i = -10 ; $i <= 10 ; $i = $i + $resolution ) {
#   for ( my $j = -10 ; $j <= 10 ; $j = $j + $resolution ) {
#       for ( my $k = -10 ; $k <= 10 ; $k = $k + $resolution ) {
#           print $i . "\t" . $j . "\t" . $k . "\t" . $pass{$i}{$j}{$k} . "\n";
#       }
#   }
#}

###################################################
# function
###################################################

sub parseATOM {

    my($atomrecord) = @_;

    use strict;
    use warnings;
    my %results = (  );

    # Turn the scalar into an array of ATOM lines
    my(@atomrecord) = split(/\n/, $atomrecord);

    foreach my $record (@atomrecord) {
       my $number  = substr($record,  6, 5);  # columns 7-11
       my $x       = substr($record, 30, 8);  # columns 31-38
       my $y       = substr($record, 38, 8);  # columns 39-46
       my $z       = substr($record, 46, 8);  # columns 47-54
       my $r       = substr($record, 60, 6);  # columns 47-54
       #my $element = substr($record, 76, 2);  # columns 77-78

        # $number and $element may have leading spaces: strip them
        $number =~ s/\s*//g;
        #$element =~ s/\s*//g;
        $x =~ s/\s*//g;
        $y =~ s/\s*//g;
        $z =~ s/\s*//g;
        $r =~ s/\s*//g;

        # Store information in hash
        #$results{$number} = [$x,$y,$z,$element];
        $results{$number} = [$x,$y,$z,$r];
    }

    # Return the hash
    return %results;
}
Sinan Ünür
  • 116,958
  • 15
  • 196
  • 339
  • We really need much more detail as to what's going wrong with your code. - and MCVE for example, or some indication of what you _expect_ and what you _get_. – Sobrique May 06 '15 at 10:55

2 Answers2

0

Here's one thing that is almost certainly slowing things down:

    $x =~ s/\s*//g;
    $y =~ s/\s*//g;
    $z =~ s/\s*//g;
    $r =~ s/\s*//g;

It is possible for \s* to match an empty string, so you are replacing empty strings with empty strings, for each empty string in the target string.

Change to:

    $x =~ s/\s+//g;
    $y =~ s/\s+//g;
    $z =~ s/\s+//g;
    $r =~ s/\s+//g;
Andy Lester
  • 91,102
  • 13
  • 100
  • 152
  • Comment says "may have leading spaces". So, `s/^\s+// for $number, $element, $x, $y, $z, $r;`. Of course, the rest of the code is unreadable as well. – Sinan Ünür May 05 '15 at 22:16
0

You have the following definitions:

my $lo         = 1000;
my $hi         = -1000;

So when you get to your first for loop, you will set $i to 1000, and then fail the check to see if it is less than -1000.

Degustaf
  • 2,655
  • 2
  • 16
  • 27