0

I have written this code

import sys
file = open(sys.argv[1], 'r')
string = ''
for line in file:
    if line.startswith(">"):
        pass
    else:
        string = string + line.strip()
#print (list(string))
w = input("Please enter window size:")
test = [string[i:i+w] for i in range (0,len(string),w)]
seq = input("Please enter the number of sequences you wish to read:")
#print (test[0:seq])

It generates a list which looks like this-

['TAAAACACCC', 'TCAATTCAAG', 'GGTTTTTGAG', 'CGAGCTTTTT', 'ACTCAAAGAA', 'TCCAAGATAG', 'CGTTTAAAAA', 'TTTAGGGGTG', 'TTAGGCTCAG', 'CATAGAGTTT']

Now the next step is to read the occurance of the letters GC (or can be CG) in each element of the list. Is there a way to loop through the list in such a way that the output file looks like:

Segment 1- The %GC is <the calculated number>
Segment 2- The %GC is <the calculated number>
Segment 3- The %GC is <the calculated number>

Since the file is wayy to large and the number of segments (each individual element of the list like 'TAAGATATA') i will be getting will be huge i do not know how to get the number (1,2,3...) of the segment in the output file. Also since I am new to python (and programming) I not very good at using functions very well.

begin.py
  • 161
  • 1
  • 1
  • 9
  • Show us your code that you have written so far, brother –  Jan 22 '13 at 17:26
  • I do not understand the question -- can you give a more explicite example, what are Segments in this context? – tzelleke Jan 22 '13 at 17:28
  • @TheodrosZelleke- Its a biological program. will take a lot of time to explain and is unnecessary. All I want is to loop through the file so that i can get the segment (every element of list = segemnt) number and its corresponding GC% (which I can take care off) – begin.py Jan 22 '13 at 17:42

2 Answers2

1

I'm not sure what you're asking.

inp = ['TAAAACACCC', 'TCAATTCAAG', 'GGTTTTTGAG', 'CGAGCTTTTT', 'ACTCAAAGAA', 'TCCAAGATAG', 'CGTTTAAAAA', 'TTTAGGGGTG', 'TTAGGCTCAG', 'CATAGAGTTT']

for i, segment in enumerate(inp):
    print "Segment {} - The %GC is {}".format(i, segment.count("GC"))

gives

Segment 0 - The %GC is 0
Segment 1 - The %GC is 0
Segment 2 - The %GC is 0
Segment 3 - The %GC is 1
Segment 4 - The %GC is 0
Segment 5 - The %GC is 0
Segment 6 - The %GC is 0
Segment 7 - The %GC is 0
Segment 8 - The %GC is 1
Segment 9 - The %GC is 0
Katriel
  • 120,462
  • 19
  • 136
  • 170
0

You could try the map function in python. http://docs.python.org/3.1/library/functions.html#map provides the general use of it, but here is an example using Python3.

def func1(myObject):
    '''Trivial example function'''
    return myObject * 2
myList = [1,2,3]
myMap = map(func1,myList)
print(list(myMap))

Map executes a method over each item in an iterable (e.g. list, string, etc) and places the result of each method execution into a map, which can then be printed out as a list or iterated over like a list.

If you wanted your myObject to be a list itself, that shouldn't be a problem, just so long as you use it like that accordingly.

Does this answer your question?

jheld
  • 148
  • 2
  • 8