I'm trying to write Python code that finds the string between two specific strings. When I run the code on a single string, I get the desired output. However, I need to match the pattern across an array of sequences, and that keeps raising an error.
I define a function that looks for a pattern between two user-specified sequences:
import re

def find_between(prefix, suffix, text):
    pattern = r"{}\s*(.*)\s*{}".format(re.escape(prefix), re.escape(suffix))
    result = re.search(pattern, text, re.DOTALL)
    if result:
        return result.group(1)
    else:
        return None
When I try it on a single string, it works:
text = "AGGTCCTGTAAACCT"
prefix = "TCCT"
suffix = "ACCT"
find_between(prefix, suffix, text)
Output: 'GTAA'
But when I read the FASTQ file and run the same search on its sequences, it does not work:
seqs = readFastq('FN1.fastq')
text = seqs
prefix = "TCCT"
suffix = "ACCT"
find_between(prefix, suffix, text)
It throws this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-26-9c35672e7561> in <module>()
2 prefix = "TCCT"
3 suffix = "ACCT"
----> 4 find_between(prefix, suffix, text)
<ipython-input-19-5f42599c717f> in find_between(prefix, suffix, text)
3 def find_between(prefix, suffix, text):
4 pattern = r"{}\s*(.*)\s*{}".format(re.escape(prefix), re.escape(suffix))
----> 5 result = re.search(pattern, text, re.DOTALL)
6 if result:
7 return result.group(1)
/Users/shravantikrishna/anaconda/lib/python3.6/re.py in search(pattern, string, flags)
180 """Scan through string looking for a match to the pattern, returning
181 a match object, or None if no match was found."""
--> 182 return _compile(pattern, flags).search(string)
183
184 def sub(pattern, repl, string, count=0, flags=0):
TypeError: expected string or bytes-like object
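My guess from the message is that re.search only accepts a single string (or bytes), so passing the whole collection at once fails. Below is a minimal sketch of that idea, not a confirmed fix. It assumes readFastq returns a plain list of sequence strings, which I have not verified; the list seqs here is a made-up stand-in rather than data from FN1.fastq.

```python
import re

def find_between(prefix, suffix, text):
    # Same function as above, repeated so this snippet runs on its own.
    pattern = r"{}\s*(.*)\s*{}".format(re.escape(prefix), re.escape(suffix))
    result = re.search(pattern, text, re.DOTALL)
    return result.group(1) if result else None

# Hypothetical stand-in for the output of readFastq('FN1.fastq').
# Assumption: the real function returns a list of sequence strings.
seqs = ["AGGTCCTGTAAACCT", "TTTTTTTT", "TCCTCCACCT"]

# Passing the whole list reproduces the TypeError from the traceback.
try:
    find_between("TCCT", "ACCT", seqs)
    error_message = None
except TypeError as err:
    error_message = str(err)

# Calling the function once per sequence avoids the error.
# Sequences without both prefix and suffix yield None.
matches = [find_between("TCCT", "ACCT", s) for s in seqs]
print(matches)
```

If readFastq actually returns something else, for example a tuple of sequences and quality strings, the loop would need to run over just the sequence part.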