I have problems running a bash command from inside a Python script, script.py:

import os
bashCommand = """
sed "s/) \['/1, color=\"#ffcccc\", label=\"/g" list.txt | sed 's/\[/    GraphicFeature(start=/g' | sed 's/\:/, end=/g' | sed 's/>//g' | sed 's/\](/, strand=/g' | sed "s/'\]/\"),/g" >list2.txt"""

os.system("bash %s" % bashCommand)

When I run this as python script.py, no list2.txt is written, but on the terminal I see that I am inside bash-4.4 instead of the native macOS bash.

Any ideas what could cause this?

The script I posted above is part of a bigger script, where first it reads in some file and outputs list.txt.

Edit: here is some more description. In a first Python script, I parsed a file (a GenBank file, to be specific) and wrote a list of items (location, strand, name) to list.txt. This list.txt has to be transformed so that a second Python script can parse it, hence the sed.

list.txt

[0:2463](+) ['bifunctional aspartokinase/homoserine dehydrogenase I']
[2464:3397](+) ['Homoserine kinase']
[3397:4684](+) ['Threonine synthase']

All the brackets, colons, and single quotes have to be replaced so that the result looks like the desired output, list2.txt:

    GraphicFeature(start=0, end=2463, strand=+1, color="#ffcccc", label="bifunctional aspartokinase/homoserine dehydrogenase I"),
    GraphicFeature(start=2464, end=3397, strand=+1, color="#ffcccc", label="Homoserine kinase"),
    GraphicFeature(start=3397, end=4684, strand=+1, color="#ffcccc", label="Threonine synthase"),
rororo
  • This looks much more complicated than it needs to be. What are you actually trying to do? At the very least, you should be able to reduce this to a single call to `sed`, using multiple `-e` arguments, and handling the output redirection with `subprocess.call`. – chepner May 28 '17 at 19:45
  • `with open("list2.txt", "w") as fh: subprocess.call(["sed", ...], stdout=fh)` – chepner May 28 '17 at 19:52
  • @chepner I know I should be able to reduce that or even implement in python, but I am not unfortunately... – rororo May 28 '17 at 19:59
  • Hence, me asking for a description of what you are trying to accomplish, so that I can suggest a pure Python solution. What's the input and expected output? – chepner May 28 '17 at 20:02
  • @chepner post edited – rororo May 28 '17 at 20:09

1 Answer

Read the file in Python, parse each line with a single regular expression, and output an appropriate line constructed from the captured pieces.

import re
import sys

#                         1     2                3
#                        ---   ---              --
regex = re.compile(r"^\[(\d+):(\d+)\]\(\+\) \['(.*)'\]$")
# 1 - start value
# 2 - end value
# 3 - text value
with open("list2.txt", "w") as out:
    for line in sys.stdin:
        line = line.strip()
        m = regex.match(line)
        if m is None:
            print(line, file=out)
        else:
            print('    GraphicFeature(start={}, end={}, strand=+1, color="#ffcccc", label="{}"),'.format(*m.groups()), file=out)

Lines that don't match the regular expression are written out unmodified; you may want to ignore them altogether or report an error instead. The script reads from standard input, so run it as `python script.py < list.txt`.
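If you would rather report an error than pass unmatched lines through, something like the following might work. It is only a sketch: the names `LINE_RE`, `convert_line` and `convert_file` are made up for illustration, the `(-)` strand is an assumption (the sample list.txt only shows `(+)`), and the four-space indent is there to match the desired list2.txt from the question.

```python
import re

# Assumption: a "(-)" strand may also occur; the sample input only shows "(+)".
#                           1     2        3          4
LINE_RE = re.compile(r"^\[(\d+):(\d+)\]\(([+-])\) \['(.*)'\]$")


def convert_line(line, color="#ffcccc"):
    """Return the GraphicFeature line for one entry, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    start, end, strand, label = m.groups()
    # strand is "+" or "-", so "{}1" yields "+1" or "-1"
    return '    GraphicFeature(start={}, end={}, strand={}1, color="{}", label="{}"),'.format(
        start, end, strand, color, label)


def convert_file(src="list.txt", dst="list2.txt"):
    """Convert src to dst, skipping blank lines and failing on unrecognized ones."""
    with open(src) as inp, open(dst, "w") as out:
        for lineno, line in enumerate(inp, 1):
            if not line.strip():
                continue  # ignore empty lines
            converted = convert_line(line)
            if converted is None:
                raise ValueError("line {} not recognized: {!r}".format(lineno, line))
            out.write(converted + "\n")
```

Calling `convert_file()` from the larger script, right after list.txt has been written, would avoid the shell entirely. Because it uses `out.write` rather than `print(..., file=out)`, it should also run unchanged on Python 2.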

chepner
  • `print(line, file=out) SyntaxError: invalid syntax` – rororo May 28 '17 at 20:29
  • I assumed Python 3; add `from __future__ import print_function` to the top of your file to make it work in Python 2.x (assuming you aren't using a truly ancient version of Python). – chepner May 28 '17 at 20:35