
I want to extract the names of comets from a table stored in a text file. However, some comet names are one word, others are two words, and some are three words. My table looks like this:

9P/Tempel 1                      1.525  0.514  10.5   5.3   2.969
27P/Crommelin                    0.748  0.919  29.0  27.9   1.484
126P/IRAS                        1.713  0.697  45.8  13.4   1.963
177P/Barnard                     1.107  0.954  31.2 119.6   1.317
P/2008 A3 (SOHO)                 0.049  0.984  22.4   5.4   1.948
P/2008 Y11 (SOHO)                0.046  0.985  24.4   5.3   1.949
C/1991 L3 Levy                   0.983  0.929  19.2  51.3   1.516

However, I know that the comet name runs from character 5 to character 37. How can I write code that tells Python the first column spans characters 5 to 37?

Brian Tompsett - 汤莱恩
aloha

1 Answer

data = """9P/Tempel 1                      1.525  0.514  10.5   5.3   2.969
27P/Crommelin                    0.748  0.919  29.0  27.9   1.484
126P/IRAS                        1.713  0.697  45.8  13.4   1.963
177P/Barnard                     1.107  0.954  31.2 119.6   1.317
P/2008 A3 (SOHO)                 0.049  0.984  22.4   5.4   1.948
P/2008 Y11 (SOHO)                0.046  0.985  24.4   5.3   1.949
C/1991 L3 Levy                   0.983  0.929  19.2  51.3   1.516""".split('\n')

To read the lines from a file instead, you can use:

with open('data.txt', 'r') as f:
    data = f.readlines()

Your table has fixed-width columns, so you can slice each line by character position. If you're only interested in the first column, measure its width:

len("9P/Tempel 1                      ")  

This gives 33, so the first column is the first 33 characters of each line. To extract it:

for line in data:
    print(line[:33].strip())

Here is what's printed:

9P/Tempel 1
27P/Crommelin
126P/IRAS
177P/Barnard
P/2008 A3 (SOHO)
P/2008 Y11 (SOHO)
C/1991 L3 Levy
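The question mentions characters 5 to 37, while the pasted table starts at character 1. If the real file has four leading characters before the name, the slice just shifts. Here is a minimal sketch, assuming the positions are 1-based and inclusive; `extract_field` and the `sample` line are hypothetical names made up for illustration:

```python
def extract_field(line, first_char, last_char):
    """Return the text between two 1-based, inclusive character positions."""
    # Python slices are 0-based and exclude the end index,
    # so characters 5..37 correspond to line[4:37].
    return line[first_char - 1:last_char].strip()


# Hypothetical line laid out as the question describes:
# four leading characters, then a 33-character name field.
sample = "    " + "9P/Tempel 1".ljust(33) + "1.525  0.514"

print(extract_field(sample, 5, 37))
```

With the table exactly as pasted (no leading characters), `extract_field(line, 1, 33)` is equivalent to `line[:33].strip()`.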

If what you want is only the name without the designation prefix:

Tempel 1
Crommelin
IRAS
...

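You can use a regular expression for that. Example:

import re

# Skip everything up to the first "/", then any digits and spaces, and capture the rest
reg = r'.*?/[\d\s]*(.*)'
print(re.match(reg, '27P/Crommelin').group(1))
print(re.match(reg, 'C/1991 L3 Levy').group(1))

Here's the output: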

Crommelin
L3 Levy
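To see how the pattern above behaves on every row of the table, here is a small sketch that runs it over all seven designations. The helper `short_name` is a hypothetical name, and falling back to the full designation when nothing matches is my own assumption:

```python
import re

# Same pattern as above, compiled once: skip up to the first "/",
# then any digits and whitespace, and capture whatever remains.
pattern = re.compile(r'.*?/[\d\s]*(.*)')

designations = [
    "9P/Tempel 1",
    "27P/Crommelin",
    "126P/IRAS",
    "177P/Barnard",
    "P/2008 A3 (SOHO)",
    "P/2008 Y11 (SOHO)",
    "C/1991 L3 Levy",
]


def short_name(designation):
    """Return the captured name, or the input unchanged if there is no match."""
    match = pattern.match(designation)
    return match.group(1) if match else designation


for designation in designations:
    print(short_name(designation))
```

Note that the provisional designations keep their letter-number code, so the SOHO comets come out as "A3 (SOHO)" and "Y11 (SOHO)". If you want only "SOHO" or "Levy", the pattern needs to be adjusted.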

You can also look at read_fwf in the pandas library. It parses fixed-width files and lets you specify the character range or width of each column.
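A rough sketch of the read_fwf approach, assuming pandas is installed. The sample rows are rebuilt here with `ljust` so the widths are exact, and the column labels `name` and `first_value` are placeholders I made up, since the table has no header:

```python
import io

import pandas as pd

# Rebuild a few rows with a 33-character name field followed by a 5-character number.
rows = [
    ("9P/Tempel 1", "1.525"),
    ("27P/Crommelin", "0.748"),
    ("C/1991 L3 Levy", "0.983"),
]
text = "\n".join(name.ljust(33) + value for name, value in rows)

# colspecs holds half-open (start, end) character ranges, 0-based.
table = pd.read_fwf(
    io.StringIO(text),
    colspecs=[(0, 33), (33, 38)],
    header=None,
    names=["name", "first_value"],  # placeholder labels
)

print(table["name"].tolist())
```

For a real file, pass the file path instead of the `io.StringIO` object and extend `colspecs` to cover the remaining columns.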

DavidK