Read a table according to a certain number of characters

Question

I want to extract the names of comets from a table stored in a text file. However, some names are one word long, others two, and some three. My table looks like this:

9P/Tempel 1                      1.525  0.514  10.5   5.3   2.969
27P/Crommelin                    0.748  0.919  29.0  27.9   1.484
126P/IRAS                        1.713  0.697  45.8  13.4   1.963
177P/Barnard                     1.107  0.954  31.2 119.6   1.317
P/2008 A3 (SOHO)                 0.049  0.984  22.4   5.4   1.948
P/2008 Y11 (SOHO)                0.046  0.985  24.4   5.3   1.949
C/1991 L3 Levy                   0.983  0.929  19.2  51.3   1.516

However, I know that the comet name runs from character 5 to character 37. How can I write code that tells Python the first column spans characters 5 to 37?

In which format does this table exist? Is it a database or CSV? – sundar nataraj, Jul 18 '14 at 09:17
It is a .txt file. A simple text file. – aloha, Jul 18 '14 at 09:22
Is `P/,C/` etc. part of the name? – Padraic Cunningham, Jul 18 '14 at 09:28

DavidK · Accepted Answer · 2014-07-18T09:47:28.437

data = """9P/Tempel 1                      1.525  0.514  10.5   5.3   2.969
27P/Crommelin                    0.748  0.919  29.0  27.9   1.484
126P/IRAS                        1.713  0.697  45.8  13.4   1.963
177P/Barnard                     1.107  0.954  31.2 119.6   1.317
P/2008 A3 (SOHO)                 0.049  0.984  22.4   5.4   1.948
P/2008 Y11 (SOHO)                0.046  0.985  24.4   5.3   1.949
C/1991 L3 Levy                   0.983  0.929  19.2  51.3   1.516""".split('\n')

To read the whole file into a list of lines, you can use

with open('data.txt', 'r') as f:
    data = f.readlines()

Your table is laid out in fixed-width columns, which you can use. If you're only interested in the first column, measure its width:

len("9P/Tempel 1                      ")

This gives 33, so you can extract the first column by taking the first 33 characters of each line and stripping the padding:

for line in data:
    # slice off the fixed-width name column, then drop the trailing spaces
    print(line[:33].strip())

Here is what's printed:

9P/Tempel 1
27P/Crommelin
126P/IRAS
177P/Barnard
P/2008 A3 (SOHO)
P/2008 Y11 (SOHO)
C/1991 L3 Levy
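The question gives the column as characters 5 to 37, counted from 1, while the sample above starts at the left margin. A minimal sketch of the same slicing with 1-based, inclusive positions follows. It assumes the real file has four leading spaces before each name; `extract_field` and the sample line are hypothetical names made up for illustration.

```python
def extract_field(line, first, last):
    """Return characters first..last (1-based, inclusive) with padding stripped."""
    # Python slices are 0-based and exclude the end index,
    # so characters 5..37 correspond to line[4:37].
    return line[first - 1:last].strip()


# Hypothetical line: four leading spaces, then a 33-character name column.
line = " " * 4 + "9P/Tempel 1".ljust(33) + "1.525  0.514  10.5   5.3   2.969"

name = extract_field(line, 5, 37)
print(name)
```

If the names in your file really start in the first column, `extract_field(line, 1, 33)` is equivalent to `line[:33].strip()`.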

If what you want instead is:

Tempel 1
Crommelin
IRAS
...

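then you can use a regular expression to drop the prefix. Example:

import re

# skip everything up to the first '/', then any digits and spaces; capture the rest
reg = r'.*?/[\d\s]*(.*)'
print(re.match(reg, '27P/Crommelin').group(1))
print(re.match(reg, 'C/1991 L3 Levy').group(1))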

Here's the output:

Crommelin
L3 Levy
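As a rough sketch, the same pattern can be applied to every name from the first column. The helper `short_name` is a hypothetical name, and returning the input unchanged when the pattern does not match is an assumption about the behavior you want.

```python
import re

# Same pattern as above: lazily skip to the first '/', skip digits and
# whitespace after it, and capture whatever remains.
PATTERN = re.compile(r'.*?/[\d\s]*(.*)')


def short_name(full_name):
    """Strip the designation prefix; return the input unchanged if there is no '/'."""
    match = PATTERN.match(full_name)
    return match.group(1) if match else full_name


full_names = [
    "9P/Tempel 1",
    "27P/Crommelin",
    "126P/IRAS",
    "177P/Barnard",
    "P/2008 A3 (SOHO)",
    "P/2008 Y11 (SOHO)",
    "C/1991 L3 Levy",
]

short_names = [short_name(n) for n in full_names]
for n in short_names:
    print(n)
```

Note that for names such as `P/2008 A3 (SOHO)` this keeps `A3 (SOHO)`, since only the leading digits and spaces after the slash are skipped.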

You can also take a look at read_fwf in the pandas library. It parses a fixed-width file given the number of characters in each column.
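A minimal sketch of the read_fwf approach, assuming pandas is installed. The in-memory sample built with `io.StringIO`, the `rows` list, and the column label `name` are illustrative choices, not part of the original file. `colspecs` takes 0-based, half-open `(start, end)` intervals, so `(0, 33)` covers the first 33 characters.

```python
import io

import pandas as pd

# Rebuild a few sample rows with the name padded to 33 characters.
rows = [
    ("9P/Tempel 1", "1.525  0.514  10.5   5.3   2.969"),
    ("27P/Crommelin", "0.748  0.919  29.0  27.9   1.484"),
    ("C/1991 L3 Levy", "0.983  0.929  19.2  51.3   1.516"),
]
sample = "".join(name.ljust(33) + rest + "\n" for name, rest in rows)

# Read only the first fixed-width column; the file has no header row.
df = pd.read_fwf(
    io.StringIO(sample),
    colspecs=[(0, 33)],
    header=None,
    names=["name"],
)

names = df["name"].tolist()
print(names)
```

For a file on disk you would pass its path instead of the `StringIO` object; if the names sit at characters 5 to 37, the matching interval would be `(4, 37)`.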
