File and space in Python

Question

I have a file like:

<space>
<space>
line1
<space>
column 1    column 2    column 3   ...

.
.
.


<space>
<space>

How can I remove these extra spaces?

I need to extract the heading, which is on line1. I also need to extract column 1, column 2, column 3, etc.

The content of the last column ends with '\n'. How do I get rid of it?

Help me with this...

Thank you

Is there any delimiter between the columns? – Evan Fosmark Jan 30 '09 at 09:14

score 4 · Accepted Answer · answered Jan 30 '09 at 09:18

Start by opening the file and reading all the lines:

with open('filename string') as f:
    lines = f.readlines()

Then...

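# drop blank lines and strip surrounding whitespace (including the trailing '\n')
lines = [l.strip() for l in lines if l.strip()]
header = lines[0]
# split() with no argument splits on any run of whitespace
line = lines[1].split()
column1 = line[0]
column2 = line[1]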
...

Also:

total_lines = len(lines)
total_columns = len(line)
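
With several spaces between columns, split(' ') and split() give different results, which changes both the column indices and the column count. A minimal sketch, assuming a row shaped like the one in the question (the sample string is made up for illustration):

```python
# Hypothetical row: columns separated by runs of spaces, newline at the end.
row = "column1    column2    column3 \n"

by_single_space = row.split(' ')   # splits at every single space
by_whitespace = row.split()        # splits on runs of whitespace, ignores both ends

print(by_single_space)  # contains empty strings and a final '\n' element
print(by_whitespace)    # ['column1', 'column2', 'column3']
```

The no-argument form is usually the safer choice when the delimiter is "some amount of whitespace"; if the columns themselves can contain spaces, a different delimiter would be needed.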

Joao da Silva

You can use the strip() method of strings. The l.strip() expression should have removed it for you, though. – Joao da Silva Jan 30 '09 at 13:56
To be more precise - the l.strip() removes both trailing and leading spaces. If (for some reason) you want to preserve spaces that are in front of the first column - use l.rstrip() instead. – Abgan Jan 30 '09 at 21:47
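
The difference between the stripping methods mentioned in these comments can be seen in a short sketch. The sample line is hypothetical; only the built-in str methods are used:

```python
# Hypothetical line with leading spaces, a trailing space and a newline.
line = "  column1    column2 \n"

both = line.strip()               # removes whitespace on both sides
right = line.rstrip()             # removes trailing whitespace only
newline_only = line.rstrip("\n")  # removes just the trailing newline

print(repr(both))          # 'column1    column2'
print(repr(right))         # '  column1    column2'
print(repr(newline_only))  # '  column1    column2 '
```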

score 1 · Answer 2 · answered Jan 30 '09 at 09:43

A straightforward solution, using strip() to drop spaces and split() to separate column data:

>>> mylines
[' \n', ' \n', 'line1\n', ' \n', ' \n', 'column1    column2    column3 \n']
>>> def parser(lines):
...     header=""
...     data=[]
...     for x in lines:
...         line = x.strip()
...         if line == "":
...             continue
...         if header == "":
...             header=line
...         else:
...             data.append(line.split())
...     return {"header":header,"data":data}
... 
>>> parser(mylines)
{'header': 'line1', 'data': [['column1', 'column2', 'column3']]}
>>>
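
If the file has a title line followed by a row of column names, as the question's layout suggests, the same strip()/split() idea can be taken one step further to pull out whole columns. This is only a sketch under that assumption: the sample data and the names title, names, records and columns are invented for illustration, and io.StringIO stands in for a real file.

```python
import io

# Hypothetical file content: blank lines, a title, column names, two data rows.
sample = io.StringIO(
    " \n \nline1\n \na    b    c \n1    2    3 \n4    5    6 \n \n"
)

# Keep only lines with content, already stripped of spaces and '\n'.
lines = [line.strip() for line in sample if line.strip()]

title = lines[0]                              # the heading line
names = lines[1].split()                      # the column names
records = [line.split() for line in lines[2:]]

# zip(*records) transposes rows into columns; it assumes every row
# has the same number of fields (shorter rows would truncate the result).
columns = dict(zip(names, zip(*records)))

print(title)         # line1
print(columns["b"])  # ('2', '5')
```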

score 0 · Answer 3 · answered Jan 30 '09 at 11:17

Using Generator functions to handle each element of parsing

def nonEmptyLines( aFile ):
    """Discard blank lines, yield only lines with content."""
    for line in aFile:
        # a "blank" line still holds '\n' (and maybe spaces), so test the stripped text
        if line.strip():
            yield line

def splitFields( aFile ):
    """Split a non-empty line into fields."""
    for line in nonEmptyLines(aFile):
        yield line.split()

def dictReader( aFile ):
    """Turn non-empty lines file with header and data into dictionaries.
    Like the ``csv`` module."""
    rows = splitFields( aFile )  # not named ``iter``, which would shadow the builtin
    heading = next( rows )
    for line in rows:
        yield dict( zip( heading, line ) )

# dictReader expects an open file object, not a file name
with open( "myFile", "r" ) as aFile:
    for d in dictReader( aFile ):
        print( d )
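
The dict( zip( heading, line ) ) step pairs each heading with the field in the same position. A tiny sketch with made-up values (the field names and data are hypothetical):

```python
# Hypothetical heading and data row, already split into fields.
heading = "name qty price".split()
row = "widget 3 4.50".split()

# zip pairs items by position; dict turns the pairs into key/value entries.
record = dict(zip(heading, row))

print(record)  # {'name': 'widget', 'qty': '3', 'price': '4.50'}
```

All values stay strings; converting them to numbers would be a separate step.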
