-2

I have a file like:

<space>
<space>
line1
<space>
column 1    column 2    column 3   ...

.
.
.


<space>
<space>

How do I remove these extra blank lines and spaces?

I need to extract the heading, which is on line1. I also need to extract column 1, column 2, column 3, etc.

At the end of the last column's content there is a '\n'. How do I get rid of it?

Any help would be appreciated.

Thank you

user46646

3 Answers

4

Start by opening the file and reading all the lines:

with open('filename string') as f:
    lines = f.readlines()

Then...

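# remove empty lines and strip surrounding whitespace (including the trailing '\n')
lines = [l.strip() for l in lines if l.strip()]
header = lines[0]
# split() with no argument splits on any run of whitespace
line = lines[1].split()
column1 = line[0]
column2 = line[1]
...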

Also:

total_lines = len(lines)
total_columns = len(line)
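
The steps above can be collected into one function. This is only a sketch: the name `read_table` and the sample text are made up for illustration, and it assumes the first non-blank line is the heading and every later non-blank line is a row of whitespace-separated columns.

```python
import io

def read_table(lines):
    """Return (header, rows) from an iterable of text lines."""
    # Strip each line and drop the ones that are blank.
    cleaned = [l.strip() for l in lines if l.strip()]
    header = cleaned[0]
    # split() with no argument splits on any run of whitespace.
    rows = [l.split() for l in cleaned[1:]]
    return header, rows

# Hypothetical sample standing in for the real file.
sample = io.StringIO(" \n \nline1\n \na    b    c\n\n \n")
header, rows = read_table(sample)
print(header)
print(rows)
```

Because the function only iterates over its argument, an open file object can be passed in place of the `StringIO` sample. An input with no non-blank lines would raise `IndexError` at `cleaned[0]`.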
Joao da Silva
  • You can use the strip() method of strings. The l.strip() expression should have removed it for you, though. – Joao da Silva Jan 30 '09 at 13:56
  • To be more precise - the l.strip() removes both trailing and leading spaces. If (for some reason) you want to preserve spaces that are in front of the first column - use l.rstrip() instead. – Abgan Jan 30 '09 at 21:47
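
A small illustration of the difference the comments describe, using a made-up sample line. `rstrip("\n")` is shown as a third option for the case where only the newline should go.

```python
raw = "  column1    column2 \n"  # hypothetical line with leading and trailing spaces

both = raw.strip()           # removes leading and trailing whitespace
right = raw.rstrip()         # removes trailing whitespace only
newline = raw.rstrip("\n")   # removes only trailing newline characters

print(repr(both))
print(repr(right))
print(repr(newline))
```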
1

A straightforward solution, using strip() to drop spaces and split() to separate column data:

>>> mylines
[' \n', ' \n', 'line1\n', ' \n', ' \n', 'column1    column2    column3 \n']
>>> def parser(lines):
...     header=""
...     data=[]
...     for x in lines:
...         line = x.strip()
...         if line == "":
...             continue
...         if header == "":
...             header=line
...         else:
...             data.append(line.split())
...     return {"header":header,"data":data}
... 
>>> parser(mylines)
{'header': 'line1', 'data': [['column1', 'column2', 'column3']]}
>>> 
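
As a side note on why `split()` is called without an argument here: with an explicit `' '` separator, every single space counts as a delimiter, so runs of spaces produce empty strings. A quick sketch on a made-up row:

```python
row = "column1    column2    column3 "  # hypothetical row with runs of spaces

by_whitespace = row.split()      # splits on any run of whitespace
by_single_space = row.split(" ") # splits on every individual space

print(by_whitespace)
print(by_single_space)
```

The second list contains empty strings between the column names and one at the end, which is usually not what is wanted for this kind of file.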
gimel
0

Use generator functions to handle each stage of the parsing:

def nonEmptyLines( aFile ):
    """Discard blank (whitespace-only) lines, yield only non-empty lines."""
    for line in aFile:
        if line.strip():
            yield line

def splitFields( aFile ):
    """Split a non-empty line into fields."""
    for line in nonEmptyLines(aFile):
        yield line.split()

def dictReader( aFile ):
    """Turn non-empty lines file with header and data into dictionaries.
    Like the ``csv`` module."""
    rows= iter( splitFields( aFile ) )
    heading= next( rows )
    for line in rows:
        yield dict( zip( heading, line ) )

with open( "myFile", "r" ) as aFile:
    for d in dictReader( aFile ):
        print( d )
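
Since the docstring compares this to the ``csv`` module, here is a rough equivalent that uses the standard library's `csv.DictReader` directly. It is a sketch under assumptions: the sample text is invented, the columns are taken to be separated by spaces only, and blank lines are filtered out by a generator expression before the reader sees them.

```python
import csv
import io

# Hypothetical sample standing in for the real file.
sample = io.StringIO(" \nc1    c2    c3\n \n1    2    3\n\n")

# Strip each line and skip the blank ones before handing them to csv.
non_blank = (line.strip() for line in sample if line.strip())

# skipinitialspace=True makes the reader ignore the extra spaces after a delimiter.
reader = csv.DictReader(non_blank, delimiter=" ", skipinitialspace=True)
records = [dict(row) for row in reader]
print(records)
```

As in the generator version, the first non-blank line supplies the dictionary keys.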
S.Lott