Pandas does not separate first two columns when reading in fixed-width file

Question

I'm trying to read a fixed-width file into Python using Pandas but the first two columns are returned as one.

Here is a sample of the file I am trying to read in:

Some header information
          Date            day           value
    01/01/2015         000001           3.14
    01/02/2015              2           1.59

and here is my code:

import pandas as pd

my_data = pd.read_fwf(my_file, skiprows=1)

but upon inspection of my_data the first two columns are not separated:

> my_data.keys()
array(['Date            day', 'value'])

I know that my columns all have a width of 15 characters. However, I have several files with different numbers of columns, and the widths option seems to expect a known number of columns (e.g. [(0, 15), (16, 30), ...]) rather than letting me specify the width without the number of columns.

Does anyone know how to get pandas to recognize the first two columns are distinct?
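
One possible way to pass a single known width without hard-coding the column count is to measure the column-header line and build the `widths` list from it. This is only a sketch: the helper `equal_widths` and the in-memory `sample` are hypothetical stand-ins, and it assumes every column really is exactly 15 characters wide and that the column names sit on the second line of the file.

```python
import io
import pandas as pd

WIDTH = 15  # assumption: every column is exactly 15 characters wide

# Hypothetical stand-in for the real file: one title line, then
# right-aligned fields of WIDTH characters each.
rows = [
    ("Date", "day", "value"),
    ("01/01/2015", "000001", "3.14"),
    ("01/02/2015", "2", "1.59"),
]
sample = "Some header information\n" + "".join(
    "".join(f"{cell:>{WIDTH}}" for cell in row) + "\n" for row in rows
)


def equal_widths(header_line, width=WIDTH):
    """Return one width entry per column, inferred from the header length."""
    length = len(header_line.rstrip("\n"))
    n_cols = -(-length // width)  # ceiling division
    return [width] * n_cols


# The column names are on the second line (index 1) of the sample.
header_line = sample.splitlines()[1]

my_data = pd.read_fwf(
    io.StringIO(sample),
    skiprows=1,                        # skip the title line
    widths=equal_widths(header_line),  # explicit widths, so nothing is inferred
)
print(list(my_data.columns))
```

With a real file, the header line could be read with `open(my_file)` and two `readline()` calls before passing `my_file` to `pd.read_fwf` in the same way.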

You may get better joy just reading it in as a plain text file: `my_data = pd.read_csv(my_file, skiprows=1, sep='\s+')` — EdChum, Sep 28 '15 at 16:37
I edited the data you posted so the code exhibits the behavior you describe. Please check to see if the change is consistent with your real data. — unutbu, Sep 28 '15 at 16:37
@unutbu thanks for solving the problem. Apparently this is easily replicated by making the value longer than the header. — Ellis Valentiner, Sep 28 '15 at 16:51
@user12202013: Ah -- that's insightful. It appears the `FixedWidthReader.detect_colspecs` method does not ignore rows specified by `skiprows=1`. A workaround would be to open the file, advance the filehandle to skip the first row, and then pass the filehandle as the first argument to `pd.read_fwf`. — unutbu, Sep 28 '15 at 16:59
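
The filehandle workaround described in the last comment might look roughly like the sketch below. The `sample` text is a hypothetical stand-in for the real file (an `io.StringIO` is used so the snippet is self-contained), and whether a given pandas version still needs the workaround is not something this thread establishes.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file; with a file on disk,
# use `with open(my_file) as handle:` instead of io.StringIO.
sample = (
    "Some header information\n"
    "           Date            day          value\n"
    "     01/01/2015         000001           3.14\n"
    "     01/02/2015              2           1.59\n"
)

with io.StringIO(sample) as handle:
    # Consume the title line ourselves so the column inference in
    # read_fwf never sees it; no skiprows argument is needed.
    handle.readline()
    my_data = pd.read_fwf(handle)

print(list(my_data.columns))
```

Because the title line is consumed before pandas touches the handle, the column boundaries are inferred only from the header and data rows.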
