-2

I have a data file in which each line is a single string, with no tabs, no spaces, and no column names. The first two characters of each line are one piece of data, the third character is another, characters 4 through 7 are something else, and so on.

How can I read these strings into a DataFrame with named columns? All the answers I've seen assume that the values are separated by tabs, spaces, etc.

ayhan
TIll
  • Could you give an example of your data? I'm not sure what it means to have "columns" but also say that there are no "tabs, spaces etc." between the values. How do you know where one value stops and the next starts? – user94559 Jul 17 '16 at 19:51
  • Are you describing a "fixed-width format" file where each column is defined by a strict number of characters? If so, look at `pandas.read_fwf`. – BrenBarn Jul 17 '16 at 19:58

1 Answer

3

You can use `pd.read_fwf` with the `widths` parameter, which gives the number of characters in each field. A file with these contents:

ieafxfrjzyxfxkymiwuy
lqqmceegjnbjpxnidygr
zssawojanxbrfwkgbvnl
ahcwwhtayjwozzrgfftt

Becomes this:

import pandas as pd

pd.read_fwf('test.txt', widths=[2, 4, 3, 11], names=['first', 'second', 'third', 'fourth'])
Out[226]: 
  first second third       fourth
0    ie   afxf   rjz  yxfxkymiwuy
1    lq   qmce   egj  nbjpxnidygr
2    zs   sawo   jan  xbrfwkgbvnl
3    ah   cwwh   tay  jwozzrgfftt
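
If only some character ranges matter, as in the layout the question describes (characters 1-2, character 3, characters 4-7), `colspecs` may be a better fit than `widths`. The following is a minimal sketch under a few assumptions: the sample rows are reused from above, `io.StringIO` stands in for the real file, and the field names are hypothetical placeholders.

```python
import io

import pandas as pd

# Sample rows reused from the example above; StringIO stands in for a file on disk.
raw = io.StringIO(
    "ieafxfrjzyxfxkymiwuy\n"
    "lqqmceegjnbjpxnidygr\n"
    "zssawojanxbrfwkgbvnl\n"
    "ahcwwhtayjwozzrgfftt\n"
)

# Half-open (start, end) character offsets: characters 1-2, 3, and 4-7.
# Anything past character 7 is simply not read.
colspecs = [(0, 2), (2, 3), (3, 7)]

# Hypothetical field names; substitute whatever the fields actually mean.
names = ["field_a", "field_b", "field_c"]

# header=None: the file has no header row.
# dtype=str: keep every field as text so nothing is converted to a number.
df = pd.read_fwf(raw, colspecs=colspecs, names=names, header=None, dtype=str)
print(df)
```

Passing `dtype=str` is a precaution: if the real fields contain digits with leading zeros, type inference would otherwise turn them into numbers and drop the zeros.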
ayhan