Consider a file like this:

 Men      super men        size       Energy (J)    type    num      g
 ----------------------------------------------------------------------
  50          1             1          1.0234E+03    A      abcd   12.1
  20          7             4          5.0211E+02    A2 C   agcd   14.1
  10          2             3         -1.0347E+02    B2     abkd   72.1

As you can see, the file is fixed-width. The "type" column can contain a space: in the second row, the whole "A2 C" is a single value.

I am currently splitting each line into strings at hard-coded fixed positions inside a for loop and building a dictionary. Is there a Pythonic way to read this data into pandas while preserving the column names and types?
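For reference, a minimal sketch of what this could look like with `pandas.read_fwf`. The sample is inlined through `io.StringIO` so the snippet runs on its own, and the `COLSPECS` boundaries are an assumption read off the sample above, not something taken from a real file. They are passed explicitly because the default column inference treats any position that is blank in every row as a separator, which may split "super men" or "A2 C" into two columns.

```python
import io

import pandas as pd

# The sample from the question, inlined so the sketch is self-contained.
SAMPLE = "\n".join([
    " Men      super men        size       Energy (J)    type    num      g",
    " ----------------------------------------------------------------------",
    "  50          1             1          1.0234E+03    A      abcd   12.1",
    "  20          7             4          5.0211E+02    A2 C   agcd   14.1",
    "  10          2             3         -1.0347E+02    B2     abkd   72.1",
])

# Assumed half-open [start, end) character ranges, one per column.
# Each boundary sits inside the blank gap between two neighbouring columns.
COLSPECS = [(0, 6), (6, 22), (22, 34), (34, 50), (50, 58), (58, 65), (65, 75)]

df = pd.read_fwf(
    io.StringIO(SAMPLE),
    colspecs=COLSPECS,
    header=0,       # the first line supplies the column names
    skiprows=[1],   # drop the dashed rule under the header
)

print(df.dtypes)
print(df.loc[1, "type"])  # "A2 C" stays in one cell
```

For a real file, the `io.StringIO(SAMPLE)` argument would be replaced by the file path, and the ranges adjusted to the actual layout.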

wander95
  • Is the data tab-separated? – alkasm Feb 27 '19 at 21:37
  • Space-separated, or rather separated by runs of spaces – wander95 Feb 27 '19 at 21:37
  • If you get rid of that dashed line you should be able to read it in as a fixed width file – gold_cy Feb 27 '19 at 21:38
  • Is it possible that there's only one space between data in two columns? For example, could the second row in that table have `A2 C agcd`, or are there always at least two spaces separating the data in one column from the next? If so, you could split on two spaces instead of one, and then strip. Edit: Okay, fixed width file seems much better, didn't know that existed! – alkasm Feb 27 '19 at 21:39
  • [`pandas.read_fwf`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_fwf.html) – pault Feb 27 '19 at 21:39
  • https://gist.github.com/miraculixx/26aeb6d614c8adde95aff3719d5c4119 – miraculixx Feb 27 '19 at 22:19
  • @miraculixx, I had something similar with a few `if` statements. Was looking for something that sounded like a pythonic one-liner – wander95 Feb 27 '19 at 22:26
  • @wander95 well it is, all you need is `read_text(text)` - it doesn't get more oneliner than this, just think of `read_text` as a library function (put it into your `utils.py`). – miraculixx Feb 27 '19 at 22:32
  • @wander95 actually, I found a solution in 2 lines, https://gist.github.com/miraculixx/26aeb6d614c8adde95aff3719d5c4119#gistcomment-2849586 – miraculixx Feb 27 '19 at 22:41
  • @miraculixx Thanks! – wander95 Feb 27 '19 at 22:47
  • @wander95 however, note I wouldn't necessarily call the 2-liner Pythonic, to see why just do `python -c "import this"` from your command line ;-) Granted, it's probably as terse as it gets in most languages. – miraculixx Feb 27 '19 at 22:52
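The "split on two spaces, then strip" idea from the comments can be sketched as follows. This is not the code from the linked gist. It assumes every pair of neighbouring columns is separated by at least two spaces and that no cell is empty, which holds for the sample but may not hold for other files. The helper names `split_fields` and `to_number_if_possible` are made up for illustration.

```python
import re

import pandas as pd

# The sample lines from the question.
LINES = [
    " Men      super men        size       Energy (J)    type    num      g",
    " ----------------------------------------------------------------------",
    "  50          1             1          1.0234E+03    A      abcd   12.1",
    "  20          7             4          5.0211E+02    A2 C   agcd   14.1",
    "  10          2             3         -1.0347E+02    B2     abkd   72.1",
]


def split_fields(line):
    """Split on runs of two or more whitespace characters, so "A2 C" survives."""
    return re.split(r"\s{2,}", line.strip())


def to_number_if_possible(column):
    """Convert a column to a numeric dtype, leaving it unchanged if that fails."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        return column


header = split_fields(LINES[0])
rows = [split_fields(line) for line in LINES[2:]]  # LINES[1] is the dashed rule

table = pd.DataFrame(rows, columns=header).apply(to_number_if_possible)
print(table.dtypes)
```

Unlike a fixed-width read, this breaks as soon as a cell is blank or two columns are only one space apart, so it is best treated as a fallback.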
