How can I change every element in a DataFrame with hierarchical indexing? For example, maybe I want to convert strings into floats:

from pandas import DataFrame
f = DataFrame({'a': ['1,000','2,000','3,000'], 'b': ['2,000','3,000','4,000']})
f.columns = [['level1', 'level1'],['item1', 'item2']]
f
Out[152]:
        level1
     item1   item2
0    1,000   2,000
1    2,000   3,000
2    3,000   4,000

I tried this:

def clean(group):
    group = group.map(lambda x: x.replace(',', ''))
    return group
f.apply(clean)
Out[153]:
(level1, item1) (level1, item2)
0    1000    2000
1    2000    3000
2    3000    4000

As you can see, the two column levels are flattened into tuples such as (level1, item1). How can I avoid this? Or maybe there is a better way.

Thanks

r_31415

1 Answer

Pass the axis option to the apply function:

In [265]: f.apply(clean, axis=1)
Out[265]:
  level1
   item1 item2
0   1000  2000
1   2000  3000
2   3000  4000
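
On recent pandas versions, a column-wise apply appears to keep the hierarchical columns as well, so the conversion to floats that the question asks for can be sketched as below. This is only a sketch, assuming a pandas release where `Series.str.replace` accepts the `regex` keyword; the name `cleaned` is made up for illustration.

```python
import pandas as pd

f = pd.DataFrame({'a': ['1,000', '2,000', '3,000'],
                  'b': ['2,000', '3,000', '4,000']})
f.columns = pd.MultiIndex.from_arrays([['level1', 'level1'],
                                       ['item1', 'item2']])

# Strip the thousands separator one column at a time (literal match,
# not a regex), then cast the whole frame to float.
cleaned = f.apply(lambda col: col.str.replace(',', '', regex=False)).astype(float)

print(cleaned)
```

Working on whole columns with the `.str` accessor avoids calling a Python lambda once per cell, and `astype(float)` does the actual string-to-float conversion that `clean` leaves out.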

When both axes have hierarchical indices, here's a workaround:

In [316]: f.index = [[1,2,3],[1,2,3]]

In [317]: f
Out[317]:
    level1
     item1  item2
1 1  1,000  2,000
2 2  2,000  3,000
3 3  3,000  4,000

In [314]: f.apply(clean, axis=1).reindex(f.index)
Out[314]:
    level1
     item1 item2
1 1   1000  2000
2 2   2000  3000
3 3   3000  4000
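
An alternative sketch for the case where both axes are hierarchical is to skip `apply` and let `DataFrame.replace` run a regex substitution on every cell, which leaves both indexes untouched. This assumes a reasonably recent pandas; `g` and `result` are illustrative names, not part of the session above.

```python
import pandas as pd

# Same data as above, with a two-level index on rows and on columns.
g = pd.DataFrame(
    [['1,000', '2,000'], ['2,000', '3,000'], ['3,000', '4,000']],
    index=pd.MultiIndex.from_arrays([[1, 2, 3], [1, 2, 3]]),
    columns=pd.MultiIndex.from_arrays([['level1', 'level1'],
                                       ['item1', 'item2']]),
)

# regex=True makes replace() substitute inside each string
# instead of matching whole cell values only.
result = g.replace(',', '', regex=True).astype(float)

print(result)
```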
Zelazny7
  • Great. Does apply pass the columns including item1, item2 when I use axis=0? What happens when there is a hierarchical indexing in the columns and the rows? – r_31415 Dec 24 '12 at 20:55
  • Uhm... I already tried this and it 'flattens' the levels. Do you know a fix for this? – r_31415 Dec 24 '12 at 20:57
  • @RobertSmith What version of pandas are you using? It doesn't flatten for me on `0.10` (although `applymap` does). – Andy Hayden Dec 24 '12 at 21:29
  • Oh, right. I have 0.9 installed. So there is no flattening in 0.10 in any axis? – r_31415 Dec 24 '12 at 21:31
  • No, but I meant when there are several levels in columns and rows like this: f.index = [['level1', 'level1', 'level1'],['item1', 'item2', 'item2']] – r_31415 Dec 24 '12 at 21:41
  • I've edited my answer with a workaround for cases when both axes are hierarchical – Zelazny7 Dec 25 '12 at 00:08
  • Excellent. It's a bit disappointing that a work-around is needed but definitely useful. Thanks. – r_31415 Dec 25 '12 at 01:54