
I may have found a pandas bug, or maybe I've just been staring at this too long. I have not had issues using xs on a MultiIndex before. The code is below; I've verified that the error occurs on both Python 2 with pandas 0.16.2 and Python 3 with pandas 0.17.0:

In [1]:

import sys

if sys.version[0] == '2':
    from StringIO import StringIO
if sys.version[0] == '3':
    from io import StringIO

import pandas as pd

# Note: the first two rows share the same (m, p) index pair, (6, 407).
sstring = """\
m,p,tstep,value,jday,normed_value,datetime
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
7,407,0,1,564.5,5.75,1964-07-18 12:00:00
8,408,0,1,564.5,6.75,1964-07-18 12:00:00
"""

subset = pd.read_csv(StringIO(sstring),
                     index_col=['m', 'p'],
                     parse_dates=['datetime'])

# Raises the ValueError shown below.
subset.xs(6, level='m')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-dbc4b09ce656> in <module>()
     21 print(subset)
     22 
---> 23 subset.xs(6, level='m')

C:\Anaconda\lib\site-packages\pandas\core\generic.pyc in xs(self, key, axis, level, copy, drop_level)
   1458 
   1459             result = self.ix[indexer]
-> 1460             setattr(result, result._get_axis_name(axis), new_ax)
   1461             return result
   1462 

C:\Anaconda\lib\site-packages\pandas\core\generic.pyc in __setattr__(self, name, value)
   2159         try:
   2160             object.__getattribute__(self, name)
-> 2161             return object.__setattr__(self, name, value)
   2162         except AttributeError:
   2163             pass

pandas\src\properties.pyx in pandas.lib.AxisProperty.__set__ (pandas\lib.c:42548)()

C:\Anaconda\lib\site-packages\pandas\core\generic.pyc in _set_axis(self, axis, labels)
    411 
    412     def _set_axis(self, axis, labels):
--> 413         self._data.set_axis(axis, labels)
    414         self._clear_item_cache()
    415 

C:\Anaconda\lib\site-packages\pandas\core\internals.pyc in set_axis(self, axis, new_labels)
   2217         if new_len != old_len:
   2218             raise ValueError('Length mismatch: Expected axis has %d elements, '
-> 2219                              'new values have %d elements' % (old_len, new_len))
   2220 
   2221         self.axes[axis] = new_labels

ValueError: Length mismatch: Expected axis has 4 elements, new values have 2 elements

However, calling xs without a level works, as does using .loc:

In [16]:

print(subset.xs(6))
print(subset.loc[6])
     tstep  value   jday  normed_value            datetime
p                                                         
407      0      1  564.5          5.75 1964-07-18 12:00:00
407      0      1  564.5          5.75 1964-07-18 12:00:00
     tstep  value   jday  normed_value            datetime
p                                                         
407      0      1  564.5          5.75 1964-07-18 12:00:00
407      0      1  564.5          5.75 1964-07-18 12:00:00

Does anyone have any insight into this behavior?

Nick ODell
htln
  • Seems like a bug to me, possibly something to do with the duplicates in the index (better to report it at https://github.com/pydata/pandas/issues), and as you said already yourself, even simpler to use the working alternative `subset.loc[6]` – joris Oct 27 '15 at 22:30
  • Thanks, @joris. I'll open a bug in a couple days if no one replies. – htln Oct 28 '15 at 22:55
  • Hi, this still seems to be an issue and I couldn't find the bug report. Could you please post the link? Thanks – user2689410 Jul 15 '16 at 06:55
  • @user2689410 Ouch this was a long time ago. I couldn't find if I filed a report back then. I've filed https://github.com/pydata/pandas/issues/13719 with multiple examples. – htln Jul 20 '16 at 16:07

1 Answer


Until this issue (https://github.com/pydata/pandas/issues/13719) is closed, passing the key as a tuple and the level as a list works as a workaround:

subset.xs((6,), level=['m'])
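
If you would rather not depend on xs at all, another option is a boolean mask on the index level followed by dropping that level. This is only a minimal sketch, assuming Python 3 and the same sample data as in the question; the names `mask` and `result` are mine and are not from the original post.

```python
from io import StringIO

import pandas as pd

sstring = """\
m,p,tstep,value,jday,normed_value,datetime
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
6,407,0,1,564.5,5.75,1964-07-18 12:00:00
7,407,0,1,564.5,5.75,1964-07-18 12:00:00
8,408,0,1,564.5,6.75,1964-07-18 12:00:00
"""

subset = pd.read_csv(StringIO(sstring),
                     index_col=['m', 'p'],
                     parse_dates=['datetime'])

# Select the rows whose 'm' level equals 6 using a boolean mask.
mask = subset.index.get_level_values('m') == 6

# Drop the 'm' level so the result is indexed by 'p' only,
# which is the shape subset.loc[6] gives in the question.
result = subset[mask].reset_index(level='m', drop=True)

print(result)
```

This keeps the duplicated (6, 407) rows and should give the same rows as `subset.loc[6]`.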

htln