In pandas you can replace the default integer-based index with an index made up of any number of columns using set_index()
.
What confuses me, though, is when you would want to do this. Regardless of whether the series is a column or part of the index, you can filter values in the series using boolean indexing for columns, or xs() for rows. You can sort on the columns or index using either sort_values()
or sort_index()
.
The only real difference I've encountered is that indexes have issues when there are duplicate values, so it seems that using an index is more restrictive, if anything.
Why then, would I want to convert my columns into an index in Pandas?