How to rename the index of a Dask Dataframe

Question

How do I rename the index of a Dask DataFrame? I tried this:

df.index.name = 'foo'

but checking df.index.name afterwards shows that it still has its previous value.

score 7 · Accepted Answer · answered Jun 04 '17 at 18:34

This does not seem like an efficient way to do it, so I wouldn't be surprised if there is something more direct.

Suppose d.index.name starts off as 'foo':

def f(df, name):
    df.index.name = name
    return df

d.map_partitions(f, 'pow')

The output now has an index name of 'pow'. If this is run with the threaded scheduler, I think it also changes the index name of d in place (in which case you don't really need the output of map_partitions).

answered by mdurant

Adding: this strategy can also be applied to rename a Dask Series, just by removing the `.index` from `f` function. – paulochf Nov 24 '17 at 17:53
This seems off to me. This generates dask delayed tasks for something that should obviously be immediate. https://github.com/dask/dask/issues/4950 – stav Jun 17 '19 at 12:59
In dask-world, when to use `compute()` is up to the user. It may be best to combine with other operations. – mdurant Jun 17 '19 at 13:03

score 3 · Answer 2 · answered Jun 12 '20 at 19:04

A bit late, but the following works:

    import dask.dataframe as dd
    import pandas as pd
    # The column must be named "si" for set_index("si") to find it.
    df = pd.DataFrame().assign(si=[1, 2], o=[3, 4], p=[5, 6]).set_index("si")
    ddf = dd.from_pandas(df, npartitions=2)
    ddf.index = ddf.index.rename("si2")

I hope this can help someone else out!

answered by Luca Venturini

Just like for the OP, this didn't actually change the name of the index when I tried it. – Thrastylon Oct 28 '21 at 09:59
