Map Function Returns Unexpected Result Upon Second Execution

Question

I read a CSV file as follows:

import pandas as pd

# Raw string so the backslash in the Windows path is not treated as an escape
bike_sharing = pd.read_csv(r"BIKE_SHARING_ASSIGNMENT\day.csv")
bike_sharing.yr

The yr column has 2 possible values: 0 and 1. I want to update the column and map them to 2018 and 2019 respectively. I am currently doing it as follows:

bike_sharing['yr'] = bike_sharing[['yr']].apply(lambda x: x.map({0: '2018', 1: '2019'}))
bike_sharing['yr'].value_counts()

I get correct results the first time, but when I run it a second time, it changes all values to NaN. Why does this happen?

Welcome to StackOverflow! :) – Kuba hasn't forgotten Monica May 22 '21 at 12:58

score 1 · Answer 1 · answered May 22 '21 at 12:52

The first time the map runs on yr, it faces input values of 0 and 1, and your translation dictionary {0:'2018', 1:'2019'} handles those.

The second time, it faces the input values '2018' and '2019' (now strings), and there are no entries in the dictionary for those. Series.map does not leave such values alone: every value without a matching key is converted to NaN.

This is documented - see Series.map docs.
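The behaviour can be reproduced without the CSV. The following is a minimal sketch that uses a small hand-made Series as a stand-in for the yr column (the names yr, year_map, first_run and second_run are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the 'yr' column of day.csv
yr = pd.Series([0, 1, 1, 0])
year_map = {0: '2018', 1: '2019'}

# First run: the keys 0 and 1 are found in the dictionary
first_run = yr.map(year_map)

# Second run: the values are now '2018' / '2019', which are not keys,
# so map turns every one of them into NaN
second_run = first_run.map(year_map)

print(first_run.tolist())       # ['2018', '2019', '2019', '2018']
print(second_run.isna().all())  # True
```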

Instead, you should use a method that doesn't drop items not in the map. That one is Series.replace - see also this question:

bike_sharing['yr'] = bike_sharing[['yr']].apply(lambda x: x.replace({0: '2018', 1: '2019'}))

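You could also do an in-place replacement that is less verbose. Call it on the DataFrame itself, since bike_sharing[['yr']] is a copy and replacing in place on it would not change bike_sharing:

bike_sharing.replace({'yr': {0: '2018', 1: '2019'}}, inplace=True)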
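To illustrate why replace is safe to run more than once, here is a small sketch on a made-up DataFrame (df, year_map, after_first and after_second are illustrative names, not part of the original data). Values that have no key in the dictionary are left untouched, so the second call changes nothing:

```python
import pandas as pd

# Hypothetical stand-in for the bike_sharing DataFrame
df = pd.DataFrame({'yr': [0, 1, 1, 0]})
year_map = {0: '2018', 1: '2019'}

# First run: 0 -> '2018', 1 -> '2019'
df['yr'] = df['yr'].replace(year_map)
after_first = df['yr'].tolist()

# Second run: '2018' and '2019' are not keys, so they are kept as they are
df['yr'] = df['yr'].replace(year_map)
after_second = df['yr'].tolist()

print(after_first == after_second)  # True
```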

score 0 · Answer 2 · answered May 23 '21 at 04:13

In this case, you could simply add 2018 to the year value and no map is required.
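A minimal sketch of that idea, again on a made-up Series (yr and calendar_year are illustrative names). Note that the result holds integers rather than the strings used in the question. Like map, this is not safe to repeat on an already converted column, because the addition would shift the values a second time:

```python
import pandas as pd

# Hypothetical stand-in for the 'yr' column of day.csv
yr = pd.Series([0, 1, 1, 0])

# 0 -> 2018, 1 -> 2019 (integers, not strings)
calendar_year = yr + 2018

print(calendar_year.tolist())  # [2018, 2019, 2019, 2018]
```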

Carlos Melus
