This is my code:

import re

for col in df:
    if col.startswith('event'):
        df[col].fillna(0, inplace=True)
        # Strip every non-digit character; the raw string avoids an invalid-escape warning
        df[col] = df[col].map(lambda x: re.sub(r"\D", "", str(x)))

I have event columns numbered 0 to 10 ("event_0, event_1, ..."). When I fill NaN with this code, it sets the NaN cells under all the event columns to 0, except in event_0, the first column of that selection, which is still filled with nan.

I made these columns from the 'events' column with the following code:

event_seperator = lambda x: pd.Series([i for i in str(x).strip().split('\n')]).add_prefix('event_')
df_events = df['events'].apply(event_seperator)
df = pd.concat([df.drop(columns=['events']), df_events], axis=1)

Please tell me what is wrong. You can see the dataframe before the change in the picture.

jpp
NilZ
  • Are you sure those `nan` values in `event_0` are null and not the string `'nan'`? – jpp Oct 16 '18 at 13:36
  • Apparently the event_0 values were **nan** while the values under the others were **NaN**. I don't know why that happened since I made all those columns the same. So my solution is now:

        for col in df:
            if col.startswith('event'):
                df[col] = df[col].map(lambda x: re.sub(r"\D", "", str(x)))
                df[col] = df[col].replace('', np.nan)
                df[col].fillna(0, inplace=True)

    – NilZ Oct 17 '18 at 06:51

1 Answer

I don't know why that happened since I made all those columns the same.

Your data suggests this is precisely what has not been done.
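
A likely cause, sketched below on made-up data (the three-row 'events' column is an assumption, not the asker's dataframe): `str(x)` turns a missing value into the string 'nan', and that string always lands in event_0 because it is the first and only piece after the split. The other columns are padded with real NaN when the per-row results are aligned.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the original 'events' column, with one missing row.
df = pd.DataFrame({'events': ['12a\n34b', np.nan, '56c']})

# Same splitting logic as in the question.
event_seperator = lambda x: pd.Series(str(x).strip().split('\n')).add_prefix('event_')
df_events = df['events'].apply(event_seperator)

# event_0 holds the *string* 'nan' for the missing row, so it is not null.
null_count_event_0 = int(df_events['event_0'].isna().sum())
string_nan_count_event_0 = int((df_events['event_0'] == 'nan').sum())

# event_1 is padded with real NaN wherever a row had fewer pieces.
null_count_event_1 = int(df_events['event_1'].isna().sum())

print(df_events)
print(null_count_event_0, string_nan_count_event_0, null_count_event_1)
```

Since `fillna` only touches real nulls, it would skip the 'nan' strings in event_0, which matches the behaviour described in the question.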

You have a few options depending on what you are trying to achieve.

1. Convert all non-numeric values to 0

Use pd.to_numeric with errors='coerce':

df[col] = pd.to_numeric(df[col], errors='coerce').fillna(0)
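
As a rough illustration of this option inside the loop from the question, here is a small sketch on invented data (the column values and the 'other' column are assumptions). Note that anything that is not already a clean number, such as '12a', also becomes 0 with this approach, so it only fits if that is acceptable.

```python
import numpy as np
import pandas as pd

# Hypothetical data mixing numeric strings, the string 'nan', real NaN and junk.
df = pd.DataFrame({
    'event_0': ['12', 'nan', '12a'],
    'event_1': ['7', np.nan, '3'],
    'other': ['x', 'y', 'z'],
})

for col in df.columns:
    if col.startswith('event'):
        # Unparseable values are coerced to NaN first, then all NaN become 0.
        df[col] = pd.to_numeric(df[col], errors='coerce').fillna(0)

event_0_values = df['event_0'].tolist()
event_1_values = df['event_1'].tolist()
print(df)
```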

2. Replace either string ('nan') or null (NaN) values with 0

Use pd.Series.replace followed by pd.Series.fillna:

df[col] = df[col].replace('nan', np.nan).fillna(0)
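
A minimal sketch of this second option on invented data (the values are assumptions, and `dtype=object` is set explicitly so strings and real NaN can sit in the same column). Unlike the first option, other strings are left untouched; only 'nan' and NaN become 0.

```python
import numpy as np
import pandas as pd

# Hypothetical data: event_0 carries the string 'nan', event_1 carries real NaN.
df = pd.DataFrame({
    'event_0': pd.Series(['12', 'nan', '7'], dtype=object),
    'event_1': pd.Series(['3', np.nan, np.nan], dtype=object),
})

for col in df.columns:
    if col.startswith('event'):
        # Turn the string 'nan' into a real null, then fill every null with 0.
        df[col] = df[col].replace('nan', np.nan).fillna(0)

event_0_values = df['event_0'].tolist()
event_1_values = df['event_1'].tolist()
print(df)
```

The resulting columns still mix strings with the integer 0, so a numeric conversion may be needed afterwards if arithmetic is planned.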
jpp