I have looked high and low and tried many different code snippets from this site to solve my problem. Maybe someone can make a suggestion?

I have a dataframe that looks like this: image of my dataframe

I hope that table came out right. I'm a newbie to Stack Overflow, so sorry if it didn't. I have struggled with this for hours. I finally managed to show my Total row at the bottom, but I don't want NaN to show in the one column that has strings in it. Can someone tell me what on EARTH it takes to simply remove the NaN from ONE CELL in this dataframe? I'm at my wits' end.

Ch3steR
KMW
  • SEE? I can't even get my dataframe table to show in my question. I don't even know how to do that simple task. I took a screen shot of my crappy dataframe and dragged the image where it told me and it doesn't even show up!!!???!!! – KMW Jan 20 '21 at 07:30
  • You cannot literally remove it, but you can replace `NaN` with an empty string `''` – Ch3steR Jan 20 '21 at 07:30

2 Answers

You can use fillna to fill NaNs with another value, e.g., an empty string:

df['Gender'] = df['Gender'].fillna('')

Or, if you prefer, with 'Other/Not Disclosed':

df['Gender'] = df['Gender'].fillna('Other/Not Disclosed')

In both cases, NaN will no longer appear when you print the DataFrame.
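As a minimal sketch of the `fillna` approach: the DataFrame below is made up for illustration, since the original is only available as a screenshot, and the column names `Gender` and `Total` are assumptions. Note that column labels are case-sensitive, so the name must match the real column exactly. The result is assigned back to the column instead of using `inplace=True`, which recent pandas versions warn about when it is called on a selected column.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the DataFrame in the question's screenshot
df = pd.DataFrame(
    {"Gender": ["Male", "Female", np.nan], "Total": [484, 81, 565]},
    index=[0, 1, "Total"],
)

# Replace NaN in the string column only; the numeric column is left untouched
df["Gender"] = df["Gender"].fillna("")

print(df)
```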

There are other ways to handle NaN or missing values; see the pandas user guide on working with missing data for more information.

PieCot
  • Thank you! I'm not sure why this did not work for me. The other one did. Might have something to do with my syntax... seems like it should have worked for me. The only line I needed to add was `df.loc['Total', 'gender'] = ''` – KMW Jan 20 '21 at 19:44
  • @KMW It's quite strange... BTW, I'm happy you've solved your problem – PieCot Jan 20 '21 at 21:57

One possible solution (including creation of the dataframe):

import pandas as pd
import numpy as np

# create base of the dataframe
df = pd.DataFrame({'gender':['male', 'female', 'others'], 'total':[484, 81, 11]})
# calculate percentage column
df['percentage'] = round(df['total']/df['total'].sum(), 2)
# create SUM row
df.loc['TOTAL'] = df.select_dtypes(np.number).sum()
# set the 'gender' cell of the TOTAL row to an empty string instead of NaN
df.loc['TOTAL', 'gender'] = ''

Result:

        gender  total   percentage
0       male    484.0   0.84
1       female  81.0    0.14
2       others  11.0    0.02
TOTAL           576.0   1.00
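The `total` column prints as floats (`484.0`) because the appended row briefly contains NaN, which pushes the row values to floating point. If whole numbers are preferred, one possible follow-up (an optional addition, not part of the original answer) is to cast the column back once the row is complete:

```python
import numpy as np
import pandas as pd

# Same construction as above
df = pd.DataFrame({'gender': ['male', 'female', 'others'], 'total': [484, 81, 11]})
df['percentage'] = round(df['total'] / df['total'].sum(), 2)
df.loc['TOTAL'] = df.select_dtypes(np.number).sum()
df.loc['TOTAL', 'gender'] = ''

# Optional: 'total' holds no NaN at this point, so it can be cast back to integers
df['total'] = df['total'].astype(int)

print(df)
```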
Lukas