I have a dataframe in which multiple people answer multiple questions. Answers are coded as 1 = agree and 0 = not agree. Each person answered 8 questions, and the dataframe has one row per answer, so there are 8 rows for every person. I would like to calculate, for each person, the percentage of "agree" answers (1) out of all the questions that person answered (8).
2 Answers
# display how the targets are distributed
def configure_target_statistic(targets):
    # count how often each answer value occurs
    trg_cnt = targets.value_counts()
    # convert the counts to percentages of the total
    labels, sizes = np.array(trg_cnt.index), np.array(100 * (trg_cnt / trg_cnt.sum()))
    py.iplot(go.Figure(data=[go.Pie(labels=labels, values=sizes)], layout=go.Layout(title='Target Distribution', font=dict(size=15), width=500, height=500)))
    return trg_cnt

configure_target_statistic(df['answers'])
You only need these imports; they should be enough:
import numpy as np
import pandas as pd
import plotly.offline as py
import plotly.graph_objs as go
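If only the percentages are needed and not the pie chart, the same distribution can be sketched without plotly. This is a minimal example on made-up data; the column name answers follows the call above, and the four sample values are an assumption for illustration only.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe would come from the question.
df = pd.DataFrame({"answers": [1, 0, 1, 1]})

# normalize=True returns fractions of the total; multiply by 100 for percentages.
pct = df["answers"].value_counts(normalize=True) * 100
print(pct)
```

Note that this gives the distribution over the whole column, not one percentage per person.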

Devi Khositashvili
Assuming your dataframe has two columns user_id and question_id, by which you identify each row, here is a simple solution:
import pandas as pd
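# small example: two users, two questions each
df = pd.DataFrame([[1, 6, 1], [1, 7, 1], [2, 6, 1], [2, 7, 0]], columns=['user_id', 'question_id', 'agree'])
# group the agree column by user
grp = df.groupby(['user_id'])['agree']
# percentage of agree answers per user
print(100 * grp.sum() / grp.count())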
In the last line, the percentage is calculated over only the questions each user actually answered.
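Because agree is coded 0/1, the per-user mean gives the same result as sum divided by count, so the calculation can also be sketched as below. The sample data and the constant N_QUESTIONS are assumptions for illustration; in the question, every person answered 8 questions, so N_QUESTIONS would be 8 there.

```python
import pandas as pd

# Hypothetical sample data: two users, two questions each, agree coded as 1/0.
df = pd.DataFrame(
    [[1, 6, 1], [1, 7, 1], [2, 6, 1], [2, 7, 0]],
    columns=["user_id", "question_id", "agree"],
)

# The mean of a 0/1 column is the share of 1s among the answered questions.
pct_answered = df.groupby("user_id")["agree"].mean() * 100

# Variant with a fixed denominator, assuming every user was asked
# N_QUESTIONS questions (2 in this toy data, 8 in the question).
N_QUESTIONS = 2
pct_fixed = df.groupby("user_id")["agree"].sum() / N_QUESTIONS * 100

print(pct_answered)
print(pct_fixed)
```

The two variants differ only when some answers are missing: the first divides by the number of questions answered, the second by the number asked.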

user3774410