I am trying to access a column in a DataFrame, manipulate it, and create a new column in the DataFrame

Question

x = onefile1['quiz1']
grading = []
for i in x :
    if i == '-':
        grading.append(0)

    elif float(i) < float(50.0):
        grading.append('lessthen50')

    elif i > 50.0 and i < 60.0:
        grading.append('between50to60')

    elif i > 60.0 and i < 70.0:
        grading.append('between60to70')


    elif i > 70.0 and i < 80.0:
        grading.append('between70to80')

    elif i  > 80.0:
        grading.append('morethen80')

    else:
        grading.append(0) 

onefile1 = file.reset_index()
onefile1['grade'] = grading

It is giving me the following error:

Length of values does not match length of index

You might want to format your code so that it becomes more readable (check the preview), and please add some description to it other than the title. — edornd, Jan 15 '20 at 10:58
You probably have a value equal to 50 or 60 or 80. For instance, you can use `<=` instead of `<`. — E. Zeytinci, Jan 15 '20 at 11:05
could you provide full code, and also cross check that the reset_index(inplace=True) function is called on dataframe object only — The Guy, Jan 15 '20 at 11:25

E. Zeytinci · Accepted Answer · 2020-01-15T12:34:01.920

You probably have a value exactly equal to 50, 60, 70, etc., which none of the strict comparisons match. You can use `<=` instead of `<`, or use `cut` from pandas:

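import numpy as np
import pandas as pd

# Turn the '-' placeholder into '0', then convert the column to float.
onefile1['quiz1'] = (onefile1['quiz1']
                        .astype(str).str.replace('-', '0')
                        .astype(float))

# One label per bin, in the same order as the bin edges below.
labels = [
    0, 'lessthen50', 'between50to60',
    'between60to70', 'between70to80', 'morethen80'
]

# Bin edges; by default each bin includes its right edge, e.g. (50, 60].
bins = [-1, 0, 50, 60, 70, 80, np.inf]
onefile1['grade'] = pd.cut(
    onefile1.quiz1, bins=bins,
    labels=labels, include_lowest=True)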

Here is an example:

>>> import numpy as np
>>> import pandas as pd
>>> onefile1 = pd.DataFrame({'quiz1': [0, 40, 30, 60, 80, 100, '-']})
>>> onefile1['quiz1'] = (onefile1['quiz1']
...                         .astype(str).str.replace('-', '0')
...                         .astype(float))
>>> labels = [
...     0, 'lessthen50', 'between50to60',
...     'between60to70', 'between70to80', 'morethen80'
... ]
>>> bins = [-1, 0, 50, 60, 70, 80, np.inf]
>>> onefile1['grade'] = pd.cut(
...     onefile1.quiz1, bins=bins,
...     labels=labels, include_lowest=True)
>>> onefile1
   quiz1          grade
0    0.0              0
1   40.0     lessthen50
2   30.0     lessthen50
3   60.0  between50to60
4   80.0  between70to80
5  100.0     morethen80
6    0.0              0
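
The other suggestion, changing the comparisons so that no boundary value is skipped, could look roughly like the function below. This is only a sketch: the helper name `grade` is hypothetical, the sample data is made up, and it assumes the lower edge of each range is the inclusive one (so 50 counts as 'between50to60'). It also assumes the new column is computed from the same frame it is assigned to, so the lengths line up.

```python
import pandas as pd

def grade(value):
    """Map one quiz1 entry to a label (hypothetical helper)."""
    # '-' marks a missing score in the question's data; keep 0 for it.
    if value == '-':
        return 0
    score = float(value)
    # Each branch needs only an upper bound: earlier branches have already
    # excluded the lower scores, so 50, 60, 70 and 80 all land somewhere.
    if score < 50:
        return 'lessthen50'
    if score < 60:
        return 'between50to60'
    if score < 70:
        return 'between60to70'
    if score < 80:
        return 'between70to80'
    return 'morethen80'

# Made-up sample data that includes the boundary values.
onefile1 = pd.DataFrame({'quiz1': [0, 40, 50, 60, 80, 100, '-']})

# map() returns one label per row, so the result always matches the index.
onefile1['grade'] = onefile1['quiz1'].map(grade)
print(onefile1)
```

Because `map` produces exactly one result per row of the column it is called on, the assignment cannot run into a length mismatch.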

PS: It is a good idea to check the `include_lowest` and `right` parameters before use, since they decide which edge of each bin is inclusive.
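
To see what `right` changes, here is a small comparison on boundary scores only. It is a sketch with made-up data; the names `scores`, `right_closed` and `left_closed` are just for illustration, and the 0 label for missing scores is left out to keep it short.

```python
import numpy as np
import pandas as pd

# Made-up scores sitting exactly on bin edges.
scores = pd.Series([50.0, 60.0, 80.0])

bins = [0, 50, 60, 70, 80, np.inf]
labels = ['lessthen50', 'between50to60', 'between60to70',
          'between70to80', 'morethen80']

# Default right=True: bins are (0, 50], (50, 60], ... so 50 is 'lessthen50'.
right_closed = pd.cut(scores, bins=bins, labels=labels)

# right=False: bins are [0, 50), [50, 60), ... so 50 is 'between50to60'.
left_closed = pd.cut(scores, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'score': scores,
                    'right=True': right_closed,
                    'right=False': left_closed}))
```

With `right=True`, `include_lowest=True` additionally makes the first bin include its left edge, which is why the answer's bins can start at -1 or, alternatively, at 0.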
