Is it possible to pass a for loop in a function?

Question

I know it sounds ridiculous, but I have to pass a for loop into a function. I have a dataframe with 75+ columns, most of which are categorical variables. One of the variables is called SalePrice, and I wish to find the correlation between the categorical variables and SalePrice.

This is my code, but I think it is ridiculous to go through all 75 columns manually. Is there an easy way?

import pandas as pd
from scipy import stats
df = pd.read_csv(file, delimiter=',')
qualityTest = df[["OverallQual","SalePrice"]]
qualities = [1,2,3,4,5,6,7,8,9,10]
stats.f_oneway(qualityTest['SalePrice'][qualityTest['OverallQual'] == 1],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 2],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 3],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 4],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 5],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 6],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 7],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 8],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 9],
              qualityTest['SalePrice'][qualityTest['OverallQual'] == 10])

I've tried doing this, but it doesn't work:

stats.f_oneway(
    for i in qualities:
        qualityTest['SalePrice'][qualityTest['OverallQual'] == i]
)

Green Cloak Guy · Accepted Answer · 2019-07-01T13:08:35.167

5

You can use a list comprehension - essentially, create a list using a for loop, and pass that in:

stats.f_oneway([qualityTest['SalePrice'][qualityTest['OverallQual'] == i] for i in qualities])

Or, if you want the elements passed as separate arguments instead of as one list, add a * right in front of the outermost set of square brackets, which unpacks the list you just made into individual function arguments. stats.f_oneway expects each group as its own argument, so the unpacked form is the one to use here.
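
To see the difference between the three call styles without needing SciPy, here is a minimal sketch. The function `describe_args` is a hypothetical stand-in for any function that takes a variable number of arguments (such as `stats.f_oneway`); it only reports how many arguments it received and their types.

```python
def describe_args(*groups):
    """Hypothetical stand-in for a variadic function such as stats.f_oneway."""
    return len(groups), [type(g).__name__ for g in groups]

qualities = [1, 2, 3]

# One list passed as a single argument: the function sees 1 argument.
as_list = describe_args([q * 10 for q in qualities])

# A bare generator expression is also a single argument.
as_generator = describe_args(q * 10 for q in qualities)

# The * operator unpacks the iterable into separate positional arguments.
unpacked = describe_args(*[q * 10 for q in qualities])

print(as_list)       # (1, ['list'])
print(as_generator)  # (1, ['generator'])
print(unpacked)      # (3, ['int', 'int', 'int'])
```

Only the last call gives the function one argument per group, which is what a function with a `*args`-style signature needs.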

edited Jul 01 '19 at 13:08

answered Jul 01 '19 at 03:41

Green Cloak Guy

Hi, I've tried both ways and each returned an error. `ValueError: setting an array element with a sequence.` (for the list) and `TypeError: float() argument must be a string or a number, not 'generator'` (after removing the outermost brackets). – Aaron Jul 01 '19 at 03:44
Removing the outermost square brackets will pass a generator. If you want the arguments to be separate, you must use the `*` unpacking operator. Something like: `stats.f_oneway(*(qualityTest['SalePrice'][qualityTest['OverallQual'] == i] for i in qualities))` – iz_ Jul 01 '19 at 03:44
@Tomothy32 Thanks for that, I've updated my answer. Incidentally, why *doesn't* a generator work here, when it works for things like `print()` and the other stdlib functions that take an arbitrary number of arguments? – Green Cloak Guy Jul 01 '19 at 13:09
Could you give an example? Trying `print(i for i in range(5))` yields `<generator object <genexpr> at *memory address*>`. Maybe you meant unpacking a generator, like `print(*(i for i in range(5)))`. – iz_ Jul 02 '19 at 15:54

score 3 · Answer 2 · answered Jul 01 '19 at 03:44

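You can use groupby here to split SalePrice by OverallQual, then unpack the groups into f_oneway:

stats.f_oneway(*[prices for _, prices in qualityTest.groupby('OverallQual')['SalePrice']])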
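
To avoid going through many columns by hand, the group-and-unpack idea can be wrapped in a loop. This is a minimal sketch, not a definitive implementation: the helper name `anova_by_column`, the `Street` column, and the toy data are all assumptions made for illustration.

```python
import pandas as pd
from scipy import stats

def anova_by_column(df, target, columns):
    """Run a one-way ANOVA of `target` against each column in `columns`.

    Hypothetical helper: returns {column: (F statistic, p-value)}.
    """
    results = {}
    for col in columns:
        # One Series of target values per category level in `col`.
        groups = [values for _, values in df.groupby(col)[target]]
        if len(groups) < 2:
            continue  # f_oneway needs at least two groups
        f_stat, p_value = stats.f_oneway(*groups)
        results[col] = (f_stat, p_value)
    return results

# Toy data, made up purely for illustration.
df = pd.DataFrame({
    "OverallQual": [1, 1, 2, 2, 3, 3],
    "Street": ["Pave", "Grvl", "Pave", "Grvl", "Pave", "Grvl"],
    "SalePrice": [100, 110, 200, 210, 300, 310],
})

results = anova_by_column(df, "SalePrice", ["OverallQual", "Street"])
for col, (f_stat, p_value) in results.items():
    print(f"{col}: F={f_stat:.2f}, p={p_value:.4f}")
```

On the real dataframe, `columns` could be the list of categorical column names, with SalePrice excluded.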

answered Jul 01 '19 at 03:44

BENY
