
I have some code that fills two lists using a function, called function, that returns two values. The function takes its arguments from the columns of a single dataframe row. I then want to add the two lists to my dataframe as new columns.

import pandas as pd
from tqdm import tqdm

data = [[45, 'F', 'Jill', 'USA'], [87, 'm', 'Jeff', 'Poland'], [99, 'M', 'Tim', 'Peru']]

df = pd.DataFrame(data, columns=['Age', 'Sex', 'Name', 'Location'])


new_column1 = []
new_column2 = []

for member in tqdm(range(len(df))):
    # function returns two values (a float and a list) for each row
    list1, list2 = function(df['Age'][member], df['Sex'][member], df['Name'][member], df['Location'][member])

    new_column1.append(list1)
    new_column2.append(list2)

I am wondering if there is a faster way to do this using apply. I threw in tqdm because everyone likes to know how long they have to wait. For what it's worth, the output of the function is a float and a list.

Is there a better way to do this? The loop feels a little basic, and I want something elegant and efficient. I would eventually like to use the swifter package.

Update

I do not understand why this doesn't work.

df[['New_column1', 'New_column2']] = df[['Age', 'Sex', 'Name', 'Location']].swifter.applymap(function)

I am getting an error that the function is missing 3 required positional arguments: 'Sex', 'Name', and 'Location'.
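
A likely cause of the error: applymap works element by element, so function receives one cell at a time rather than a whole row. The sketch below is only an illustration, assuming a hypothetical stand-in for function (the real one is not shown here) that returns a float and a list. It reproduces the missing-argument error with a single-cell call, then passes whole rows using apply with axis=1.

```python
import pandas as pd


def function(age, sex, name, location):
    # Hypothetical stand-in for the real function: returns a float and a list.
    return age / 10.0, [sex, name, location]


df = pd.DataFrame(
    [[45, 'F', 'Jill', 'USA'], [87, 'm', 'Jeff', 'Poland'], [99, 'M', 'Tim', 'Peru']],
    columns=['Age', 'Sex', 'Name', 'Location'],
)

# An element-wise call hands the function a single cell value,
# so the remaining three positional arguments are missing.
try:
    function(df.loc[0, 'Age'])
    elementwise_error = None
except TypeError as exc:
    elementwise_error = str(exc)

# Row-wise apply (axis=1) hands the lambda one row at a time;
# the lambda unpacks the four fields the function expects.
row_results = df.apply(
    lambda row: function(row['Age'], row['Sex'], row['Name'], row['Location']),
    axis=1,
)
```

Here row_results is a Series holding one (float, list) tuple per row, which still has to be split before it can fill two columns.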

Progress

1 Answer

Is this what you are looking for?

import pandas as pd
df = pd.DataFrame({'Name':['Alice','Bob'],'Age':[20,19],'Sex':['F','M'],'Location':['Berlin','San Sebastian']})

nested_list = df.values.tolist()  # each DataFrame row becomes a list

# unpack the rows into separate lists (works only because df has exactly two rows)
list1, list2 = map(list, nested_list)

print(df, '\n')
#     Name  Age Sex       Location
# 0  Alice   20   F         Berlin
# 1    Bob   19   M  San Sebastian


print(f'{list1=}')
# list1=['Alice', 20, 'F', 'Berlin']

print(f'{list2=}')
# list2=['Bob', 19, 'M', 'San Sebastian']

Edit

import pandas as pd
import swifter

df = pd.DataFrame({'Name':['Alice','Bob'],'Age':[20,19],'Sex':['F','M'],'Location':['Berlin','San Sebastian']})

list1, list2 = map(list, df.swifter.apply(list,axis = 1))

print(df,'\n')

print(f'{list1=}')

print(f'{list2=}')


RSale
  • Kinda. I would love to be able to use apply or a lambda to pass the values into the function, because I want to use swifter to run on multiple processors or speed it up somehow. – Progress Jan 14 '22 at 00:21
  • Is the edited version what you are looking for? – RSale Jan 14 '22 at 14:54
  • Not quite, but I really appreciate you sticking with me! The goal is not simply to convert the rows to lists, but to pass the values in each row into a function and grow two lists simultaneously. This idea may not be the most efficient or elegant. The dataset I use will grow rapidly, and I want to make sure this script will work for years to come. I read that for loops are not efficient, so I am trying to think of ways to get rid of the loop. – Progress Jan 14 '22 at 15:13
  • I am looking for something like `df['outCol'] = df[['inCol1', 'inCol2']].swifter.apply(my_func)` but the function returns two values, not one. – Progress Jan 14 '22 at 15:20
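
The pattern in the last comment, with a function that returns two values, could be sketched as follows. This is a minimal sketch and not a tested answer to the original problem: function is a hypothetical stand-in returning a float and a list, and the column names New_column1 and New_column2 are taken from the question's update. It uses plain pandas apply with axis=1 and then splits the resulting pairs with zip.

```python
import pandas as pd


def function(age, sex, name, location):
    # Hypothetical stand-in for the real function: returns a float and a list.
    return age / 10.0, [name, location]


df = pd.DataFrame(
    [[45, 'F', 'Jill', 'USA'], [87, 'm', 'Jeff', 'Poland'], [99, 'M', 'Tim', 'Peru']],
    columns=['Age', 'Sex', 'Name', 'Location'],
)

# One (float, list) tuple per row.
results = df.apply(
    lambda row: function(row['Age'], row['Sex'], row['Name'], row['Location']),
    axis=1,
)

# zip(*results) transposes the sequence of pairs into two sequences,
# one holding the floats and one holding the lists.
new_column1, new_column2 = map(list, zip(*results))

df['New_column1'] = new_column1
df['New_column2'] = new_column2
```

The swifter documentation describes a df.swifter.apply call that takes the same axis=1 argument, so swapping it in for df.apply may be enough, but that is an assumption and was not run here. Whether it is actually faster than the loop depends on how expensive function is, since row-wise apply still calls it once per row.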