Questions tagged [pandas-apply]

Applies Python functions to rows or columns of a pandas dataframe, which may or may not result in aggregation.

The pandas apply method, available on both the DataFrame and Series classes, is the equivalent of map in many functional languages such as Haskell or Scala. It calls the function passed as its argument on each element, row, or column, depending on the object it is called on and on other parameters such as axis.
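As a rough illustration of the element/row/column distinction described above, here is a minimal sketch. The frame `df` and its columns `a` and `b` are made up for the example; only `Series.apply` and `DataFrame.apply` with the `axis` parameter are pandas API.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Series.apply: the function is called once per element.
squared = df["a"].apply(lambda x: x ** 2)

# DataFrame.apply with axis=0 (the default): the function is called
# once per column and receives that column as a Series.
col_range = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function is called once per row
# and receives that row as a Series indexed by column name.
row_sum = df.apply(lambda row: row["a"] + row["b"], axis=1)
```

For simple arithmetic like this, vectorized expressions such as `df["a"] + df["b"]` are usually faster; `apply` is mainly useful when the logic cannot easily be expressed that way.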

More detailed documentation can be found in the pandas API reference for DataFrame.apply and Series.apply.

170 questions
0
votes
2 answers

How to calculate a value grouped by one attribute, but provided in the second column in pandas

I have a dataframe with the order ID, the client ID, Date_order and some metrics (not very important). I want to get the last order ID of each client for all rows. I tried this: data=pd.DataFrame({'ID': [ 133853.0,155755.0,149331.0,337270.0, …
0
votes
1 answer

How to force Pandas apply to return all columns of parent dataframe?

After using groupby on certain columns of a dataframe, and subsequently using apply to test whether a string exists in another column, pandas only returns the columns that were grouped by and the last column created with the apply. Is it possible…
homeStayProg
  • 49
  • 1
  • 9
0
votes
2 answers

Creating a new column from two columns using a dictionary in Pandas

I want to create a column based on a group and threshold for cutoff from another column for each group of the grouped column. The dataframe is below: df_in -> unique_id myvalue identif 0 CTA15 19.0 TOP 1 CTA15 …
Stan
  • 786
  • 1
  • 9
  • 25
0
votes
3 answers

How to use pandas apply function on rows without lambda?

I don't understand well how the apply function works. Here's my code which works fine: dftest = pd.DataFrame({'a': ['A BERTHOU'], 'b': ['BERTHOU']}) def test2(a, b): return a + b dftest['concat'] = dftest.apply(lambda row: test2(row['a'],…
loic_midy
  • 103
  • 4
0
votes
1 answer

concatenate the strings of multiple columns in multiple rows pandas?

I have two data frames as below: import pandas as pd df1 = pd.DataFrame({'serialNo':['aaaa','bbbb','cccc','ffff','aaaa','bbbb','aaaa'], 'Name':['Sayonti','Ruchi','Tony','Gowtam','Toffee','Tom','Sayonti'], 'testName': …
sayo
  • 207
  • 4
  • 18
0
votes
1 answer

Applying new row within second level index

I have a data frame that looks something like: +-----------+---------+-------+-------+-------+ | | | Day 1 | Day 2 | Day 3 | +-----------+---------+-------+-------+-------+ | Product 1 | Revenue | 0 | 0 | 0 | | …
NickP
  • 1,354
  • 1
  • 21
  • 51
0
votes
1 answer

Applying custom function with rolling window row-wise in pandas

I have a function that I want to apply row-wise down the dataframe and output a new column with the result. Normally this would be straightforward with a lambda function or .map() but I am stuck because the function requires a rolling min /…
Merv Merzoug
  • 1,149
  • 2
  • 19
  • 33
0
votes
1 answer

apply() method: Normalize the first column by the sum of the second

I'm having trouble understanding how a function works: """ the apply() method lets you apply an arbitrary function to the group result. The function takes a DataFrame and returns a Pandas object (a df or series) or a scalar. For example: normalize…
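The pattern described in the excerpt above (a function that receives each group as a DataFrame and returns a pandas object) can be sketched as follows. This is only an illustration under assumptions: the column names `key`, `data1`, and `data2`, the sample values, and the helper `norm_by_data2` are hypothetical and are not taken from the truncated question.

```python
import pandas as pd

# Hypothetical data: "key" is the grouping column, "data1" is the
# column to normalize, "data2" supplies the per-group sum.
df = pd.DataFrame({
    "key": ["A", "B", "A", "B"],
    "data1": [2, 5, 6, 10],
    "data2": [1, 2, 3, 3],
})

def norm_by_data2(group):
    # `group` is a DataFrame holding the rows of one group.
    # Return a new frame instead of modifying the group in place.
    return group.assign(data1=group["data1"] / group["data2"].sum())

# Selecting the value columns explicitly keeps the grouping column out of
# the frames passed to the function; group_keys=False keeps the original
# row labels instead of adding the group key as an extra index level.
result = (
    df.groupby("key", group_keys=False)[["data1", "data2"]]
      .apply(norm_by_data2)
      .sort_index()
)
```

Here each `data1` value is divided by the sum of `data2` within its own group, so rows of group A are divided by 4 and rows of group B by 5.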
0
votes
0 answers

pandas apply or aggregate

If I have a pandas data frame df, the following three methods to calculate the mean values of the columns will give the same result: import numpy as np df.mean(axis = 0) df.apply(np.mean) df.aggregate(np.mean) But what about if I create some…
Martin Alexandersson
  • 1,269
  • 10
  • 12
0
votes
2 answers

Do not map item to any output using apply()

Suppose I have a groupby object, a DataFrame, or anything else with an apply() method. I want some elements to not map to any output. For example, in my case I have a groupby and I want groups that satisfy certain criteria to be ignored. How can I…
Bluefire
  • 13,519
  • 24
  • 74
  • 118
0
votes
2 answers

Filling each row of one column of a DataFrame with different values (a random distribution)

I have a DataFrame with approx. 4 columns and 200 rows. I created a 5th column with null values: df['minutes'] = np.nan Then, I want to fill each row of this new column with random inverse log normal values. The code to generate 1 inverse log…
mrbTT
  • 1,399
  • 1
  • 18
  • 31
0
votes
0 answers

Apply a custom function to multiple columns in a Pandas dataframe and return the output in one column

I am converting this Excel formula into Pandas and Python. =IF(ISBLANK('Raw Data'!$A3),"",IF(OR(ISERROR((('Raw Data'!BY3+'Raw Data'!BZ3)*'Raw Data'!BX3*12)),'Raw Data'!BY3=0),'Raw Data'!CN3,(('Raw Data'!BY3+'Raw Data'!BZ3)*'Raw Data'!BX3*12))) This…
bewilderd
  • 11
  • 3
0
votes
1 answer

How to add leading 0's to different values in same column in pandas?

I have a column that has values of length 5, 6, 8, or 9. The column should just have values of length 6 or 9. I need to add leading 0's if the value is of length 5 or 8. There is another column which can identify if the value should be 6 or 9 digits…
CandleWax
  • 2,159
  • 2
  • 28
  • 46
0
votes
1 answer

Create outliers column in pandas groupby DataFrame

I have a very large pandas DataFrame with several thousand codes and the cost associated with each one of them (sample): data = {'code': ['a', 'b', 'a', 'c', 'c', 'c', 'c'], 'cost': [10, 20, 100, 10, 10, 500, 10]} df = pd.DataFrame(data) I…
nvergos
  • 432
  • 3
  • 15
0
votes
3 answers

Changing column value by finding substring in string values

I'm trying to change the values in a single column using pandas apply(). My function partially worked, but I'm stuck on how to fix this other half. Data Column: County Name Riverside County San Diego County SanFrancisco County/city I'm trying to…
Glenn G.
  • 419
  • 3
  • 7
  • 18