Stopping a forward fill based on a different string column changing in the DF

Question

i'm new to Python so go easy!

I have a Dataframe like the following. I would like to forward fill the NaN's in the shares_owned column but stop when the string in df['ticker'] changes. And only start when another number appears in shares_owned again.

date	ticker	shares_owned	price
01/01/2020	EZY	NaN	£2
02/01/2020	EZY	10	£2.1
03/01/2020	EZY	NaN	£2.12
04/01/2020	EZY	NaN	£12.5
01/01/2020	FTSE	NaN	£11
02/01/2020	FTSE	NaN	£12
03/01/2020	FTSE	2	£12.5
04/01/2020	FTSE	NaN	£12.5

For example, the output table would look like this:

date	ticker	shares_owned	price
01/01/2020	EZY	NaN	£2
02/01/2020	EZY	10	£2.1
03/01/2020	EZY	10	£2.12
04/01/2020	EZY	10	£12.5
01/01/2020	FTSE	NaN	£11
02/01/2020	FTSE	NaN	£12
03/01/2020	FTSE	2	£12.5
04/01/2020	FTSE	2	£12.5

So far I have been trying to use the .fillna(method='ffill') to no avail.

Hi @Barnaby, so since you're new, you should read up on providing a [minimum reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) meaning you should provide enough here to show that you've worked the problem, a starting point for your solution, what you expected, and what you got instead. Errors are gold. — mayosten, Feb 19 '21 at 16:38
You can group the dataframe on `ticker` then `ffill` the `shares_column` per group: `df['shares_owned'] = df.groupby('ticker')['shares_owned'].ffill()`. — Shubham Sharma, Feb 19 '21 at 16:40
And this could be helpful... looks like they're trying to do the same thing: https://stackoverflow.com/questions/58181262/groupby-with-ffill-deletes-group-and-does-not-put-group-in-index — mayosten, Feb 19 '21 at 16:42

score 3 · Accepted Answer · answered Feb 19 '21 at 16:42

you note groups, hence groupby() does the grouping
within the group fillna(method="fill") within a transform()

df = pd.read_csv(io.StringIO("""date    ticker  shares_owned    price
01/01/2020  EZY NaN £2
02/01/2020  EZY 10  £2.1
03/01/2020  EZY NaN £2.12
04/01/2020  EZY NaN £12.5
01/01/2020  FTSE    NaN £11
02/01/2020  FTSE    NaN £12
03/01/2020  FTSE    2   £12.5
04/01/2020  FTSE    NaN £12.5"""), sep="\t")

df["shares_owned"] = df.groupby("ticker")["shares_owned"].transform(lambda s: s.fillna(method="ffill"))

output

	date	ticker	shares_owned	price
0	01/01/2020	EZY	nan	£2
1	02/01/2020	EZY	10	£2.1
2	03/01/2020	EZY	10	£2.12
3	04/01/2020	EZY	10	£12.5
4	01/01/2020	FTSE	nan	£11
5	02/01/2020	FTSE	nan	£12
6	03/01/2020	FTSE	2	£12.5
7	04/01/2020	FTSE	2	£12.5

You don't need `transform` with `lambda`, we can use `ffill` directly here :) — Shubham Sharma, Feb 19 '21 at 16:50

Stopping a forward fill based on a different string column changing in the DF

1 Answers1

output