How can I fill missing values for apple variety
from the same column when there are 1-4 varieties per farm, but there cannot be two varieties with the same ripening
index on the same farm? Assume the column contains all possible scenarios.
For instance, in the sample below, 'Empire' and 'Honeycrisp' have the same ripening index,
but they are from different farms.
A sample df (part of a larger dataframe):
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'farm': [419, 382, 382, 382, 411, 411, 411],
     # an empty string marks a missing variety
     'variety': ['Gala', 'Gala', 'Empire', '', 'Honeycrisp', '', 'Fuji'],
     'ripening': [2, 2, 3, 3, 3, 3, 6],
     'D': np.random.randn(7) * 10,
     'E': list('abcdefg')
    }
)
df
Out[223]:
   farm     variety  ripening           D  E
0   419        Gala         2   12.921246  a
1   382        Gala         2   -2.776150  b
2   382      Empire         3    3.551226  c
3   382                     3    2.715187  d
4   411  Honeycrisp         3  -13.557640  e
5   411                     3  -11.525100  f
6   411        Fuji         6   -3.660661  g
My desired output:
   farm     variety  ripening           D  E
0   419        Gala         2   12.921246  a
1   382        Gala         2   -2.776150  b
2   382      Empire         3    3.551226  c
3   382      Empire         3    2.715187  d
4   411  Honeycrisp         3  -13.557640  e
5   411  Honeycrisp         3  -11.525100  f
6   411        Fuji         6   -3.660661  g
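
One direction that might work, as a rough sketch rather than a confirmed solution: since a (farm, ripening) pair can only correspond to one variety, group on those two columns and broadcast the first non-missing variety to every row of the group. This assumes the missing values are empty strings, as in the sample, and that each (farm, ripening) group has at least one row with the variety filled in; a group with no known variety would stay missing. Columns 'D' and 'E' are left out here because they play no part in the fill.

```python
import pandas as pd

df = pd.DataFrame(
    {'farm': [419, 382, 382, 382, 411, 411, 411],
     'variety': ['Gala', 'Gala', 'Empire', '', 'Honeycrisp', '', 'Fuji'],
     'ripening': [2, 2, 3, 3, 3, 3, 6],
    }
)

# Assumption: '' marks a missing variety, so turn it into NaN first.
df['variety'] = df['variety'].mask(df['variety'] == '')

# Within each (farm, ripening) group, 'first' picks the first non-null
# variety, and transform broadcasts it back to every row of that group.
df['variety'] = df.groupby(['farm', 'ripening'])['variety'].transform('first')

print(df)
```

On the sample above this fills row 3 with 'Empire' and row 5 with 'Honeycrisp', because the grouping key includes the farm, so the two varieties that share ripening index 3 are never mixed up.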