I have a pandas DataFrame, created as below:
import pandas as pd

data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
                 'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
        'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
        'Year': [2014, 2015, 2014, 2015, 2014, 2015, 2016, 2017, 2016, 2014, 2015, 2017],
        'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(data)
Here df is a plain pandas DataFrame. I am converting it to a pandas-on-Spark DataFrame:
import pyspark.pandas as ps
pdf = ps.from_pandas(df)
print(type(pdf))
Now the DataFrame type is <class 'pyspark.pandas.frame.DataFrame'>. Next I am applying groupby on pdf and iterating over the groups, like below:
for i, j in pdf.groupby("Team"):
    print(i)
    print(j)
This fails with the following error:

KeyError: (0,)

Does iterating over a groupby work with the pandas API on Spark, or is this functionality not supported?
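
For reference, the same loop works as expected on the plain pandas DataFrame, printing each group key followed by its sub-DataFrame, so I expected the pandas-on-Spark version to behave the same way:

for i, j in df.groupby("Team"):
    print(i)
    print(j)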
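
If iterating over groups is simply not implemented, would a workaround like the sketch below be reasonable? It collects the distinct group keys to the driver and then filters the pandas-on-Spark frame one key at a time (assuming the number of teams is small):

# Sketch of a possible workaround: bring the distinct keys
# to the driver, then filter the frame per key.
for team in pdf["Team"].unique().to_numpy():
    print(team)
    print(pdf[pdf["Team"] == team])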