
Incorrect sorting results for descending order

I tried sorting this dataset by release clause in descending order, but it isn't working. It should show top players such as Neymar or Ronaldo, who have high release clauses, but instead it shows unexpected results.

Dataset: https://www.kaggle.com/karangadiya/fifa19/downloads/fifa19.zip/4

import pandas as pd

df = pd.read_csv('data.csv')
df1 = df[['Name', 'Age', 'Overall', 'Release Clause']]
df1.sort_values(by='Release Clause', ascending=False, na_position='last').head()

Expected (something like this):

    Name                Age Overall Release Clause
0   L. Messi            31  94      €226.5M
1   Cristiano Ronaldo   33  94      €127.1M
2   Neymar Jr           26  92      €228.1M
3   De Gea              27  91      €138.6M
4   K. De Bruyne        27  91      €196.4M

Actual output:

        Name        Age Overall Release Clause
1526    Léo Matos   32  76      €9M
3457    J. Windass  24  72      €9M
1419    Vieirinha   32  76      €9M
2519    P. Mpoku    26  74      €9M
4779    D. Geiger   20  70      €9M
Devesh Kumar Singh
shray

1 Answer


My guess is that the Release Clause column is stored as strings, so the sort uses lexicographic order: the values are compared character by character, and "€226.5M" < "€9M" returns True in Python because "2" sorts before "9".
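
To illustrate the comparison above, here is a small sketch in plain Python. The three values are taken from the tables in the question; the variable names are only for illustration.

```python
# Release clause values copied from the question, still as strings.
values = ["€226.5M", "€9M", "€127.1M"]

# Strings are compared character by character: after the shared "€",
# "2" sorts before "9", so the larger amount counts as "smaller".
print("€226.5M" < "€9M")  # True

# A descending sort of the strings therefore puts "€9M" first.
descending = sorted(values, reverse=True)
print(descending)  # ['€9M', '€226.5M', '€127.1M']
```

This matches the actual output in the question, where the €9M rows come out on top.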

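Try converting the Release Clause column to numbers (see "Change data type of columns in Pandas") and the sort should work fine. Note that the currency symbol and the magnitude suffix (such as M) have to be stripped first, since a value like "€226.5M" cannot be cast to a number directly.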
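
A minimal sketch of one way to do the conversion, assuming the values look like "€226.5M" with an optional K or M suffix (I have not checked every row of the dataset). The helper `parse_euro`, the new column name, and the small sample frame are hypothetical and only for illustration; the first three sample values come from the question, and the last row stands in for a missing value.

```python
import pandas as pd

def parse_euro(value):
    """Convert strings such as '€226.5M' or '€365K' to a float number of euros."""
    if pd.isna(value):
        return float("nan")  # keep missing values as NaN
    text = str(value).replace("€", "").strip()
    multipliers = {"K": 1e3, "M": 1e6}  # assumed suffixes
    if text and text[-1] in multipliers:
        return float(text[:-1]) * multipliers[text[-1]]
    return float(text)

# Stand-in for the real data; replace with the frame read from data.csv.
sample = pd.DataFrame({
    "Name": ["L. Messi", "Neymar Jr", "Léo Matos", "Unknown"],
    "Release Clause": ["€226.5M", "€228.1M", "€9M", None],
})

# Sort on the numeric column instead of the original strings.
sample["Release Clause (EUR)"] = sample["Release Clause"].apply(parse_euro)
ranked = sample.sort_values(by="Release Clause (EUR)", ascending=False, na_position="last")
print(ranked)
```

With the numeric column, Neymar Jr and L. Messi come first and the missing value goes last. The same `apply` call should carry over to `df1` from the question, provided its values follow the assumed format.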

Loic