I have a dataset that contains multiple countries. How can I filter it so that it contains only specific countries?

For example, it currently contains UK, Belgium, France, etc.

I would like to filter it so that it shows only France and Belgium.

So far I have tried this:

dataset = dataset.loc[dataset.Country == "France"].copy()
dataset.head()

This works: it keeps only the rows for France. But if I add Belgium:

dataset = dataset.loc[dataset.Country == "France","Belgium"].copy()
dataset.head()

it no longer works. I get the following error:

'the label [Belgium] is not in the [columns]'

Any help will be highly appreciated.

Dakata
  • You want `dataset = dataset[dataset['Country'].isin(["France", "Belgium"])].copy()`. What you tried looks for a column named `Belgium`, which doesn't exist: the argument after the comma selects columns. – EdChum Jan 10 '19 at 15:44
  • Would something like `dataset = dataset.loc[dataset.Country == "France" or dataset.Country == "Belgium"].copy()` work? It's been a long time since I used pandas. – TuanDT Jan 10 '19 at 15:45

1 Answer

What you tried failed because `loc` treats 'Belgium' as a column label to look up, and no such column exists. To filter against multiple values, use isin:

dataset = dataset[dataset['Country'].isin(["France", "Belgium"])].copy()
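
As a minimal, self-contained sketch of the isin approach: the sample rows and the `Sales` column below are made up for illustration, since the question's actual data is not shown. Only the `Country` column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the "Country" column name is from the question.
dataset = pd.DataFrame({
    "Country": ["UK", "Belgium", "France", "France", "Germany"],
    "Sales": [10, 20, 30, 40, 50],
})

wanted = ["France", "Belgium"]

# isin() returns a boolean Series: True where Country is one of the wanted values.
mask = dataset["Country"].isin(wanted)

# Indexing with the mask keeps only the matching rows; copy() makes the
# result independent of the original frame.
filtered = dataset[mask].copy()

# Negating the mask with ~ keeps every country except the wanted ones.
others = dataset[~mask].copy()

print(filtered)
```

The original index labels are preserved in the result; calling `reset_index(drop=True)` on it would renumber the rows if that matters.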

When you use loc, the argument after the comma is treated as a label to look up, in this case on the column axis.
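
To illustrate the row/column split in loc, here is a small sketch using made-up data (the `Sales` column is an assumption, not from the question). It also shows an element-wise `|` alternative to isin. The plain Python `or` suggested in the comments would not work here, because the truth value of a whole Series is ambiguous and pandas raises a ValueError.

```python
import pandas as pd

# Hypothetical sample data; only the "Country" column name is from the question.
dataset = pd.DataFrame({
    "Country": ["UK", "Belgium", "France", "France", "Germany"],
    "Sales": [10, 20, 30, 40, 50],
})

mask = dataset["Country"].isin(["France", "Belgium"])

# First argument selects rows; with no second argument, all columns are kept.
rows_only = dataset.loc[mask]

# Second argument selects columns: here, the existing "Sales" column.
sales_only = dataset.loc[mask, "Sales"]

# Passing "Belgium" as the second argument asks for a column with that name,
# which does not exist, so pandas raises KeyError.
try:
    dataset.loc[dataset["Country"] == "France", "Belgium"]
    raised = False
except KeyError:
    raised = True

# Equivalent row filter with element-wise OR; the parentheses are required
# because | binds more tightly than ==.
either = dataset.loc[
    (dataset["Country"] == "France") | (dataset["Country"] == "Belgium")
]
```

For more than two or three values, isin is usually easier to read than a chain of `|` comparisons.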

cs95
EdChum