I'm a student at Moscow State University doing a small research project on suburban railroads. I scraped information about all stations in the Moscow region from Wikipedia, and now I need to select the ones that are Moscow Central Diameter 1 (a railway line) stations. I have a list of Diameter 1 stations (d1_names), and I'm trying to subset the whole dataframe (suburban_rail) with the pandas isin method. The problem is that it returns only 2 stations (the first and the last), though I'm pretty sure there are more: using str.contains with the missing stations returns what I was looking for, so they are in the dataframe. I've already checked the spelling and tried applying strip() to each element of both the dataframe and the list of stations. Several screenshots of my code are attached.

suburban_rail dataframe

stations' list I use to subset

what isin returns

checking manually for Bakovka station

checking manually for Nemchinovka station

Thanks in advance!

mieltn
    [You should not post code/sample data as an image because:...](https://meta.stackoverflow.com/a/285557/1422451) – Parfait Oct 28 '20 at 20:36

1 Answer

Next time provide a minimal reproducible example, such as the one below:

import pandas as pd

suburban_rail = pd.DataFrame({'station_name': ['a','b','c','d'], 'latitude': [1,2,3,4], 'longitude': [10,20,30,40]})
d1_names = pd.Series(['a','c','d'])

suburban_rail

    station_name    latitude    longitude
0   a               1           10
1   b               2           20
2   c               3           30
3   d               4           40

Now, to answer your question: pass the boolean mask returned by isin to .loc to select the matching rows:

suburban_rail.loc[suburban_rail.station_name.isin(d1_names)]

    station_name    latitude    longitude
0   a               1           10
2   c               3           30
3   d               4           40
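
If isin still misses rows that str.contains finds, keep in mind that isin only matches values that are exactly equal, while str.contains matches substrings. One possible cause, and this is only a guess since the real data was posted as screenshots, is invisible differences in scraped text, such as a non-breaking space inside a name, which strip() does not remove. The sketch below uses made-up toy data and a hypothetical helper normalize_name to compare names after Unicode normalization and whitespace cleanup:

```python
import unicodedata

import pandas as pd


def normalize_name(name: str) -> str:
    """Hypothetical helper: NFKC-normalize, collapse whitespace, casefold."""
    text = unicodedata.normalize("NFKC", name)  # maps U+00A0 to a plain space
    return " ".join(text.split()).casefold()


# Toy data (made up for illustration, not the real scraped table).
# Row 1 has a trailing space, row 2 has a non-breaking space inside the name.
suburban_rail = pd.DataFrame({
    "station_name": ["Bakovka", "Nemchinovka ", "Toy\u00a0Station", "Elsewhere"],
    "latitude": [1, 2, 3, 4],
})
d1_names = ["Bakovka", "Nemchinovka", "Toy Station"]

# Exact matching: only names that are identical character by character match.
exact = suburban_rail.loc[suburban_rail["station_name"].isin(d1_names)]

# Matching on normalized keys: both sides go through the same cleanup.
wanted = {normalize_name(n) for n in d1_names}
mask = suburban_rail["station_name"].map(normalize_name).isin(wanted)
matched = suburban_rail.loc[mask]

# repr() makes hidden characters visible in the rows that exact matching missed.
missed = matched.index.difference(exact.index)
print([repr(n) for n in suburban_rail.loc[missed, "station_name"]])
```

Printing the names with repr() is a quick way to check whether this guess applies to the real data before changing anything else.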
jlb_gouveia