
I am working with a dataframe like this:

import pandas as pd
records = [
    {'Name': 'John', 'Start': '2020-01-01', 'Stop': '2020-03-31'},
    {'Name': 'John', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Mary', 'Start': '2020-01-01', 'Stop': '2020-03-15'},
    {'Name': 'Mary', 'Start': '2020-03-16', 'Stop': '2020-03-31'},
    {'Name': 'Mary', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Stan', 'Start': '2020-02-01', 'Stop': '2020-03-31'},
    {'Name': 'Stan', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
]
df = pd.DataFrame(records)
df['Start'] = pd.to_datetime(df['Start'])
df['Stop'] = pd.to_datetime(df['Stop'])
df

which gives the output

    Name    Start       Stop
0   John    2020-01-01  2020-03-31
1   John    2020-04-01  2020-12-31
2   Mary    2020-01-01  2020-03-15
3   Mary    2020-03-16  2020-03-31
4   Mary    2020-04-01  2020-12-31
5   Stan    2020-02-01  2020-03-31
6   Stan    2020-04-01  2020-12-31

I want to select all the records for every individual who has at least one record with a start date of 2020-01-01. That is, if someone has no record beginning on 2020-01-01, I don't want any of their records. The result should look like this:

    Name    Start       Stop
0   John    2020-01-01  2020-03-31
1   John    2020-04-01  2020-12-31
2   Mary    2020-01-01  2020-03-15
3   Mary    2020-03-16  2020-03-31
4   Mary    2020-04-01  2020-12-31

There should be no records for Stan in the output, because none of his entries start on 2020-01-01. Any ideas on how to accomplish this? Thanks!

Sean R
  • Related: [Pandas: remove group from the data when a value in the group meets a required condition](https://stackoverflow.com/questions/34690756/pandas-remove-group-from-the-data-when-a-value-in-the-group-meets-a-required-co) – anky Apr 05 '21 at 16:30

1 Answer

Build a boolean mask for the condition, group it by Name, and use transform('any') to broadcast the group-level result back to every row:

df[df['Start'].eq("2020-01-01").groupby(df["Name"]).transform('any')]

   Name      Start       Stop
0  John 2020-01-01 2020-03-31
1  John 2020-04-01 2020-12-31
2  Mary 2020-01-01 2020-03-15
3  Mary 2020-03-16 2020-03-31
4  Mary 2020-04-01 2020-12-31
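
For readers who want to see what the one-liner does, here is a sketch that splits it into steps, using the sample data from the question. The variable names (`starts_on_target`, `keep`, `result_isin`) are illustrative, and the `isin` variant at the end is an alternative I am adding, not part of the approach above.

```python
import pandas as pd

records = [
    {"Name": "John", "Start": "2020-01-01", "Stop": "2020-03-31"},
    {"Name": "John", "Start": "2020-04-01", "Stop": "2020-12-31"},
    {"Name": "Mary", "Start": "2020-01-01", "Stop": "2020-03-15"},
    {"Name": "Mary", "Start": "2020-03-16", "Stop": "2020-03-31"},
    {"Name": "Mary", "Start": "2020-04-01", "Stop": "2020-12-31"},
    {"Name": "Stan", "Start": "2020-02-01", "Stop": "2020-03-31"},
    {"Name": "Stan", "Start": "2020-04-01", "Stop": "2020-12-31"},
]
df = pd.DataFrame(records)
df["Start"] = pd.to_datetime(df["Start"])
df["Stop"] = pd.to_datetime(df["Stop"])

# Step 1: row-level mask, True only where Start is exactly the target date.
starts_on_target = df["Start"].eq(pd.Timestamp("2020-01-01"))

# Step 2: within each Name, mark every row True if any row in that group is True.
keep = starts_on_target.groupby(df["Name"]).transform("any")

# Step 3: boolean indexing keeps whole groups.
result = df[keep]

# Alternative sketch: collect the qualifying names, then filter with isin.
names = df.loc[starts_on_target, "Name"].unique()
result_isin = df[df["Name"].isin(names)]
```

Both variants keep John's and Mary's rows and drop Stan's. The `transform` version avoids materialising the list of names, while the `isin` version may be easier to read.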
anky
  • Works perfectly for my example data but doesn't work for my real data for some reason. It just returns an empty dataframe with the column headers. I'll try and diagnose. Thanks! – Sean R Apr 05 '21 at 17:05
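
The thread does not say why the real data returned an empty frame. One possible cause, purely as an assumption, is that the real Start values carry a time-of-day component, so an exact comparison against midnight on 2020-01-01 matches nothing. The sketch below uses made-up data (the `raw` frame is hypothetical) to show that case and how comparing on the calendar day with `dt.normalize()` might address it.

```python
import pandas as pd

# Hypothetical data: John's first record starts at 08:30, not midnight.
raw = pd.DataFrame({
    "Name": ["John", "John", "Stan"],
    "Start": ["2020-01-01 08:30:00", "2020-04-01 00:00:00", "2020-02-01 00:00:00"],
})

start = pd.to_datetime(raw["Start"])
target = pd.Timestamp("2020-01-01")

# Exact comparison: no timestamp equals midnight on the target date.
exact = start.eq(target)

# Comparing on the calendar day: normalize() resets the time part to 00:00:00.
by_day = start.dt.normalize().eq(target)

filtered = raw[by_day.groupby(raw["Name"]).transform("any")]
```

Checking `df["Start"].dtype` and a few raw values is a reasonable first diagnostic step before applying a fix like this.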