0

Based on the current schema I have been asked to find

Testing -- people who were untested and exposed to some one infectious -- Do not list anyone twice and do not list known sick people -- Exposed = at the same place, and overlap in time (No overlap time needed for simplicity)

From the query below I find my answer except I cannot remove the people who are 'postive' because the second part my query i.e the time lapse depends on the first part i.e the time the positive people went to the same locations.

select * from (
select DISTINCT person.PersonID, Register.LocID, Register.Checkin, Register.CheckOut
from person 
join Register on Person.PersonID = Register.PersonID 
join testing on person.PersonID  = testing.PersonID
where testing.Results is 'Positive' ) a 
join (
SELECT DISTINCT Person.PersonID, Register.LocID , Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID 
where person.PersonID  
not in (SELECT DISTINCT testing.PersonID from testing)) b on a.LocID = b.LocID 
and b.checkin >= a.CheckIn and b.CheckIn <= a.CheckOut

So my question is, What modification does this query need to show the results of the results of the second part only?

I consider the first part to be

select * from (
select DISTINCT person.PersonID, Register.LocID, Register.Checkin, Register.CheckOut
from person 
join Register on Person.PersonID = Register.PersonID 
join testing on person.PersonID  = testing.PersonID
where testing.Results is 'Positive' ) a 

And the second part to be

join (
SELECT DISTINCT Person.PersonID, Register.LocID , Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID 
where person.PersonID  
not in (SELECT DISTINCT testing.PersonID from testing)) b on a.LocID = b.LocID 
and b.checkin >= a.CheckIn and b.CheckIn <= a.CheckOut
Vij
  • 59
  • 5

3 Answers3

1

This is a complicated query. Because you do not want duplicates, I am going to suggest exists with the outer query just using persons.

The idea to get people in the same place at the same time is a self-join on register using both location and time overlaps. I think that is the most complex part of the query. The rest is checking if a person is or is not positive:

select p.*
from person p 
where not exists (select 1
                  from testing t
                  where t.personid = p.personId and t.results = 'positive'
                 ) and
      exists (select 1
              from register r1 join
                   register r2
                   on r1.locid = r2.locid and
                      r1.checkin < r2.checkout and
                      r2.checkout > r1.checkin join
                   testing t2
                   on r2.personid = t2.personid and
                      t2.results = 'positive' and
                      t2.timestamp < r2.checkout
              where r1.personid = p.personid
             );

The timing is a little tricky, but I think the timing makes sense. Someone needs to test positive before they are in the same place. Of course, you can remove the t2.timestamp < r2.checkout if there is no timing constraint.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you. The query came back with zero results (Could be my database). I am not sure why the result was such but I am not very clear on the idea of self join and exists hence I used 'not in'. I agree that is a very complex query. For me I cannot seem to shake it out of my head that it's not possible to get answer without showing the positive people because the time overlaps depends on the time of the positive people. I appreciate your help though as I know it's hard to given an answer without a built database. But since I use a lite version of sql I have trouble building one online. Thanks – Vij Aug 25 '20 at 11:06
  • 1
    @Vij . . . Can you set up a db<>fiddle? The issue may be the timing on the positive result relative to being in the same location. – Gordon Linoff Aug 25 '20 at 12:00
  • Hello Gordon, Thanks for your input. I managed to solve it, I tend to over complicate things and the question may have come out wrong from what I was trying to fin. As for db<>fiddle, I never knew this existed. I will look into it so that for future questions I can explain it better for you guys. – Vij Aug 25 '20 at 12:34
1

For readability you can create CTEs like this:

with
  -- returns all the untested persons
  untested as (select p.* from person p left join testing t on t.personid = p.personid where t.testingid is null), 
  --  returns all the infected persons
  infected as (select * from testing where results = 'Positive'),
  -- returns all the locids that infected persons visited and the start and dates of these visits
  loc_positive as (
    select r.locid, i.timestamp startdate, r.checkout enddate 
    from register r inner join infected i 
    on i.personid = r.personid and i.timestamp between r.checkin and r.checkout
  )
-- returns the distinct untested persons that visited the same locids with persons tested positive at the same time after they were tested 
select distinct u.*
from untested u 
inner join register r on r.personid = u.personid
inner join loc_positive lp on lp.locid = r.locid 
where lp.startdate <= r.checkout and lp.enddate >= r.checkin
forpas
  • 160,666
  • 10
  • 38
  • 76
  • Thank you. What I did not mention and my mistake, that is the untested person are the ones that are not present in the testing table. – Vij Aug 25 '20 at 12:35
  • 1
    This is what the `untested` cte returns. – forpas Aug 25 '20 at 12:36
  • Yes sorry, you're right. I use db browser for sql lite. If I put the query with ```untested as``` it returns with errors and I put in the query on it's own it comes in. – Vij Aug 25 '20 at 12:45
0

The solution to this answer was to add a distinct and a column name to the star in the first line.

select DISTINCT unt.PersonID from (
select  person.PersonID, Register.LocID, Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID join testing on person.PersonID  = testing.PersonID
where testing.Results is 'Positive' ) pos
join (
SELECT Person.PersonID, Register.LocID , Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID where person.PersonID  
not in (SELECT  testing.PersonID from testing)) unt on pos.LocID = unt.LocID 
and unt.checkin >= pos.CheckIn and unt.CheckIn <= pos.CheckOut;

Vij
  • 59
  • 5
  • 1
    `unt.checkin >= pos.CheckIn and unt.CheckIn <= pos.CheckOut` is not the correct way to test if an untested person was at the same time at the same place with an infected person. An untested person may have checked in before an infected person checked in and checked out after the infected person arrived. – forpas Aug 25 '20 at 13:27
  • You are correct. I tried adjust the query to ` and unt.checkin >= pos.CheckIn and unt.CheckIn <= pos.CheckOut and unt.CheckIn<= pos.CheckIn and unt.CheckOut <= pos.Checkout `. But this has come up zero results. Is it because of too many ands? – Vij Aug 26 '20 at 01:50
  • 1
    Did you try the query in my answer? The condition is: `where lp.startdate <= r.checkout and lp.enddate >= r.checkin` but `startdate` is the date that an infected person was tested positive. – forpas Aug 26 '20 at 12:11
  • Yes thank you. I was trying to add multiple clauses but yours is easier. – Vij Aug 26 '20 at 13:18