How do I determine if a column value has changed back and forth during a date range?

Question

Background:

We have a working query for selecting what type of access (d8_5_B.xpostaccessuse) a patient (column xpid in any table) has in use 90 days after the date they became chronic (d1.ddiadate +'90 days'). Output is two columns: one date, one integer.

Query:

SELECT
    d8_B.start_date,
    d8_5_B.xpostaccessuse
FROM d1 
    LEFT JOIN d8 d8_B 
        ON d1.xpid = d8_B.xpid
    LEFT JOIN d8_5 d8_5_B 
        ON d8_B.xsession_id = d8_5_B.xsession
WHERE
    d8_B.start_date IN (
        SELECT MAX(d8.start_date)
        FROM d8
            LEFT JOIN d1 
                ON d8.xpid = d1.xpid 
        WHERE d8.start_date <= d1.ddiadate + '90 days'
        GROUP BY d8.xpid
    )

Goal:

We need to be able to determine whether or not their access changed back and forth during this 90 day period. Ideally we would like to also be able to highlight what it switched to and when, but smaller victories are still helpful.

We know d8_5_B.xpostaccessuse = 3 on d1.ddiadate.
We know d8_5_B.xpostaccessuse = 3 on d1.ddiadate +'90 days'.

But if their access changed from 3 to say 5 and back to 3 again during this 90 day range, we have no way to tell that it was not 3 all along.

How can we determine:

if the value changed from 3 to something else and back to 3 during this 90 days?
what it changed to, and when?

Work so far:

Our best thoughts so far involve the use of the count() aggregate to count the total rows in the date range, then count the rows where d8_5_B.xpostaccessuse = 3 and compare the results. This would at tell us if there is a discontinuity, even if not what it is. We have been unable to get this approach to work, primarily due to the required group by.

The other idea we had was trying to select the distinct values d8_5_B.xpostaccessuse and group by xpid, but the same issue arises with the group by.

Updates:

Ingres 9
Open to suggestions for the "what and when" portion. I was picturing maybe a column for each, what it changed to in the middle and on what date this change started. This obviously gets tricky if a patient had their access change multiple times, but this shouldn't happen often. For now, reporting an extra row would be acceptable unless someone thinks of a more elegant solution.

Sample Table (credit to G Jones below):

create table d1
(xpid integer, ddiadate ingresdate);

create table d8
(xpid integer, xsession_id integer, start_date ingresdate);

create table d8_5
(xpid integer, xsession integer, xpostaccessuse integer);

insert into d1 values(1, '2018-01-01');
insert into d1 values(2, '2018-01-01');

insert into d8 values(1, 1, '2018-01-01');
insert into d8 values(1, 2, '2018-01-10');
insert into d8 values(1, 3, '2018-01-20');

insert into d8 values(2, 1, '2018-01-01');
insert into d8 values(2, 2, '2018-01-10');
insert into d8 values(2, 3, '2018-01-20');

insert into d8_5 values(1, 1, 3);
insert into d8_5 values(1, 2, 3);
insert into d8_5 values(1, 3, 3);

insert into d8_5 values(2, 1, 3);
insert into d8_5 values(2, 2, 5);
insert into d8_5 values(2, 3, 3);

Table defintions as they pertain to this task:

d1: Demographics Table/Basic Patient Info
- Patient ID & table index (d1.xpid)
- Date First Chronic (d1.ddiadate)
d8: Session Table
- Session ID & table index (d8.xsession_id)
- Patient ID (d8.xpid)
- Session Start DateTime (d8.start_date)
d8_5: Session Subtable with Access Data
- Session ID & table index (d8_5.xsession)
- Patient ID (d8_5.xpid)
- Access Type (d8_5.xpostaccessuse)
Note that all three tables can be joined on xpid and d8.xsession_id = d8_5.xsession

BTW, are you really on Ingres ? v11 ? (I may go on pure SQL92 as I don't have any DB to try my query on...) — Blag, Feb 19 '18 at 20:34
Ok, seem like Ingres don't have String Aggregate function, that'll not help... How do you want your `what it changed to, and when?` ? Is one result line by access a patient had (with a start/end date) good for you ? — Blag, Feb 19 '18 at 21:08
Please use whatever generic flavor of SQL with which you are most comfortable. I would expect there will be a lack of functionality in our old version of Ingres, but in the past it's been best to cross that bridge when we get there. Updated answers to your two questions above. — Jackson, Feb 20 '18 at 12:31
G Jones created a nice data sample and create table below. I'll copy it into the Updates portion of the question with credit to him. — Jackson, Feb 20 '18 at 12:32
Thanks for the update; could you explain the basic of d1/d8/d8_5 ? I miss a point I think u_u — Blag, Feb 20 '18 at 12:44
Sure, I'll update that quickly too. Check out my note on your query below, hosed by old Ingres. — Jackson, Feb 20 '18 at 12:50

Blag · Answer 1 · 2018-02-19T20:52:51.047

1

You're probably after some select COUNT(DISTINCT d8_5_B.xpostaccessuse) using a group by to show you how many state the patient had.

Provide a data-test and the create table if you want the full request ;)

Edit:

This will probably show you the number of distinct access over the 90 days (I've used INNER JOIN, as it seem your left is useless)

SELECT
    d8.start_date,
    d5.xpostaccessuse,
    d1.access_count
FROM (
    SELECT 
        d8.xpid, 
        MAX(d8.start_date) date_max, 
        COUNT(DISTINCT d5.xpostaccessuse) access_count
    FROM d8
    INNER JOIN d1 
        ON d8.xpid = d1.xpid 
    INNER JOIN d8_5 d5 
        ON d8.xsession_id = d5.xsession
    WHERE 
        d8.start_date > d1.ddiadate 
        AND d8.start_date <= d1.ddiadate + '90 days'
    GROUP BY d8.xpid
) d1 
INNER JOIN d8 
    ON d1.xpid = d8.xpid
    AND d8.start_date = d1.date_max
INNER JOIN d8_5 d5 
    ON d8.xsession_id = d5.xsession

edited Feb 19 '18 at 20:52

answered Feb 19 '18 at 20:30

Blag

5,818
2
22
45

`FROM (SELECT...` is not supported in Ingres 9. Unfortunately our vendor is still on Ingres 9. This functionality was supported starting in Ingres 10, which we hope to upgrade to soon on production. We are on Ingres 10 on test, and your query runs. It outputs one row per date, instead of one row per patient, but that would be easy enough to fix if it weren't for being stuck on Ingres 9. Thanks for the suggestion. Let me know if you come up with anything even more basic. – Jackson Feb 20 '18 at 12:49
1

@Jackson you mean you can't even make FROM sub-query ? o_o gosh, what the hell is this DB... (I can't even find the doc for v9...) . I'll check a bit more this evening. – Blag Feb 20 '18 at 13:03
Yep. Welcome to my hell. Hoping Ingres 10 is in the near future, it has far more functionality. – Jackson Feb 20 '18 at 15:20

score 1 · Answer 2 · answered Feb 20 '18 at 10:30

Since Ingres is mentioned here, one possible approach might be to use a row-producing database procedure to identify which rows aren't changing and avoid returning any you're not interested in. An example is below based on my interpretation of what the tables look like, hope it's of some use.

-- Some sample data.

create table d1
(xpid integer, ddiadate ingresdate);

create table d8
(xpid integer, xsession_id integer, start_date ingresdate);

create table d8_5
(xpid integer, xsession integer, xpostaccessuse integer);

insert into d1 values(1, '2018-01-01');
insert into d1 values(2, '2018-01-01');

insert into d8 values(1, 1, '2018-01-01');
insert into d8 values(1, 2, '2018-01-10');
insert into d8 values(1, 3, '2018-01-20');

insert into d8 values(2, 1, '2018-01-01');
insert into d8 values(2, 2, '2018-01-10');
insert into d8 values(2, 3, '2018-01-20');

insert into d8_5 values(1, 1, 3);
insert into d8_5 values(1, 2, 3);
insert into d8_5 values(1, 3, 3);

insert into d8_5 values(2, 1, 3);
insert into d8_5 values(2, 2, 5);
insert into d8_5 values(2, 3, 3);

-- Query to show patient's access type at each session
-- during the 3 months before becoming chronic.

select d1.xpid, d8.start_date, d8_5.xpostaccessuse
from d1
join d8 on d1.xpid = d8.xpid
join d8_5 on d1.xpid = d8_5.xpid and d8.xsession_id = d8_5.xsession
where d8.start_date <= d1.ddiadate + '90 days'
and d8.start_date >= d1.ddiadate
order by xpid, start_date;

-- Use a row-producing database procedure, based on the
-- above query, to return a row only when data changes.

create procedure rpp
result row r (xpid integer, start_date date, xpostaccessuse integer) =
declare
  xpid = integer;
  sd = ingresdate;
  xp = integer;
  xplast = integer;
  xpidlast = integer;
begin
  xplast = -1; xpidlast = -1;
  for
  select d1.xpid, d8.start_date, d8_5.xpostaccessuse
  into :xpid, :sd, :xp
  from d1
  join d8 on d1.xpid = d8.xpid
  join d8_5 on d1.xpid = d8_5.xpid and d8.xsession_id = d8_5.xsession
  where d8.start_date <= d1.ddiadate + '90 days'
  and d8.start_date >= d1.ddiadate
  order by xpid, d8.start_date desc
  do
    if :xplast != :xp or :xpidlast != :xpid
    then
      return row(:xpid, :sd, :xp);
      xplast = :xp;
      xpidlast = :xpid;
    endif;
  endfor;
end;

-- Query the row-producing procedure.

select * from rpp() order by xpid, start_date;

While I am inexperienced with this, we do have the ability to create and use stored procedures. I'll look into it today, thanks — Jackson, Feb 20 '18 at 12:03

score 0 · Answer 3 · answered Oct 09 '18 at 17:34

In this flavor of SQL, Ingres 9 (pretty outdated and maybe irrelevant to the rest of the internet, but in case it's not...), there are lots of restrictions on where you can use subqueries.

I ended up creating views for subqueries that needed to be placed in "illegal" places, joined the views as tables and got the same effect.

Good luck!

How do I determine if a column value has changed back and forth during a date range?

3 Answers3