-1

I want to be able to trace backwards through a table of movement of materials, equipment used to move it and timings, starting at a specified value in a field and finding that value in another field in different row(s) based on a shared value.

In the table below, I need to be able to trace back, for example, from the EndCode 'M' all the related rows all the way back to StartCode A. I might then want to be able to trace back through all rows from EndCode 'U' back to StartCode 'N'.

In the table, StartCode (material) A and B are moved to become EndCode C. EndCode C then become StartCodeD, is then moved with StartCode E to become EndCode F etc. The orange/blue rows represent groups of material movements.

From this, I want to create a new table/view which brings through the StartedAt time of the next event as a new field named 'EndedAt'. It will look like this:

I've created a recursive query that uses ROW_NUMBER and CTE to bring through the StartedAt as a new field as "EndedAt".

I then tried to use a nested SELECT statement to find all the StartCodes related to EndCode 'M' in an attempt to loopback through the table to search for all the relevant EndCodes in the StartCode column. This only brings through a few of the rows. I treid I repeat the nested statement a few times it brings through different rows.

I need help to pick up all the relevant Start/End Codes needed to create the desired table/view.

Eventually a report will be attached to the table allowing the table to be filterable by a chosen EndCode parameter (e.g. M, U etc.)

    CREATE TABLE MyTable (
  `StartCode` VARCHAR(1),
  `StartedAt` DATETIME,
  `EndCode` VARCHAR(1)
);

INSERT INTO MyTable (`StartCode`, `StartedAt`, `EndCode`)
VALUES
  ('A', '01/01/2019 01:00', 'C'),
  ('B', '01/01/2019 02:15', 'C'),
  ('C', '01/01/2019 03:00', 'F'),
  ('D', '01/01/2019 03:19', 'F'),
  ('E', '01/01/2019 04:00', 'F'),
  ('F', '01/01/2019 04:14', 'G'),
  ('G', '01/01/2019 05:00', 'J'),
  ('H', '01/01/2019 05:37', 'J'),
  ('I', '01/01/2019 05:45', 'J'),
  ('J', '01/01/2019 06:00', 'L'),
  ('K', '01/01/2019 06:09', 'L'),
  ('L', '01/01/2019 07:00', 'M'),
  ('N', '01/01/2019 09:20', 'P'),
  ('O', '01/01/2019 09:37', 'P'),
  ('P', '01/01/2019 09:45', 'Q'),
  ('Q', '01/01/2019 11:00', 'T'),
  ('R', '01/01/2019 11:10', 'T'),
  ('S', '01/01/2019 11:47', 'T'),
  ('T', '01/01/2019 11:58', 'U');
Jimbo
  • 67
  • 9
  • 2
    It would be better to provide your sample data as `formatted text` or, even better, in a sql fiddle. This makes it easier for us to help you. – Tony Jan 09 '19 at 12:48
  • 2
    The EndedAt column is a simple and direct join (no recursive nor CTE required). For the M to A, if you can post a full DML and DDL example I can help you out. Also remove the bucket columns as they arent important to your problem. – EzLo Jan 09 '19 at 12:50
  • 1
    Please specify the exact output you want from the hierarchy relationships. – EzLo Jan 09 '19 at 12:56
  • OK, used SL Fiddle, see edit. Thanks :) – Jimbo Jan 09 '19 at 13:50

1 Answers1

1

The EndedAt is a simple join:

SELECT
    S.StartCode,
    S.StartedAt,
    S.EndCode,
    E.StartedAt AS EndedAt
FROM
    MyTable AS S
    LEFT JOIN MyTable AS E ON S.EndCode = E.StartCode

Result:

StartCode   StartedAt                   EndCode     EndedAt
A           2019-01-01 01:00:00.000     C           2019-01-01 03:00:00.000
B           2019-01-01 02:15:00.000     C           2019-01-01 03:00:00.000
C           2019-01-01 03:00:00.000     F           2019-01-01 04:14:00.000
D           2019-01-01 03:19:00.000     F           2019-01-01 04:14:00.000
E           2019-01-01 04:00:00.000     F           2019-01-01 04:14:00.000
F           2019-01-01 04:14:00.000     G           2019-01-01 05:00:00.000
G           2019-01-01 05:00:00.000     J           2019-01-01 06:00:00.000
H           2019-01-01 05:37:00.000     J           2019-01-01 06:00:00.000
I           2019-01-01 05:45:00.000     J           2019-01-01 06:00:00.000
J           2019-01-01 06:00:00.000     L           2019-01-01 07:00:00.000
K           2019-01-01 06:09:00.000     L           2019-01-01 07:00:00.000
L           2019-01-01 07:00:00.000     M           NULL
N           2019-01-01 09:20:00.000     P           2019-01-01 09:45:00.000
O           2019-01-01 09:37:00.000     P           2019-01-01 09:45:00.000
P           2019-01-01 09:45:00.000     Q           2019-01-01 11:00:00.000
Q           2019-01-01 11:00:00.000     T           2019-01-01 11:58:00.000
R           2019-01-01 11:10:00.000     T           2019-01-01 11:58:00.000
S           2019-01-01 11:47:00.000     T           2019-01-01 11:58:00.000
T           2019-01-01 11:58:00.000     U           NULL

You can display the hierarchies with the following (recursive CTE with bottoms up approach in this case). Make sure that you have no loops in your data first.

Edit: If you want to check for any step in the hierarchy and upwards, the anchor needs to be any code (not just the last M or U), so I removed the WHERE in the anchor.

DECLARE @EndCodeFilter CHAR(1) = 'J'

;WITH RecursiveCodes AS
(
    -- Anchor
    SELECT
        LastCode = M.EndCode,
        CurrentCode = M.StartCode,
        PreviousCode = M.EndCode,
        RecursionLevel = 1,
        RecursionPath = CONVERT(NVARCHAR(MAX), M.EndCode + '->' + M.StartCode),
        CurrentStartAt = M.StartedAt
    FROM
        MyTable AS M

    UNION ALL

    -- Recursion: link related codes
    SELECT
        LastCode = R.LastCode,
        CurrentCode = M.StartCode,
        PreviousCode = M.EndCode,
        RecursionLevel = R.RecursionLevel + 1,
        RecursionPath = R.RecursionPath + '->' + M.StartCode,
        CurrentStartAt = M.StartedAt
    FROM
        RecursiveCodes AS R
        INNER JOIN MyTable AS M ON R.CurrentCode = M.EndCode
)
SELECT
    R.CurrentCode,
    R.CurrentStartAt,
    R.LastCode,
    EndedAt = E.StartedAt,
    R.PreviousCode,
    R.RecursionLevel,
    R.RecursionPath
FROM
    RecursiveCodes AS R
    LEFT JOIN MyTable AS E ON R.LastCode = E.StartCode
WHERE
    R.LastCode = @EndCodeFilter
ORDER BY
    R.CurrentCode,
    R.LastCode
OPTION
    (MAXRECURSION 0)

Result:

CurrentCode CurrentStartAt              LastCode    EndedAt                     PreviousCode    RecursionLevel  RecursionPath
A           2019-01-01 01:00:00.000     J           2019-01-01 06:00:00.000     C               4               J->G->F->C->A
B           2019-01-01 02:15:00.000     J           2019-01-01 06:00:00.000     C               4               J->G->F->C->B
C           2019-01-01 03:00:00.000     J           2019-01-01 06:00:00.000     F               3               J->G->F->C
D           2019-01-01 03:19:00.000     J           2019-01-01 06:00:00.000     F               3               J->G->F->D
E           2019-01-01 04:00:00.000     J           2019-01-01 06:00:00.000     F               3               J->G->F->E
F           2019-01-01 04:14:00.000     J           2019-01-01 06:00:00.000     G               2               J->G->F
G           2019-01-01 05:00:00.000     J           2019-01-01 06:00:00.000     J               1               J->G
H           2019-01-01 05:37:00.000     J           2019-01-01 06:00:00.000     J               1               J->H
I           2019-01-01 05:45:00.000     J           2019-01-01 06:00:00.000     J               1               J->I
EzLo
  • 13,780
  • 10
  • 33
  • 38
  • I'll give that a try., thanks, Ezlo. The only issue I can see is that if, based on my example in initial post, I just want, for example, trace back all the Start/EndCodes that are linked to EndCode M and EndCode and display that information as laid out in your first set of results. The only way is by following the Stat/EndCodes from letter M back to A, which I can't do with you (much appreciated) help? – Jimbo Jan 09 '19 at 19:24
  • The hierarchies code works, had to change the CONVERT part to NVARCHAR not VARCHAR and It worked. I just need to be able to display a table like in your first output which I can just say "show me only those rows that are related to EndCode M". I want would use a @parameter using the EndCode field as the values to choose from. – Jimbo Jan 09 '19 at 20:07
  • @Jimbo I've edited the answer to query by parameter. – EzLo Jan 10 '19 at 08:03
  • Thanks for all the help, @EnZo. The problem is the EndedAt of each row should be the StartedAt value of the next row, at present it is bringing through only the last StartedAt date for each row. Also, if you change the parameter letter J to letter M or U, the EndedAt values are all Null. – Jimbo Jan 10 '19 at 09:46
  • @Jimbo I believe you can edit the query to suit your needs now. Also M and U don't have time stamps so you will never know their end date. – EzLo Jan 10 '19 at 09:51
  • I've modified it so that the correct date is shown where I wanted it. Now I have pointed it towards my real data it is just very slow executing, over half an hour so far. Theres 11000 rows in my table.. – Jimbo Jan 10 '19 at 15:10
  • @Jimbo create an index on the codes columns if you dont have one yet. `CREATE NONCLUSTERED INDEX NCI_Codes ON TableName (EndCode, StartCode)` – EzLo Jan 10 '19 at 15:27
  • I;ve found the problem. It doesn't stop running as there are duplicates in both the Start and End Code columns. In the table I do have a unique ID for each row, can I do something with that? If not, is there anyway of doing this with duplicates? – Jimbo Jan 10 '19 at 21:30
  • now we know there are duplicates in codes, can I alter you rcode easily to accommodate these? – Jimbo Jan 11 '19 at 10:58