37

I want to know how to write a SQL query whose result depends on comparing each row with the next or previous row. I'm using PostgreSQL.

Example
Suppose I have a table with two columns (an ordered position and a random number). I want to retrieve the odd numbers that sit between even numbers. How can I do this?

The real usage
I want to find words that sit between two other words which have the category NAME (where the word itself is not a name). The ordering is given by sentence and position.

Edit: I want to know whether PostgreSQL window functions are a better solution for this kind of problem than plain queries. I have heard about them, but never used them.

Renato Dinhani

3 Answers

55

This is my solution using window functions. I used the lag and lead functions. Both return the value of a column from a row at a given offset from the current row: lag looks backward by that offset and lead looks forward.

SELECT tokcat.text
FROM (
    SELECT text, category, chartype, lag(category,1) OVER w as previousCategory, lead(category,1) OVER w as nextCategory
    FROM token t, textBlockHasToken tb
    WHERE tb.tokenId = t.id
    WINDOW w AS (
        PARTITION BY textBlockId, sentence
        ORDER BY textBlockId, sentence, position
    )
) tokcat
WHERE 'NAME' = ANY(previousCategory)
AND 'NAME' = ANY(nextCategory)
AND 'NAME' <> ALL(category)  -- ALL, not ANY: no element of category may be 'NAME'
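
A note on the "not a name" check in the query above, as a minimal sketch. It assumes category is a text[] column (which the use of ANY suggests); the array literal is made up for illustration. With an array, the quantifier matters: `<> ANY` is true as soon as one element differs from 'NAME', while `<> ALL` is true only if no element equals 'NAME'.

```sql
-- Assumption: category is text[]; the values below are illustrative only.
SELECT 'NAME' <> ANY (ARRAY['NAME', 'NOUN'])      AS any_differs   -- true:  'NOUN' differs from 'NAME'
     , 'NAME' <> ALL (ARRAY['NAME', 'NOUN'])      AS none_is_name  -- false: one element is 'NAME'
     , NOT ('NAME' = ANY (ARRAY['NAME', 'NOUN'])) AS not_any_name; -- false: same meaning as <> ALL here
```

So for a token tagged with both NAME and another category, only the `<> ALL` (or `NOT ... = ANY`) form excludes it.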

Simplified version:

SELECT text
FROM (
    SELECT text
          ,category 
          ,lag(category) OVER w as previous_cat
          ,lead(category) OVER w as next_cat
    FROM   token t
    JOIN   textblockhastoken tb ON tb.tokenid = t.id
    WINDOW w AS (PARTITION BY textblockid, sentence ORDER BY position)
    ) tokcat
WHERE  category <> 'NAME'
AND    previous_cat = 'NAME'
AND    next_cat = 'NAME';

Major points

  • = ANY() is not needed if category is a single value rather than an array; lag() and lead() just return that column's value from the neighboring row
  • some fields in the subquery are redundant
  • no need to ORDER BY the columns that you PARTITION BY - the ORDER BY applies within partitions
  • Don't use mixed-case identifiers without quoting them; it only leads to confusion. (Better yet: don't use mixed-case identifiers in PostgreSQL at all.)
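
The same lag()/lead() pattern can be sketched for the toy example in the question (odd numbers between even numbers). The table name numbers, its columns pos and num, and the sample rows are assumptions made up for illustration, and "between" is read here as "directly between two neighbors".

```sql
-- Hypothetical table and data, not taken from the question.
CREATE TEMP TABLE numbers (pos int PRIMARY KEY, num int NOT NULL);
INSERT INTO numbers VALUES
    (1, 4), (2, 7), (3, 8), (4, 3), (5, 5), (6, 2), (7, 9), (8, 6);

SELECT pos, num
FROM  (
    SELECT pos
          ,num
          ,lag(num)  OVER w AS prev_num   -- NULL for the first row
          ,lead(num) OVER w AS next_num   -- NULL for the last row
    FROM   numbers
    WINDOW w AS (ORDER BY pos)
    ) sub
WHERE  num % 2 = 1        -- the row itself is odd
AND    prev_num % 2 = 0   -- both neighbors are even;
AND    next_num % 2 = 0   -- a NULL neighbor fails the test, so edge rows drop out
ORDER  BY pos;
```

With the sample rows above this returns (2, 7) and (7, 9). The subquery is needed because window functions cannot be used directly in a WHERE clause.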
Erwin Brandstetter
Renato Dinhani
  • +1 Cool solution! I edited in a simplified version. Delete it if you don't like it. Can you run tests with the three queries which perform faster and add it to your answer? Would be interesting ... – Erwin Brandstetter Nov 02 '11 at 12:09
  • @ErwinBrandstetter Only one thing: the ANY is necessary because category is an array, and not a single value. – Renato Dinhani Nov 02 '11 at 15:31
  • @ErwinBrandstetter I will try to follow the other tips too. I don't have much practice with real-world SQL yet. The other fields are there because the real query is a bit more complex, with more filters. I only posted the main goal, which is to get the word between names. – Renato Dinhani Nov 02 '11 at 15:41
25

You can find the best solution at this address:

http://blog.sqlauthority.com/2013/09/25/sql-server-how-to-access-the-previous-row-and-next-row-value-in-select-statement-part-4/

Query 1, for SQL Server 2012 and later versions:

SELECT
    LAG(p.FirstName) OVER (ORDER BY p.BusinessEntityID) AS PreviousValue,
    p.FirstName,
    LEAD(p.FirstName) OVER (ORDER BY p.BusinessEntityID) AS NextValue
FROM Person.Person p
GO

Query 2, for SQL Server 2005 and later versions:

WITH CTE AS (
    SELECT rownum = ROW_NUMBER() OVER (ORDER BY p.BusinessEntityID),
           p.FirstName
    FROM Person.Person p
)
SELECT
    prev.FirstName AS PreviousValue,
    CTE.FirstName,
    nex.FirstName AS NextValue
FROM CTE
LEFT JOIN CTE prev ON prev.rownum = CTE.rownum - 1
LEFT JOIN CTE nex ON nex.rownum = CTE.rownum + 1
GO
mnshahab
6

This should work:

SELECT w1.word AS word_before, w.word, w2.word AS word_after
FROM   word w
JOIN   word w1 USING (sentence)
JOIN   word w2 USING (sentence)
WHERE  w.category <> 'name'
AND    w1.pos = (w.pos - 1)
AND    w1.category = 'name'
AND    w2.pos = (w.pos + 1)
AND    w2.category = 'name'
  • Use two self-joins
  • All words must be in the same sentence (?) and in order.
  • Word before and word after have to be of category 'name'. Word itself not 'name'
  • This assumes that category IS NOT NULL

To answer your additional question: no, a window function would not be particularly useful in this case; a self-join is the magic word here.

Edit:
I stand corrected. Renato demonstrates a cool solution with the window functions lag() and lead().
Note the subtle differences:

  • the self-joins operate on absolute values: if the row with pos - 1 (or pos + 1) is missing, then the row with pos does not qualify.
  • Renato's version with lag() and lead() operates on the relative position of rows as established by ORDER BY.

In many cases (probably including the one at hand?) both versions lead to identical results. With gaps in the pos values, the results differ.
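
The difference can be sketched with a small test table. The rows below are made up for illustration, and sentence 2 deliberately has no row with pos = 2; the table layout follows the word table used in the self-join query above.

```sql
-- Hypothetical sample data; sentence 2 has a gap at pos = 2.
CREATE TEMP TABLE word (sentence int, pos int, word text, category text NOT NULL);
INSERT INTO word VALUES
    (1, 1, 'Anna', 'name')
   ,(1, 2, 'and',  'other')
   ,(1, 3, 'Bob',  'name')
   ,(2, 1, 'Carl', 'name')
   ,(2, 3, 'or',   'other')
   ,(2, 4, 'Dora', 'name');

-- Self-joins: match on absolute positions pos - 1 and pos + 1.
SELECT sentence, w.word
FROM   word w
JOIN   word w1 USING (sentence)
JOIN   word w2 USING (sentence)
WHERE  w.category <> 'name'
AND    w1.pos = w.pos - 1 AND w1.category = 'name'
AND    w2.pos = w.pos + 1 AND w2.category = 'name';

-- Window functions: match on the neighboring rows in ORDER BY pos.
SELECT sentence, word
FROM  (
    SELECT sentence, word, category
          ,lag(category)  OVER w AS previous_cat
          ,lead(category) OVER w AS next_cat
    FROM   word
    WINDOW w AS (PARTITION BY sentence ORDER BY pos)
    ) sub
WHERE  category <> 'name'
AND    previous_cat = 'name'
AND    next_cat = 'name'
ORDER  BY sentence;
```

With these rows the self-join query returns only 'and' from sentence 1, because sentence 2 has no row at pos 2. The window query returns both 'and' and 'or', since 'Carl' is the row directly before 'or' in the ordering.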

Erwin Brandstetter
  • Please take a look at my edit. I haven't tried this SQL yet, but I think it can solve the problem. The catch is that I will have more complicated things to do after this. Do you know if window functions are more appropriate for this? I have never used them. – Renato Dinhani Nov 02 '11 at 02:08
  • Take a look at my answer. I used window functions. I don't know if it is better or worse, but I like it, and now they seem easy to use. – Renato Dinhani Nov 02 '11 at 04:17