47

I noticed some repeating rows in a paginated recordset.

When I run this query:

SELECT "students".* 
FROM "students" 
ORDER BY "students"."status" asc 
LIMIT 3 OFFSET 0

I get:

    | id | name  | status |
    | 1  | foo   | active |
    | 12 | alice | active |
    | 4  | bob   | active |

Next query:

SELECT "students".* 
FROM "students" 
ORDER BY "students"."status" asc 
LIMIT 3 OFFSET 3

I get:

    | id | name  | status |
    | 1  | foo   | active |
    | 6  | cindy | active |
    | 2  | dylan | active |

Why does "foo" appear in both queries?

Old Pro
keewooi

3 Answers

86

Why does "foo" appear in both queries?

Because all rows that are returned have the same value for the status column. In that case the database is free to return the rows in any order it wants.

If you want a reproducible ordering, you need to add a second column to your ORDER BY clause to break ties consistently, e.g. the ID column:

SELECT students.* 
FROM students 
ORDER BY students.status asc, 
         students.id asc

If two rows have the same value for the status column, they will be sorted by the id.
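Applied to the two paginated queries from the question, this could look like the sketch below. It assumes that id is unique (for example, the primary key) and that the table is not modified between the two queries; under those assumptions each row has exactly one position in the ordering, so no row should show up on both pages.

```sql
-- Page 1: the first three rows in (status, id) order
SELECT students.*
FROM students
ORDER BY students.status ASC,
         students.id ASC      -- tie-breaker; assumed to be unique
LIMIT 3 OFFSET 0;

-- Page 2: the next three rows in the same (status, id) order
SELECT students.*
FROM students
ORDER BY students.status ASC,
         students.id ASC      -- must match page 1 exactly
LIMIT 3 OFFSET 3;
```

The ORDER BY list has to be identical on every page; changing the tie-breaker between pages brings the overlap back.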

  • 2
    Thanks for the answer! Is this PostgreSQL only? I can't reproduce this kind of behavior in MySQL. – keewooi Nov 27 '12 at 09:40
  • 1
    @amazoom: I don't really know MySQL that well, but the database is free to return the rows in any order it sees fit. My guess(!!) is that MySQL uses the clustered index to return the rows in case of identical values, and as the clustered index basically sorts the rows in a table, this leads to the result you see. PostgreSQL will return them in the order they are retrieved. But you should **not** rely on that ordering in MySQL either. An index scan, a join or other things *can* change that. –  Nov 27 '12 at 10:02
  • For future viewers: https://github.com/cakephp/cakephp/issues/1827. It is an issue with PostgreSQL. – Badman Jun 04 '18 at 14:18
  • 1
    @Badman: that's not an "issue" that's well documented behaviour. Anyone who relies on a specific sort order if no `order by` is specified puts a bug into her/his software. –  Jun 04 '18 at 14:22
  • @a_horse_with_no_name I tried ordering by the primary key. It still returns results when I give a large offset, even though that many rows don't exist. – Badman Jun 04 '18 at 14:54
29

For more details, see the PostgreSQL documentation (http://www.postgresql.org/docs/8.3/static/queries-limit.html):

When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY.

The query optimizer takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.
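One way to check whether the planner treats the two pages differently is to prefix each query from the question with EXPLAIN and compare the output. This is only a sketch: on a small table both plans may well be identical, and an identical plan still does not guarantee a stable order among rows with the same status.

```sql
-- Show the plan PostgreSQL chooses for the first page
EXPLAIN
SELECT "students".*
FROM "students"
ORDER BY "students"."status" ASC
LIMIT 3 OFFSET 0;

-- Show the plan for the second page; compare it with the one above
EXPLAIN
SELECT "students".*
FROM "students"
ORDER BY "students"."status" ASC
LIMIT 3 OFFSET 3;
```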

Ahmed MANSOUR
1
-- Page 1: first 3 rows of the 6-row ordered subquery
SELECT * FROM (
    SELECT "students".*
    FROM "students"
    ORDER BY "students"."status" ASC
    LIMIT 6
) AS temp LIMIT 3 OFFSET 0;

-- Page 2: next 3 rows of the same subquery
SELECT * FROM (
    SELECT "students".*
    FROM "students"
    ORDER BY "students"."status" ASC
    LIMIT 6
) AS temp LIMIT 3 OFFSET 3;

Here, 6 is the total number of records under examination.

Otter