3

I am trying to implement pagination using nextPageToken.

I have table:

CREATE TABLE IF NOT EXISTS categories
(
    id        BIGINT PRIMARY KEY,
    name      VARCHAR(30) NOT NULL,
    parent_id BIGINT REFERENCES categories (id)
);

so I have entity Category:

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;
    @Column(name = "name")
    private String name;
    @OneToOne
    @JoinColumn(name = "parent_id", referencedColumnName = "id")
    private Category category;

I dont really understand what I have to do next. client requests token(that keeps what?)

Assume I have controller:

@GetMapping
    public ResponseEntity<CategoriesTokenResponse> getCategories(
            @RequestParam String nextPageToken
    ) {
        return ResponseEntity.ok(categoryService.getCategories(nextPageToken));
    }

Service:

public CategoriesTokenResponse getCategories(String nextPageToken) {
        return new CategoriesTokenResponse(categoryDtoList, "myToken");
    }
@Data
@AllArgsConstructor
public class CategoriesTokenResponse {
    private final List<CategoryDto> categories;
    private final String token;
}

How I have to implement sql query for that? And how I have to generate nextPagetoken for each id?

SELECT * FROM categories WHERE parent_id = what?
AND max(category id from previous page = token?)
ORDER BY id LIMIT 20;
andrew17
  • 851
  • 2
  • 10
  • 25
  • Have you thought about using spring boots built in pagination for REST or JPA? – Jason Mar 06 '20 at 21:33
  • one smart guy said, that its bad implemenation because of OFFSET and something else, that its bad idea, so I try to implement custom pagination – andrew17 Mar 06 '20 at 21:37
  • I would argue, if there is a battle proven solution that you could use, use it unless you have proof it doesn't work for you, i.e. you have profiled it to find that this implementation is, what kills your performance. Otherwise you'll most likely end up wasting money on micro optimization, and propably can't even show that you have optimized anything at all, since comparison measurements are missing. – derM - not here for BOT dreams Mar 06 '20 at 21:49
  • Don't listen to everything you hear. Even me, right now. – Jason Mar 06 '20 at 21:50
  • @derM So PagingAndSortingRepository is always used in production? Do I need there field parent_id as I have now? – andrew17 Mar 06 '20 at 21:53
  • @Jason https://use-the-index-luke.com/no-offset he gave me this topic and this https://developers.marketo.com/rest-api/paging-tokens/ – andrew17 Mar 06 '20 at 21:55
  • I don't know about always, but I am certain, often. But if you read the first article, there is a pretty clear sql listing, how he would implement it - using a total order on the key, and querying only keys larger than the last seen key. If you are not sorting for anything more thant the ID, use `long` as token type. `WHERE whatever you like`, `AND id > lastSeenId` – derM - not here for BOT dreams Mar 06 '20 at 22:09
  • lastseenId mmm amazing, how have to interprete it in real code?.......... @derM – andrew17 Mar 06 '20 at 22:42

1 Answers1

6

First, you need to understand what you are working with here. Every implementation has some sort of limitation or inefficiency. For example, using page tokens like that is only good for infinite scroll pages. You can't jump to any specific page. So if my browser crashes and I'm on page 100, I have to scroll through 100 pages AGAIN. It is faster for massive data sets for sure, but does that matter if you need access to all pages? Or if you limit the return to begin with? Such as only getting the first 30 pages?

Basically decide this: Do you only care about the first few pages because search/sort is always in use? (like a user never using more than the first 1-5 pages of google) and is that data set large? Then great use-case. Will a user select "all items in the last 6 months" and actually need all of them or is the sort/search weak? Or, will you return all pages and not limit max return of 30 pages? Or, is development speed more important than a 0.1-3 second (depends on data size) speed increase? Then go with built in JPA Page objects.

I have used Page objects on 700k records with less than a second speed change compared to 70k records. Based on that, I don't see removing offset adding a ton of value unless you plan for a huge data set. I just tested a new system I'm making with pageable, it returned 10 items on page 1 in 84 milliseconds with no page limiter for 27k records on a vpn into my work network from my house. A table with over 500k records took 131 milliseconds That's pretty fast. Want to make it faster? Force a total max return of 30 pages and a max of 100 results per page, because normally, they don't need all data in that table. They want something else? refine the search. The speed difference is less than a second between this and the seek/key stype paging. This is assuming a normal SQL database too. NoSQL is a bit different here. Baeldung has a ton of articles on jpa paging like the following: https://www.baeldung.com/rest-api-pagination-in-spring

JPA Paging should take no more than 30 minutes to learn and implement, it's extremely easy and comes stock on JPA repositories. I strongly suggest using that over the seek/key style paging as you likely aren't building a system like google's or facebook's.

If you absolutely want to go with the seek/key style paging there's a good informational page here: https://blog.jooq.org/2013/10/26/faster-sql-paging-with-jooq-using-the-seek-method/

In general, what you are looking for is using JOOQ with spring. Implementation example here: https://docs.spring.io/spring-boot/docs/1.3.5.RELEASE/reference/html/boot-features-jooq.html

Basically, create a DSL context:

private final DSLContext DSL;

@Autowired
public JooqExample(DSLContext dslContext) {
    this.DSL= dslContext;
}

Then use it like so:

DSL.using(configuration)
   .select(PLAYERS.PLAYER_ID,
           PLAYERS.FIRST_NAME,
           PLAYERS.LAST_NAME,
           PLAYERS.SCORE)
   .from(PLAYERS)
   .where(PLAYERS.GAME_ID.eq(42))
   .orderBy(PLAYERS.SCORE.desc(),
            PLAYERS.PLAYER_ID.asc())
   .seek(949, 15) // (!)
   .limit(10)
   .fetch();

Instead of explicitly phrasing the seek predicate, just pass the last record from the previous query, and jOOQ will see that all records before and including this record are skipped, given the ORDER BY clause.

Nox
  • 1,358
  • 1
  • 13
  • 25
  • Thank you, it is very useful answer! – andrew17 Mar 24 '20 at 17:28
  • 2
    I did a lot of research on the subject when I made my first api because it was for a system that would get 1.6m records per year. Also, I suggest reading about the link header. I hate containers, so instead of having a container that holds the object data, you return the first,last,next,previous in a link header and the body can hold just the object data. I also don't use hateoas unless it's a big system as well. It's kind of useless unless you have a ton of consumers or a constant changing system. – Nox Mar 25 '20 at 16:35