let's suppose that we have a huge mongodb collection (around 60,000,000 records) and we want to scroll over it without fetching of course a huge amount in memory. For Hibernate orm, it's possible as there is the scrolling api and criteria queries but what could be a solution for pagination in OGM(already spend a lot of time but I couldn't find something)? I'm indexing in batches of 50,000 the objects using the setFirstResult-setMaxResult Api. These are the time in seconds of getting the batches from a query like
em.createNativeQuery(query, Entity.class).setFirstResult(i).setMaxResults(batchSize).getResultList()
results.stream().forEach(res -> fullTextEntityManager.index(res));
by increasing i in each iteration with like i+=batchSize;
I already tried using OgmMassIndexer but I need to be able to start-stop, index specific ranges so I prefer to do it manually.
As it's obvious and logical the time to find the firstResult in each iteration is increasing. Here I have time in secs to find the next batch of 50000 from the beginning from 4 million(setFirstResult(4000000).setMaxResult(50000)):
eg. to go to 4000000 it took 17 seconds etc. to go to 4050000 it took 15 seconds etc. to go to 4100000 it took 12 seconds etc. but later this number increases a lot:
Found: 17 Found: 15 Found: 12 Found: 13 Found: 13 Found: 13 Found: 15 Found: 16 Found: 16 Found: 17 Found: 18 Found: 18 Found: 19 Found: 19 Found: 20 Found: 20 Found: 21 Found: 21 Found: 22 Found: 21 Found: 22 Found: 23 Found: 23 Found: 23 Found: 24 Found: 24 Found: 25 Found: 25 Found: 26 Found: 26 Found: 27 Found: 28 Found: 27 Found: 29 Found: 29 Found: 30 Found: 31 Found: 32 Found: 33 Found: 30 Found: 33 Found: 32 Found: 34 Found: 34 Found: 35 Found: 35 Found: 38 Found: 36 Found: 38 Found: 36 Found: 41 Found: 41 Found: 39 Found: 41 Found: 41 Found: 40 Found: 42 Found: 43 Found: 42 Found: 44 Found: 44 Found: 45 Found: 47 Found: 45 Found: 44 Found: 44 Found: 47 Found: 44 Found: 47 Found: 47 Found: 50 Found: 52 Found: 93
Any options for scrolling mongodb using ogm cursor or something to fetch the objects in session and index them efficiently? I mean this is not viable even for an app that want to paginate that big amount of data using OGM without Hibernate Search, so I suppose there is a solution I don't see.
Thanks a lot.
Hibernate OGM 5.3.1, Hibernate Search 5.9.0 using ElasticSearch