The application syncs data records between devices through an online MongoDB collection. Multiple devices can send batches of new or modified records to the server's collection at any time. Each device fetches the record updates it doesn't already have by requesting records added or modified since its last get request.
Approach 1: add a Date field (called stored1) to the records before saving to MongoDB. When a device requests records, MongoDB paging is used to skip entries up to the current page and then limit the result to 1000. Now that the data set is large, each page request takes a long time, and Mongo hit a memory error:
https://docs.mongodb.com/manual/reference/limits/#operations
Setting allowDiskUse(true), as shown in the posted code, isn't fixing the memory error in my current configuration for some reason. Even if that can be fixed, it wouldn't be a long-term solution, as the query times with paging are already too long.
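To put a rough number on why the skip-based pages keep getting slower, here is a small cost model in plain Java. It is only a sketch, not a benchmark: it assumes the server has to walk over every skipped entry before it can return a page, and the class and method names (PagingCost, walkedWithSkip, walkedWithRange) are made up for illustration.

```java
// Rough cost model, not a benchmark: counts how many entries the server
// has to walk in total to serve every page of a result set.
class PagingCost {

    /** skip/limit paging: each page walks all skipped entries plus the page itself. */
    static long walkedWithSkip(long totalRecords, int limit) {
        long walked = 0;
        for (long skip = 0; skip < totalRecords; skip += limit) {
            long pageSize = Math.min(limit, totalRecords - skip);
            walked += skip + pageSize;
        }
        return walked;
    }

    /** Range paging (stored > last seen, assuming an index on the field): each entry is walked once. */
    static long walkedWithRange(long totalRecords) {
        return totalRecords;
    }

    public static void main(String[] args) {
        System.out.println(walkedWithSkip(1_000_000, 1000)); // prints 500500000
        System.out.println(walkedWithRange(1_000_000));      // prints 1000000
    }
}
```

Under that assumption, fetching one million records in pages of 1000 walks about 500 times more entries with skip than with a range query, which matches the slowdown I'm seeing as the collection grows.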
Approach 2:
What is the best way for pagination on mongodb using java
https://arpitbhayani.me/blogs/benchmark-and-compare-pagination-approach-in-mongodb
The second approach considered is to switch from skip-based Mongo paging to simply asking for records with stored time > the largest stored time last received, repeating until a response contains fewer records than the limit. This requires the stored timestamp to be unique across all records matching the query; otherwise the query could miss records or return duplicates. In the example code, which uses the stored2 field, there is still a chance of duplicate timestamps, even if the probability is low.
Mongo has a BSON timestamp type whose values are guaranteed unique within a single mongod instance, but I don't see a way to use it with document save(), or to query on it, in Spring Boot. It would need to be set on each record that is newly inserted, replaced, or updated. https://docs.mongodb.com/manual/reference/bson-types/#timestamps
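To make the duplicate-timestamp concern concrete, here is a minimal in-memory sketch with no Mongo involved. The Rec class, the method names and the sample data are all made up for illustration. It pages with "stored > last" and shows a record being skipped when two records share a stored value across a page boundary. It also shows a commonly used workaround, a compound cursor on (stored, uid), which I have not tried against the real collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class KeysetPagingDemo {
    /** Hypothetical stand-in for a stored document: its uid and its stored2 value. */
    static final class Rec {
        final String uid;
        final long stored;
        Rec(String uid, long stored) { this.uid = uid; this.stored = stored; }
    }

    // Sort by stored value, then by uid so the order is fully deterministic.
    static final Comparator<Rec> ORDER =
            Comparator.comparingLong((Rec r) -> r.stored).thenComparing((Rec r) -> r.uid);

    /** Sample data where "b" and "c" share the same stored value. */
    static List<Rec> sample() {
        return Arrays.asList(new Rec("a", 1), new Rec("b", 2), new Rec("c", 2), new Rec("d", 3));
    }

    /** Pages with "stored > last" only, the way Approach 2 is described. */
    static List<String> fetchAllByStoredOnly(List<Rec> all, int limit) {
        List<String> seen = new ArrayList<>();
        long last = Long.MIN_VALUE;
        while (true) {
            final long cursor = last;
            List<Rec> page = all.stream().filter(r -> r.stored > cursor)
                    .sorted(ORDER).limit(limit).collect(Collectors.toList());
            for (Rec r : page) seen.add(r.uid);
            if (page.size() < limit) break; // short page means we are caught up
            last = page.get(page.size() - 1).stored;
        }
        return seen;
    }

    /** Pages with a compound cursor: stored > last, or stored == last and uid > lastUid. */
    static List<String> fetchAllByStoredAndUid(List<Rec> all, int limit) {
        List<String> seen = new ArrayList<>();
        long last = Long.MIN_VALUE;
        String lastUid = "";
        while (true) {
            final long cursor = last;
            final String uidCursor = lastUid;
            List<Rec> page = all.stream()
                    .filter(r -> r.stored > cursor
                            || (r.stored == cursor && r.uid.compareTo(uidCursor) > 0))
                    .sorted(ORDER).limit(limit).collect(Collectors.toList());
            for (Rec r : page) seen.add(r.uid);
            if (page.size() < limit) break;
            Rec tail = page.get(page.size() - 1);
            last = tail.stored;
            lastUid = tail.uid;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(fetchAllByStoredOnly(sample(), 2));   // prints [a, b, d] ("c" is missed)
        System.out.println(fetchAllByStoredAndUid(sample(), 2)); // prints [a, b, c, d]
    }
}
```

If I went that way in Mongo, I assume the filter would become an $or of {stored2 > last} and {stored2 == last, _id > lastUid}, sorted by stored2 and then _id, so that uniqueness of the timestamp alone is no longer required.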
Any suggestions on how to do this?
import java.util.Date;
import java.util.Map;
import lombok.Getter;
import lombok.Setter;
import org.springframework.data.annotation.Id;

@Getter
@Setter
public abstract class DataModel {
    private Map<String, Object> data;

    @Id // maps this field to the database _id field, which is automatically indexed
    private String uid;

    /** Time this entry is written to the db (new or modified), to support querying for changes since last query */
    private Date stored1; // APPROACH 1
    private long stored2; // APPROACH 2
}
/** SpringBoot+MongoDb database interface implementation */
@Component
@Scope("prototype")
public class SpringDb implements DbInterface {
    @Autowired
    public MongoTemplate db; // the database

    @Override
    public boolean set(Collection<?> newRecords, Collection<?> updatedRecords) {
        // get current time for this set
        Date date = new Date();
        // the random offset lowers, but does not eliminate, the chance of two batches sharing a timestamp
        int randomOffset = ThreadLocalRandom.current().nextInt(0, 500000);
        Instant now = Instant.now();
        long startingNs = now.getEpochSecond() * 1000000000L + now.getNano() + randomOffset;
        int ns = 0;
        if (updatedRecords != null && updatedRecords.size() > 0) {
            for (Object obj : updatedRecords) {
                DataModel entry = (DataModel) obj; // records are expected to extend DataModel
                entry.setStored1(date); // APPROACH 1
                entry.setStored2(startingNs + ns++); // APPROACH 2
                db.save(entry, repoName);
            }
        }
        // for new documents only
        if (newRecords != null && newRecords.size() > 0) {
            for (Object obj : newRecords) {
                DataModel entry = (DataModel) obj; // records are expected to extend DataModel
                entry.setStored1(date); // APPROACH 1
                entry.setStored2(startingNs + ns++); // APPROACH 2
            }
            // multi record insert
            db.insert(newRecords, repoName);
        }
        return true;
    }
    @Override
    public List<DataModel> get(Map<String, String> params, int maxResults, int page, String sortParameter) {
        // generate query
        Query query = buildQuery(params);

        // APPROACH 1
        // do a paged query
        Pageable pageable = PageRequest.of(page, maxResults, Direction.ASC, sortParameter);
        List<DataModel> queryResults = db.find(query.allowDiskUse(true).with(pageable), DataModel.class, repoName); // allowDiskUse(true) not working, still get memory error
        // count total results
        Page<DataModel> pageQuery = PageableExecutionUtils.getPage(queryResults, pageable,
                () -> db.count(Query.of(query).limit(-1).skip(-1), DataModel.class, repoName));
        // return the query results
        queryResults = pageQuery.getContent();

        // APPROACH 2 (used instead of the Approach 1 block above, not together with it)
        // buildQuery is expected to add the "stored time > last received" filter from params;
        // the results are sorted ascending and capped so the caller can read the largest stored time
        queryResults = db.find(query.allowDiskUse(true).with(Sort.by(Direction.ASC, sortParameter)).limit(maxResults),
                DataModel.class, repoName);
        return queryResults;
    }
    @Override
    public boolean update(Map<String, String> params, Map<String, Object> data) {
        // generate query
        Query query = buildQuery(params);
        // This applies the same changes to every entry
        Update update = new Update();
        for (Map.Entry<String, Object> entry : data.entrySet()) {
            update.set(entry.getKey(), entry.getValue());
        }
        db.updateMulti(query, update, DataModel.class, repoName);
        return true;
    }

    private Query buildQuery(Map<String, String> params) {
        //...
    }
}