
I have a REST API that performs a calculation for each request and, if the same request is made again, returns the result from a cache, which consists of documents saved in MongoDB. To decide whether two requests are the same, I hash some relevant fields of the request. But when the same request is made in quick succession, duplicate documents appear in MongoDB, which later results in an "IncorrectResultSizeDataAccessException" when I try to read them.

To solve it I tried to synchronize on hash value in following controller method (tried to cut out non relevant parts):

@PostMapping(
        path = "/{myPath}",
        consumes = {MediaType.APPLICATION_JSON_UTF8_VALUE},
        produces = {MediaType.APPLICATION_JSON_UTF8_VALUE})
@Async("asyncExecutor")
public CompletableFuture<ResponseEntity<?>> retrieveAndCache( ... a,b,c,d various request parameters) {
    
    //perform some validations on request...
    
    //hash relevant request parameters
    
    int hash = Objects.hash(a, b, c, d);
    
    synchronized (Integer.toString(hash).intern()) {

        Optional<Result> resultOpt = cacheService.findByHash(hash);
        
        if (resultOpt.isPresent()) {
            return CompletableFuture.completedFuture(ResponseEntity.status(HttpStatus.OK).body(resultOpt.get().getResult()));
        } else {
            Result result = ...//perform requests to external services and do some calculations...
            cacheService.save(result);
            
            return CompletableFuture.completedFuture(ResponseEntity.status(HttpStatus.OK).body(result));
        }
        
    }
}



//cacheService methods
@Transactional
public Optional<Result> findByHash(int hash) {
    return repository.findByHash(hash); //this is the part that throws the error
}

I am sure that no hash collision is occurring; duplicate records only appear when the same request is performed in quick succession. To my understanding, this shouldn't happen as long as I have only one running instance of my Spring Boot application. Do you see any reason other than multiple instances running in production?

uylmz

1 Answer


You should check the settings of your MongoDB client.

If one thread calls cacheService.save(result) and releases the lock after that method returns, and another thread then calls cacheService.findByHash(hash), it is still possible that the second thread will not find the record that was just saved.

It's possible that, for example, the save method returns as soon as the saved object is in the transaction log, but before it is fully processed. Or the save is processed on the primary node while the findByHash is executed on a secondary node to which the write has not been replicated yet.

You could use WriteConcern.MAJORITY, but I'm not 100% sure if it covers everything.
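For illustration, a minimal sketch of how that client configuration might look in a Spring app. This is an assumption about your setup, not code from the question: the bean, connection string, and database name are placeholders. It pairs WriteConcern.MAJORITY with ReadPreference.primary() so a read issued right after a write cannot land on a lagging secondary:

```java
import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.ReadPreference;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MongoConfig {

    @Bean
    public MongoClient mongoClient() {
        MongoClientSettings settings = MongoClientSettings.builder()
                // placeholder connection string
                .applyConnectionString(new ConnectionString("mongodb://localhost:27017/cache"))
                // block until a majority of replica-set members acknowledge each write
                .writeConcern(WriteConcern.MAJORITY)
                // always read from the primary, never from a possibly stale secondary
                .readPreference(ReadPreference.primary())
                .build();
        return MongoClients.create(settings);
    }
}
```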

Even better is to let MongoDB do the locking by using findAndModify with FindAndModifyOptions.upsert(true), and drop the lock from your Java code entirely.
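A sketch of that approach with MongoTemplate. It assumes the cached document stores its hash in a field named hash and the payload in a field named result; the service, method, and field names here are illustrative, not from your code. MongoDB executes the findAndModify atomically, so two concurrent identical requests cannot both insert:

```java
import org.springframework.data.mongodb.core.FindAndModifyOptions;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;

public class CacheService {

    private final MongoTemplate mongoTemplate;

    public CacheService(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // Atomically inserts the document if no document with this hash exists,
    // otherwise leaves the existing one untouched and returns it. MongoDB
    // serializes concurrent upserts on the same query, so no synchronized
    // block is needed on the Java side.
    public Result getOrSave(int hash, Result computed) {
        Query query = new Query(Criteria.where("hash").is(hash));
        Update update = new Update()
                .setOnInsert("hash", hash)              // only applied on insert
                .setOnInsert("result", computed.getResult());
        FindAndModifyOptions options = FindAndModifyOptions.options()
                .upsert(true)       // insert when no match exists
                .returnNew(true);   // return the stored document either way
        return mongoTemplate.findAndModify(query, update, options, Result.class);
    }
}
```

Note that with a plain upsert you still pay for the external calls that compute the result on both threads; only the duplicate insert is prevented. Adding a unique index on the hash field is also worth considering as a safety net.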

GeertPt
  • I'm using Spring Data MongoRepository methods instead of MongoTemplate. I don't know if those options affect their behavior too; I thought the Mongo repository save method would be equivalent to findAndModify – uylmz Sep 01 '20 at 15:47
  • Save is somewhat equivalent to findAndModify, if that findAndModify uses the primary key of the collection. But from your example, I assumed the hash is not the primary key? Even if you used the hash as an assigned key, you would still need the correct ReadPreference/WriteConcern to make the locking work. – GeertPt Sep 02 '20 at 07:59