
I have a big issue when using multiple threads with spring-data-redis, and it's so easy to reproduce that I think I must be missing something trivial.

Straight to the point

If I query a CrudRepository while save operations are in progress, it sometimes (up to 60% of the time) doesn't find the record in Redis.

The environment

The code

Although the full code can be found in the link above, these are the main components:

CrudRepository

@Repository
public interface MyEntityRepository extends CrudRepository<MyEntity, Integer> {

}

Entity

@RedisHash("my-entity")
public class MyEntity implements Serializable {

    @Id
    private int id1;

    private double attribute1;
    private String attribute2;
    private String attribute3;

    // getters and setters omitted
}

Controller

    @GetMapping( "/my-endpoint")
    public ResponseEntity<?> myEndpoint () {

        MyEntity myEntity = new MyEntity();
        myEntity.setAttribute1(0.7);
        myEntity.setAttribute2("attr2");
        myEntity.setAttribute3("attr3");
        myEntity.setId1(1);

        myEntityRepository.save(myEntity);//create it in redis

        logger.info("STARTED");

        for (int i = 0; i < 100; i++){
            new Thread(){
                @Override
                public void run() {
                    super.run();

                    myEntity.setAttribute1(Math.random());

                    myEntityRepository.save(myEntity); //updating the entity

                    Optional<MyEntity> optionalMyEntity = myEntityRepository.findById(1);
                    if (optionalMyEntity.isPresent()) {
                        logger.info("found");
                    }else{
                        logger.warning("NOT FOUND");
                    }
                }
            }.start();

        }

        return ResponseEntity.noContent().build();
    }

The result

2020-05-26 07:52:53.769  INFO 30655 --- [nio-8080-exec-2] my-controller-logger                     : STARTED
2020-05-26 07:52:53.795  INFO 30655 --- [     Thread-168] my-controller-logger                     : found
2020-05-26 07:52:53.798  WARN 30655 --- [     Thread-174] my-controller-logger                     : NOT FOUND
2020-05-26 07:52:53.798  WARN 30655 --- [     Thread-173] my-controller-logger                     : NOT FOUND
2020-05-26 07:52:53.806  INFO 30655 --- [     Thread-170] my-controller-logger                     : found
2020-05-26 07:52:53.806  WARN 30655 --- [     Thread-172] my-controller-logger                     : NOT FOUND
2020-05-26 07:52:53.812  WARN 30655 --- [     Thread-175] my-controller-logger                     : NOT FOUND
2020-05-26 07:52:53.814  WARN 30655 --- [     Thread-176] my-controller-logger                     : NOT FOUND
2020-05-26 07:52:53.819  WARN 30655 --- [     Thread-169] my-controller-logger                     : NOT FOUND
2020-05-26 07:52:53.826  INFO 30655 --- [     Thread-171] my-controller-logger                     : found
2020-05-26 07:52:53.829  INFO 30655 --- [     Thread-177] my-controller-logger                     : found

So in this excerpt of just 10 threads, 6 of them do not find the record in Redis.

Replacing with spring data redis

As mentioned here, replacing an entity in Redis with Spring Data Redis involves at least 9 operations.

First conclusion

So, to replace a value in Redis, it has to remove the hash and the indexes and then add the new hash and the new indexes again. Maybe one thread is in the middle of these operations while another thread tries to find the value by index, and that index has not been added yet.
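The suspected race can be illustrated without Redis at all. The following is a minimal sketch, assuming a save is a delete followed by a write as described above. The in-memory map, the class name, and the `betweenSteps` hook are hypothetical stand-ins, not Spring Data Redis code; the hook plays the role of a second thread that reads between the two steps.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for Redis; NOT the Spring Data Redis implementation.
class NonAtomicReplaceDemo {
    private final Map<Integer, String> store = new ConcurrentHashMap<>();

    Optional<String> findById(int id) {
        return Optional.ofNullable(store.get(id));
    }

    // Replaces the value in two separate steps; betweenSteps simulates
    // another thread being scheduled between the delete and the write.
    void save(int id, String value, Runnable betweenSteps) {
        store.remove(id);      // step 1: delete the old record
        betweenSteps.run();    // window in which the record does not exist
        store.put(id, value);  // step 2: write the new record
    }

    public static void main(String[] args) {
        NonAtomicReplaceDemo repo = new NonAtomicReplaceDemo();
        repo.save(1, "v1", () -> { });
        repo.save(1, "v2", () ->
                System.out.println("during save: present=" + repo.findById(1).isPresent()));
        System.out.println("after save: present=" + repo.findById(1).isPresent());
    }
}
```

Run as written, the hook reports the record as absent during the save and present afterwards, which is the same pattern as the NOT FOUND lines in the log above.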

Second conclusion

I think it's nearly impossible that Spring Data Redis has such a bug, so I'm wondering what I'm misunderstanding about Spring Data Redis or Redis itself. Since Redis handles concurrent clients, I suspect something else may be happening, but the example above seems to point to exactly this race...

Thank you in advance to all of you

2 Answers


This ticket raises the same problem.

The behavior was chosen deliberately to avoid lingering hash entries. Deleting the hash ensures a consistent state and avoids additional entries that should no longer be part of the hash.
Redis Repository operations are not atomic.

So the non-atomic behavior is intentional.
As suggested in the ticket, the solution is to use PartialUpdate.

Below is an example snippet:

    @Autowired
    private RedisKeyValueTemplate redisKVTemplate;
    ...
    // id is the @Id value of the entity
    private void update(Integer id) {
        PartialUpdate<MyEntity> update = new PartialUpdate<>(id, MyEntity.class)
                .set("attribute1", Math.random());
        redisKVTemplate.update(update);
    }
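To see why a partial update sidesteps the problem, here is a minimal sketch under the assumption that a partial update only writes the changed field and never deletes the hash. The nested map is a hypothetical model of a Redis hash, and none of the names below are Spring Data Redis API; the `betweenSteps` hook again stands in for a concurrent reader.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a Redis hash store; not Spring Data Redis code.
class PartialUpdateDemo {
    private final Map<Integer, Map<String, String>> hashes = new ConcurrentHashMap<>();

    boolean exists(int id) {
        return hashes.containsKey(id);
    }

    String field(int id, String name) {
        Map<String, String> hash = hashes.get(id);
        return hash == null ? null : hash.get(name);
    }

    // Writes a single field in place. The hash itself is never removed,
    // so a reader scheduled in the middle still finds the record.
    void updateField(int id, String name, String value, Runnable betweenSteps) {
        Map<String, String> hash = hashes.computeIfAbsent(id, k -> new ConcurrentHashMap<>());
        betweenSteps.run();    // a concurrent reader running here sees the record
        hash.put(name, value); // only the named field changes
    }

    public static void main(String[] args) {
        PartialUpdateDemo repo = new PartialUpdateDemo();
        repo.updateField(1, "attribute1", "0.7", () -> { });
        repo.updateField(1, "attribute1", "0.3", () ->
                System.out.println("during update: exists=" + repo.exists(1)));
    }
}
```

A reader may still observe the old field value for a moment, but it does not observe a missing record.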

References:
Update entity in redis with spring-data-redis

samabcde

You have one MyEntity instance:

MyEntity myEntity = createEntity();

Then you have started 10 threads, all of which are updating that one object myEntity.set....

Then when you save it as in myEntityRepository.save(myEntity);, it is impossible to tell what value is being saved, as all threads are competing to insert their own value.

When you call myEntityRepository.save, it might be saving (again) a value that was written to myEntity by another thread. So this thread never got a chance to write its value to the repo, hence you won't find it!

I'm not across @RedisHash so I may be wrong, but I think you need to create a new entity object each time you want to save a record.

Another, unrelated issue with your code is unbounded thread creation (unless you don't plan to use it in production).
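Both suggestions can be sketched together: a bounded pool instead of one thread per update, and a separate entity object per task instead of one shared mutable instance. This is only an illustration; `PerTaskEntityDemo`, `Snapshot`, and `runUpdates` are hypothetical names, and the place where a real task would call the repository is marked in a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerTaskEntityDemo {
    // Hypothetical immutable copy of the question's entity (id plus one attribute).
    static final class Snapshot {
        final int id;
        final double attribute1;

        Snapshot(int id, double attribute1) {
            this.id = id;
            this.attribute1 = attribute1;
        }
    }

    // Runs the updates on a fixed-size pool; each task builds its own object.
    static List<Snapshot> runUpdates(int tasks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Snapshot>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final double value = i / (double) tasks;
                // A real task would pass its own object to the repository's save here.
                Callable<Snapshot> task = () -> new Snapshot(1, value);
                futures.add(pool.submit(task));
            }
            List<Snapshot> saved = new ArrayList<>();
            for (Future<Snapshot> future : futures) {
                saved.add(future.get()); // wait for each task to finish
            }
            return saved;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no object is shared between tasks, each task knows exactly which value it is saving. This does not by itself make the save atomic.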

Kartik
  • Sorry, I've updated my code in order to simplify it, but the structure and the issue remain the same. When the repository attempts to save an instance, it checks the @Id attribute so it knows whether to insert or update, and which row to update in that case. So yes, the threads are competing to update the value, but if each operation were atomic, they should always find data. In fact, thread-7 could find the data with the modifications made by thread-9, for instance, but they should always find data. – Héctor Berlanga May 26 '20 at 07:49
  • @HéctorBerlanga Try synchronizing the contents of your `run` method to make the operation atomic. – Kartik May 27 '20 at 00:00
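A minimal sketch of that synchronization idea, using a hypothetical in-memory store whose `save` deletes and then writes, as a stand-in for the repository. The class, the lock object, and the pool size are assumptions for illustration. Note that a JVM lock like this only serializes threads inside one application instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the repository; not Spring Data Redis code.
class SynchronizedSaveDemo {
    private static final Object LOCK = new Object();
    private static final Map<Integer, Double> STORE = new ConcurrentHashMap<>();

    // Non-atomic save: delete first, then write.
    static void save(int id, double value) {
        STORE.remove(id);
        STORE.put(id, value);
    }

    // Runs save-then-find under one shared lock and counts failed lookups.
    static int countMisses(int tasks) {
        AtomicInteger misses = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                synchronized (LOCK) { // no other task can be mid-save here
                    save(1, Math.random());
                    if (!STORE.containsKey(1)) {
                        misses.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return misses.get();
    }
}
```

With every save and lookup inside the same lock, no lookup can land between a delete and the following write, so `countMisses` returns 0 in this model.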