
I am working through a very simple, self-inflicted use-case challenge.

Use case: a collection that data is written to.

The caveat: once the collection reaches a certain size (e.g. 25 elements), the data currently in it should be flushed out (say, written to disk). At no point should the collection exceed the specified limit, and at no point should a flush contain more than that limit.

This essentially also means that while a flush is ongoing, writes must not be allowed, since add and flush operate on the same data structure.
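
A minimal sketch of the contract I have in mind (the interface and names are purely illustrative):

// Illustrative contract only; the names are hypothetical.
public interface BoundedBuffer<T> {
    // Adds an element; must never let the buffer grow past the limit (e.g. 25).
    // Internally triggers a flush (e.g. to disk) when the limit is reached,
    // and the flush must drain exactly what is buffered, never more.
    void add(T item);
}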

The algorithm works well when everything is single-threaded; it is essentially as simple as:

while (addToDataAndSizeBreached())
    flushData();
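
Fleshed out, the single-threaded version might look like this (the helper names follow the pseudocode above; 25 is the example limit):

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class SingleThreadedRunner {

    static final int MAX_SIZE = 25;
    static final List<String> data = new ArrayList<>();

    // Adds one element and reports whether the limit has been reached.
    static boolean addToDataAndSizeBreached() {
        data.add(UUID.randomUUID().toString());
        return data.size() >= MAX_SIZE;
    }

    // Stand-in for writing to disk: drains exactly what is buffered.
    static void flushData() {
        System.out.println("flushing " + data.size() + " elements");
        data.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            while (addToDataAndSizeBreached()) {
                flushData();
            }
        }
    }
}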

However, as expected, this fails badly in a multithreaded application. The use case still works, but the size limit is never respected (e.g. sometimes the collection exceeds 25 elements, sometimes a flush contains anything but 25 elements, and so on). I went back through my entire collection of Java multithreading dos and don'ts, but I am still unable to solve it.



The only thing that works well is taking a full lock around the whole add-and-flush step. Note that the lock must be a single instance shared by every thread; a lock created per call guards nothing:

static final ReentrantReadWriteLock.WriteLock writeLock = new ReentrantReadWriteLock().writeLock();

writeLock.lock(); // acquire before the try, so a failed lock() can't trigger unlock()
try {
    while (addToDataAndSizeBreached())
        flushData();
} finally {
    writeLock.unlock();
}
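
For reference, the same full-lock idea packaged as a small class, so the lock is guaranteed to be one shared instance. A plain ReentrantLock is sufficient here, since there is no read-only path that would benefit from a read/write split; the names are illustrative:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class LockedBuffer {
    private static final int MAX_SIZE = 25;
    private final List<String> data = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock(); // one lock shared by all threads

    public void add(String value) {
        lock.lock();
        try {
            data.add(value);
            if (data.size() >= MAX_SIZE) {
                flushData(); // flush happens while we still hold the lock
            }
        } finally {
            lock.unlock();
        }
    }

    // Caller must hold the lock; stand-in for writing to disk.
    private void flushData() {
        System.out.println("flushing " + data.size() + " elements");
        data.clear();
    }
}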


However, I am not satisfied with the above approach and wanted to check with you all for a better solution. Kindly guide me; I am happy to research and attempt it on my own after that.
Things that I have tried:

  • Taking a complete lock on the entire thing: works well, but I am looking for a better, non-blocking solution.
  • Tried a combination of read and write locks, but since the write lock has to wait while multiple threads hold the read lock, the data structure ends up with a size greater than 25.
  • Tried synchronizers such as CountDownLatch (not sure how it would work, since it does not reset) and CyclicBarrier (could not understand how to use it, since I would not know the thread count beforehand); see the Condition-based sketch after the code below.
  • Tried the solution below as well; however, the data size is not respected and it somehow also leads to a deadlock, which I can't figure out:

package com.org.store.ecommerce;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Runner {

    static List<String> l = new ArrayList<>();
    static AtomicBoolean full = new AtomicBoolean(false); // currently unused
    static AtomicBoolean flushOnGoing = new AtomicBoolean(false);
    static ReentrantReadWriteLock.WriteLock writeLock = new ReentrantReadWriteLock().writeLock();

    static void addToData() {
        // The add itself happens outside the lock, so two threads can both
        // see the size breached and race each other into this block.
        while (addToDataAndCheckSize()) { // while size==maxsize : flushComplete.wait()
            System.out.println(String.format("Add attempt %s thread %d time %d array size %d number of locks held",
                    Thread.currentThread().getName(), System.nanoTime(), l.size(), writeLock.getHoldCount()));
            writeLock.lock(); // acquire before the try, so a failed lock() can't trigger unlock()
            try {
                flushOnGoing.set(true);
                flushData();
            } finally {
                System.out.println(String.format("Flush complete %s thread %d time flushed %d flush size",
                        Thread.currentThread().getName(), System.nanoTime(), l.size()));
                flushOnGoing.set(false);
                writeLock.unlock();
            }
        }
    }

    private static boolean flushData() {
        System.out.println(String.format("%s thread %d time flushed %d flush size %d number of locks held",
                Thread.currentThread().getName(), System.nanoTime(), l.size(), writeLock.getHoldCount()));
        l.clear();
        return l.isEmpty();
    }

    private static boolean addToDataAndCheckSize() {
        System.out.println(String.format("%s thread %d time add to data",
                Thread.currentThread().getName(), System.nanoTime()));
        String value = UUID.randomUUID().toString();
        if (!flushOnGoing.get()) // unsynchronized check: a flush can begin right after it passes
            l.add(value);
        System.out.println(String.format("%s thread %d inside add size==%d",
                Thread.currentThread().getName(), System.nanoTime(), l.size()));
        return l.size() == 2; // demo threshold of 2 instead of 25
    }

    public static void main(String[] args) {
//        addToData();
//        addToData();
//        addToData();
        ExecutorService ex = Executors.newFixedThreadPool(2);
        List<CompletableFuture<String>> cfList = new ArrayList<>();
        for (int n = 0; n <= 5; n++) {
            cfList.add(CompletableFuture.supplyAsync(() -> {
                addToData();
                return "D";
            }, ex));
        }
        for (CompletableFuture<String> cf : cfList)
            cf.join();
        ex.shutdown(); // without this the pool's non-daemon threads keep the JVM alive
    }
}
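
Regarding the resettable-CountDownLatch idea in the list above: a Condition on a ReentrantLock behaves like a latch that resets, so writers can be parked while a flush is in progress and released afterwards. Below is a minimal sketch of that direction, not the code above; the class name and writeToDisk() are placeholders invented for illustration:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class GatedBuffer {
    private static final int MAX_SIZE = 25;
    private final List<String> data = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition flushDone = lock.newCondition();
    private boolean flushing = false;

    public void add(String value) throws InterruptedException {
        lock.lock();
        try {
            while (flushing) {           // gate: acts like a latch that resets after each flush
                flushDone.await();
            }
            data.add(value);
            if (data.size() >= MAX_SIZE) {
                flushing = true;         // close the gate before releasing the lock
                List<String> batch = new ArrayList<>(data); // exactly MAX_SIZE elements
                data.clear();
                lock.unlock();           // don't hold the lock during slow I/O
                try {
                    writeToDisk(batch);  // hypothetical slow flush, e.g. to disk
                } finally {
                    lock.lock();
                    flushing = false;    // "reset the latch"
                    flushDone.signalAll();
                }
            }
        } finally {
            lock.unlock();
        }
    }

    // Placeholder for the real flush target.
    private void writeToDisk(List<String> batch) {
        System.out.println("flushed " + batch.size() + " elements");
    }
}

Because flushing is set to true before the lock is released, any writer arriving during the flush parks on the Condition, so the buffer stays empty (and therefore under the limit) until the flush completes and the gate reopens.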
    
user1906450
    • I believe your full-locking solution is the best possible. You can't allow writes while flushing; you can't give up the lock earlier (or another 25 elements might race the first 25 elements). I don't think anything finer-grained will work. – Louis Wasserman Oct 20 '22 at 15:18
• Thank you, Louis. There is actually a way Cassandra achieves it; I'm trying to go through their codebase, but it's kind of difficult to understand: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/dml/dmlHowDataWritten.html ... quote: "Cassandra blocks writes until the next flush succeeds" - end quote – user1906450 Oct 21 '22 at 07:18
• If it "blocks writes until the next flush succeeds," then that's the same thing as your full lock. – Louis Wasserman Oct 21 '22 at 11:39
• I think it's possible to implement this in a non-blocking way, but instead you would need to be able to signal a failed write (i.e., `addToData` needs to return something or throw an exception during the flush). Then the calling thread could continue to do other work and retry adding later, or just give up, or whatever it wants to do. If you want it to wait until the data can be added, however, then a blocking lock acquisition is the best bet (see the sketch after these comments). – Tim Moore Oct 22 '22 at 06:04
• Not 100% sure, but I was thinking: the issue with CountDownLatch is that it does not reset. If I could reset the latch, I think it would be possible to block adds while a flush is ongoing. I'm still trying to figure out a way to create a resettable CountDownLatch and test it. However, even with that solution, I think there might still be cases where the size of the data structure breaches the limit of 25 :( – user1906450 Oct 22 '22 at 14:04
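
To make the non-blocking idea from Tim Moore's comment concrete, here is a sketch in which the add reports failure instead of waiting, using ReentrantLock.tryLock(). The class and method names are invented for illustration. Note that tryAdd also fails while another thread is merely adding, not only during a flush; the caller can retry, drop the value, or do other work:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class NonBlockingBuffer {
    private static final int MAX_SIZE = 25;
    private final List<String> data = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock();

    // Returns false instead of blocking when the buffer is busy.
    public boolean tryAdd(String value) {
        if (!lock.tryLock()) {
            return false;        // a flush (or another add) holds the lock
        }
        try {
            data.add(value);
            if (data.size() >= MAX_SIZE) {
                flushData();     // still holding the lock, so nothing interleaves
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Caller must hold the lock; stand-in for writing to disk.
    private void flushData() {
        System.out.println("flushing " + data.size() + " elements");
        data.clear();
    }
}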
