Tracking the progress between Queues in a Map

Question

I have currently two queues and items traveling between them. Initially, an item gets put into firstQueue, then one of three dedicated thread moves it to secondQueue and finally another dedicated thread removes it. These moves obviously include some processing. I need to be able to get the status of any item (IN_FIRST, AFTER_FIRST, IN_SECOND, AFTER_SECOND, or ABSENT) and I implemented it manually by doing the update of the statusMap where the queue gets modified like

while (true) {
    Item i = firstQueue.take();
    statusMap.put(i, AFTER_FIRST);
    process(i);
    secondQueue.add(i);
    statusMap.put(i, IN_SECOND);
}

This works, but it's ugly and leaves a time window where the status is inconsistent. The inconsistency is no big deal and it'd solvable by synchronization, but this could backfire as the queue is of limited capacity and may block. The ugliness bothers me more.

Efficiency hardly matters as the processing takes seconds. Dedicated threads are used in order to control concurrency. No item should ever be in multiple states (but this is not very important and not guaranteed by my current racy approach). There'll be more queues (and states) and they'll of different kinds (DelayQueue, ArrayBlockingQueue, and maybe PriorityQueue).

I wonder if there's a nice solution generalizable to multiple queues?

Can't status be a part of Item, item.setStatus(AFTER_FIRST), instead of a separate status map — Hemant Patel, May 31 '18 at 13:26
@HemantPatel I'm afraid no as when inquiring, I create a new item which is `equals` to an existing one and do the map lookup. — maaartinus, May 31 '18 at 13:34
Are you looking for anything better than making the sequence of actions (like getting from queue and updating map | inserting into second queue and updating map) atomic? — Thiyagu, Jun 03 '18 at 14:04
Are you worried about blocking if there is no room in the second queue? Anyway that problem is there even now — Thiyagu, Jun 03 '18 at 14:05
@user7 Now, it's no problem, it's just one thread blocked. When using synchronization, the thread would get blocked by the queue while holding the monitor and everything would deadlock. — maaartinus, Jun 03 '18 at 14:25
talking of "multiple queues", you'd have to extend/enrich : `IN_FIRST, AFTER_FIRST, IN_SECOND, AFTER_SECOND, or ABSENT` ... are you still talking of "enum/constant/compile time known" structure...or is the number of queues/~states `== N` (dynamic)? — xerx593, Jun 10 '18 at 11:26
@xerx593 The number of queues is fixed, currently two, soon three and one day maybe four or five. — maaartinus, Jun 10 '18 at 12:16

score 3 · Accepted Answer · answered May 31 '18 at 14:48

3

Does it make sense to wrap the queues with logic to manage the Item status?

public class QueueWrapper<E> implements BlockingQueue<E> {
    private Queue<E> myQueue = new LinkedBlockingQueue<>();
    private Map<E, Status> statusMap;

    public QueueWrapper(Map<E, Status> statusMap) {
        this.statusMap = statusMap;
    }

    [...]
    @Override
    public E take() throws InterruptedException {
        E result = myQueue.take();
        statusMap.put(result, Status.AFTER_FIRST);
        return result;
    }

That way status management is always related to (and contained in) queue operations...

Obviously statusMap needs to be synchronized, but that would be an issue anyway.

answered May 31 '18 at 14:48

DavidW

1,413
10
17

This makes it better, but I'm looking for a substantial improvement. – maaartinus Jun 02 '18 at 14:27
@maaartinus please elaborate on what you are looking for, as this and fabien-leborgne's answer seem to satisfy your request. – Lucas Ross Jun 06 '18 at 19:16
@LucasRoss This queue sets `IN_FIRST` and `AFTER_FIRST` statuses, the next one sets `IN_SECOND` and `AFTER_SECOND`... The two statuses are to be passed to the `QueueWrapper`'s constructor. OK, that's no more ugly. The concurrency problem remains, but that's OK. I guess, I'll accept this answer (as it pre-dates the other). – maaartinus Jun 06 '18 at 21:06
@maaartinus About the concurrency issue that the point treated by wrapping the item instead of the status map. you cannot dequeue an already dequeued item. Moreover you don't dequeue according the status, this status is only for informative purpose. As the same way if you use a statusMap, there is not reason to synchronize it if two items cannot be equals because there is only one thread which dequeue in each queue... – Fabien Leborgne Jun 07 '18 at 19:13
@FabienLeborgne I don't understand. But maybe you can understand [this comment of mine](https://stackoverflow.com/questions/50625154/tracking-the-progress-between-queues-in-a-map/50627079?noredirect=1#comment88531369_50731681). – maaartinus Jun 08 '18 at 13:43

Aristofanio Garcia · Answer 2 · 2018-06-09T20:07:45.803

I see that your model might be improved in consistency, state control, and scaling.

A way of to implement this is accouple the item to your state, enqueue and dequeue this couple and create a mechanism to ensure state change.

My proposal can be see in figure below:

According with this model and your example, we can to do:

package stackoverflow;

import java.util.concurrent.LinkedBlockingQueue;

import stackoverflow.item.ItemState;
import stackoverflow.task.CreatingTask;
import stackoverflow.task.FirstMovingTask;
import stackoverflow.task.SecondMovingTask;

public class Main {

    private static void startTask(String name, Runnable r){
        Thread t = new Thread(r, name);
        t.start();
    }

    public static void main(String[] args) {
        //create queues
        LinkedBlockingQueue<ItemState> firstQueue = new LinkedBlockingQueue<ItemState>();
        LinkedBlockingQueue<ItemState> secondQueue = new LinkedBlockingQueue<ItemState>();
        //start three threads
        startTask("Thread#1", new CreatingTask(firstQueue));
        startTask("Thread#2", new FirstMovingTask(firstQueue, secondQueue));
        startTask("Thread#3", new SecondMovingTask(secondQueue));
    }
}

Each task runs the operations op() of according with below affirmation on ItemState:

one of three dedicated thread moves it to secondQueue and finally another dedicated thread removes it.

ItemState is a immutable object that contains Item and your State. This ensures consistency between Item and State values.

ItemState has acknowledgement about the next state creating a mechanism of self-controled state:

public class FirstMovingTask {
    //others codes
    protected void op() {
            try {
                //dequeue
                ItemState is0 = new ItemState(firstQueue.take());
                System.out.println("Item " + is0.getItem().getValue() + ": " + is0.getState().getValue());
                //process here
                //enqueue
                ItemState is1 = new ItemState(is0);
                secondQueue.add(is1);
                System.out.println("Item " + is1.getItem().getValue() + ": " + is1.getState().getValue());
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    //others codes
}

With ItemState implemetation:

public class ItemStateImpl implements ItemState {
    private final Item item;
    private final State state;

    public ItemStateImpl(Item i){
        this.item = i;
        this.state = new State();
    }

    public ItemStateImpl(ItemState is) {
        this.item = is.getItem();
        this.state = is.getState().next();
    }

    // gets attrs
}

So this way is possible build solutions more elegant, flexible and scalable. Scalable because you can to control more states only changing next() and generalizing the moving task for increase the number of queue.

Results:

Item 0: AFTER_FIRST
Item 0: IN_FIRST
Item 0: IN_SECOND
Item 0: AFTER_SECOND
Item 1: IN_FIRST
Item 1: AFTER_FIRST
Item 1: IN_SECOND
Item 1: AFTER_SECOND
Item 2: IN_FIRST
Item 2: AFTER_FIRST
Item 2: IN_SECOND
... others

UPDATE(06/07/2018): analysing the use of map for search Search in map using equals values like comparator might not work because usally the mapping between values and identity (key/hash) is not one-to-one(see figure bellow). In this way is need to create an sorted list for search values which results in O(n) (worst-case).

with Item.getValuesHashCode():

private int getValuesHashCode(){
  return new HashCodeBuilder().append(value).hashCode();
}

In this case, you must keep Vector<ItemState> instead of Item and to use the key like the result of getValuesHashCode. Change the mechanism of state-control for keep first reference of the Item and the state current. See bellow:

//Main.class
public static void main(String[] args) {
    ... others code ...

    //references repository
    ConcurrentHashMap<Integer, Vector<ItemState>> statesMap = new ConcurrentHashMap<Integer, Vector<ItemState>>();
    //start three threads
    startTask("Thread#1", new CreatingTask(firstQueue, statesMap));

    ... others code ...
}

//CreateTask.class
protected void op() throws InterruptedException {
    //create item
    ItemState is = new ItemStateImpl(new Item(i++, NameGenerator.name()));
    //put in monitor and enqueue
    int key = is.getHashValue();
    Vector<ItemState> items = map.get(key);
    if (items == null){
        items = new Vector<>();
        map.put(key, items);
    }
    items.add(is);
    //enqueue
    queue.put(is);
}

//FirstMovingTask.class
protected void op() throws InterruptedException{
    //dequeue
    ItemState is0 = firstQueue.take();
    //process
    ItemState is1 = process(is0.next());
    //enqueue 
    secondQueue.put(is1.next());
}

//ItemState.class
public ItemState next() {
    //required for consistent change state
    synchronized (state) {
        state = state.next();
        return this;
    }
}

To search you must use concurrentMapRef.get(key). The result will the reference of updated ItemState.

Results in my tests for :

# key = hash("a")
# concurrentMapRef.get(key)
...
Item#7#0    : a - IN_FIRST 
... many others lines
Item#7#0    : a - AFTER_FIRST 
Item#12#1   : a - IN_FIRST 
... many others lines
Item#7#0    : a - IN_SECOND 
Item#12#1   : a - IN_FIRST 
... many others lines
Item#7#0    : a - AFTER_SECOND 
Item#12#1   : a - IN_FIRST

More details in code: https://github.com/ag-studies/stackoverflow-queue

UPDATED IN 06/09/2018: redesign

Generalizing this project, I can undestand that the state machine is something like:

In this way I decoupled the workers of the queues for improve concepts. I used an MemoryRep for keep the unique reference for item in overall processment. Of course that you can use strategies event-based if you need keep ItemState in a physic repository.

This keep the previous idea and creates more legibility for the concepts. See this:

I understand that each job will have two queue (input/output) and relationship with a business model! The researcher will always find the most updated and consistent state of Item.

So, answering your ask:

I can find the consistent state of Item anywhere using MemoryRep (basically an Map), wrapping state and item in ItemState, and controlling the change state on job on enqueue or dequeue it.
The performace is keeped, except on running of next()
The state is allways consistent (for your problem)
In this model is possible use any queue type, any number of jobs/queues, and any number of state.
Additionaly this is beautiful!!

There are good ideas there, but I'm missing one substantial feature: Given an `Item seachItem`, which is `equals` but not identical to an `item` somewhere in the queues or between them, find out the the state. See [my comment](https://stackoverflow.com/questions/50625154/tracking-the-progress-between-queues-in-a-map/50731681#comment88261144_50625154). — maaartinus, Jun 07 '18 at 12:44
Please, clarify me about! ```searchItem``` must be complex of O(1)? — Aristofanio Garcia, Jun 07 '18 at 14:41
If exists item#1 with value "x" and state "IS_FIRST" so can exists item#56 with value "x"and state "AFTER_SECOND". ```searchItem("x")``` results [item#1[IS_FIRST], item#56[AFTER_SECOND]]. Is it that you want? — Aristofanio Garcia, Jun 07 '18 at 15:43
The complexity of `searchItem` doesn't matter much, but scanning all queues is ugly and prone to race conditions. Let's say, my items were strings read from an URL. Let's say, I get `["x", "y"] back. Some time later, I re-read the URL and get `["x", "z"]`. I need to find out in what state `"x"` and `"z"` are in order to decide what to do. — maaartinus, Jun 08 '18 at 13:39
I think that understant! I will do it here! Please, when occurs ABSENT? — Aristofanio Garcia, Jun 08 '18 at 15:01
When I ask for an `"z"` never seen anywhere before. Or for any item having traveled till the end (where it was dropped). — maaartinus, Jun 08 '18 at 20:48

Fabien Leborgne · Answer 3 · 2018-06-03T10:35:08.650

As previously answered, Wrap the queues or the item would be viable solutions or both.

public class ItemWrapper<E> {
   E item;
   Status status;
   public ItemWrapper(Item i, Status s){ ... }
   public setStatus(Status s){ ... }
   // not necessary if you use a queue wrapper (see queue wrapper)
   public boolean equals(Object obj) {
     if ( obj instanceof ItemWrapper)
       return item.equals(((ItemWrapper) obj).item) 
     return false;
   }
   public int hashCode(){
     return item;
   }
}
...
process(item) // process update status in the item
...

Probably a better way, already answered, is to have a QueueWrapper who update the queue status. For the fun I don't use a status map but I use the previously itemwrapper it seems cleaner (a status map works too).

public class QueueWrapper<E> implements Queue<E> {
  private Queue<ItemWrapper<E>> myQueue;
  static private Status inStatus; // FIRST
  static private Status outStatus; // AFTER_FIRST
  public QueueWrapper(Queue<E> myQueue, Status inStatus, Status outStatus) {...}
  @Override
  public boolean add(E e) {
    return myQueue.add(new ItemWrapper(e, inStatus));
  }
  @Override
  public E remove(){
    ItemWrapper<E> result = myQueue.remove();
    result.setStatus(outStatus)
    return result.item;
  }
  ...  
  }

You can also use AOP to inject status update in your queues without changing your queues (a status map should be more appropriate than itemwrapper).

Maybe I didn't answer well your question because an easy way to know where is your item could be to check in each queue with "contains" function.

score 1 · Answer 4 · answered Jun 07 '18 at 20:39

Here's something different from what others have said. Taking from the world of queue services and systems we have the concept of message acknowledgement. This is nice, because it also gives you some built in retry logic.

I'll lay out how it would work from a high level, and if you need I can add code.

Essentially you'll have a Set to go with each of your queues. You'll wrap your queues in an object so that when you dequeue an item a few things happen

The item is removed from the queue
The item is added to the associated set
A task (lambda containing an atomic boolean (default false)) is scheduled. When run it will remove item from the set and if the boolean is false, put it back in the queue
The item and a wrapper around the boolean are returned to the caller

Once process(i); completes, your code will indicate receipt acknowledgement to the wrapper, and the wrapper will remove the item from the set and make the boolean false.

A method to return status would simply check which queue or set the item is in.

Note that this gives "at least once" delivery, meaning an item will be processed at least once, but potentially more than once if the processing time is too close to the timeout.

Tracking the progress between Queues in a Map

4 Answers4