
I am creating a script that assigns tasks to multiple users. A task is assigned to a user; if it is not accepted within the next 30 minutes, I need to re-assign it to another user.

Initially, I used DynamoDB for this: every task assignment had a TTL of 30 minutes attached. When the TTL expired, I handled it through a DynamoDB Stream and checked whether the task had been accepted. If not, I re-assigned it and created a new entry in the table with a 30-minute TTL.

I had missed one detail about TTL: it does not expire items in real time and can take up to 48 hours.

Is there any other smart way to handle such use cases? Currently I have implemented it as follows:

  • Add an index with the TTL as the sort key and event_type = Task as the partition key.
  • Query all records every minute where the TTL is less than the current epoch, and batch-delete those records.

There are two challenges here:

  • First, since event_type is constant across the table, all data hits a single partition of the index, which is bad at large volumes.
  • Second, it is a poll mechanism where I effectively scan through all the records every minute; this is also not a scalable solution.
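The first challenge is the classic hot-partition problem, and the usual mitigation is write sharding: append a random shard suffix to the constant partition key so writes spread across several partitions, and have the poller query each shard key in turn. A minimal sketch of that idea (the class, key format, and shard count are hypothetical, not from the original post):

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper illustrating write sharding for the GSI.
// Instead of a constant partition key ("Task"), we write "Task#<n>"
// with a random shard n, and the poller enumerates every shard key.
public class GsiSharding {

    // Partition key to write to the GSI, e.g. "Task#5".
    public static String shardedKey(String eventType, int shardCount) {
        int shard = ThreadLocalRandom.current().nextInt(shardCount);
        return eventType + "#" + shard;
    }

    // All shard keys the per-minute poller must query so no item is missed.
    public static String[] allShardKeys(String eventType, int shardCount) {
        String[] keys = new String[shardCount];
        for (int i = 0; i < shardCount; i++) {
            keys[i] = eventType + "#" + i;
        }
        return keys;
    }
}
```

This spreads write and query load across `shardCount` partitions at the cost of issuing one query per shard in the poller.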

I am exploring if we can do it smartly by a push mechanism. Any pointers or help to solve this use case?

2 Answers


I think a cron job checking your database is a must; you won't escape that.

What you can do is add a Global Secondary Index with the expiration date as its sort key. Then you can use a query on the sorted values instead of a scan.
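To make the "query instead of scan" point concrete, here is a rough sketch of the request parameters such a per-minute poller might build against a GSI keyed on event_type with the expiration epoch as sort key. The index and attribute names (`expiry-index`, `ttl_epoch`) are assumptions for illustration, not taken from the question:

```java
import java.util.Map;

// Hypothetical sketch: the key condition a poller would send to
// DynamoDB's Query API against an expiry-sorted GSI. Returned as a
// plain map of parameter names to values for illustration only.
public class ExpiryQuery {

    public static Map<String, String> buildQuery(String eventType, long nowEpoch) {
        return Map.of(
            "IndexName", "expiry-index",  // assumed GSI name
            "KeyConditionExpression", "event_type = :t AND ttl_epoch < :now",
            ":t", eventType,
            ":now", Long.toString(nowEpoch));
    }
}
```

Because `ttl_epoch` is the sort key, the condition `< :now` is a range query that reads only the already-expired items, rather than a scan over the whole table.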

karjan
  • And only project the attributes you need to make this decision to keep the GSI small and inexpensive as possible. – NoSQLKnowHow Oct 29 '20 at 18:30
  • Yes, I am doing the same as of now: running a Lambda every minute to query the records less than the current epoch. As we can't use < or > operators in a partition key query, I put the TTL into the sort key. I could put the expiration epoch as the partition key, masked to minute level, and query that to optimize and scale it further. I am looking for a way to do it at the seconds level, and in a smarter way. – Pranay Agarwal Nov 02 '20 at 12:48
  • 1
    If you need such accuracy, I'd go with step functions. Yan Cui wrote a great article about it https://theburningmonk.com/2019/06/step-functions-as-an-ad-hoc-scheduling-mechanism/ – karjan Nov 02 '20 at 16:13

Consider using temporal.io. It allows modeling your business logic directly in code. Here is how it would be done in Java:

@WorkflowInterface
public interface TaskAssignment {
    @WorkflowMethod
    void assignTask();

    @SignalMethod
    void reportAccepted(String userId);
}

@ActivityInterface
public interface TaskAssignmentActivities {

  String pickAvailableUser();

  void assignTask(String userId);
}

public class TaskAssignmentImpl implements TaskAssignment {

  private static final Duration TASK_ASSIGNMENT_TIMEOUT = Duration.ofHours(1);
  private static final int MAX_ASSIGNMENT_ATTEMPTS = 10;

  private ActivityOptions options =
      ActivityOptions.newBuilder().setStartToCloseTimeout(Duration.ofSeconds(10)).build();

  private final TaskAssignmentActivities activities =
      Workflow.newActivityStub(TaskAssignmentActivities.class, options);

  private String userId;

  @Override
  public void assignTask() {
    for (int i = 0; i < MAX_ASSIGNMENT_ATTEMPTS; i++) {
      String userId = activities.pickAvailableUser();
      activities.assignTask(userId);
      // Block until the assigned user accepts, or TASK_ASSIGNMENT_TIMEOUT elapses.
      Workflow.await(TASK_ASSIGNMENT_TIMEOUT, () -> userId.equals(this.userId));
      if (userId.equals(this.userId)) {
        break;
      }
    }
  }

  @Override
  public void reportAccepted(String userId) {
    this.userId = userId;
  }
}

This looks like normal code, but Temporal makes it fully fault tolerant: if your processes are restarted, the state of the computation is fully preserved, including local variables and thread stacks. There is no need to talk to a database or a queue, or to run cron jobs.

Maxim Fateev