How to allow just one out of two simultaneous popping (SELECT + DELETE) on the same row of data in a database relation?

Question

I am currently developing a backend system that has two endpoints of concern that interacts with a common relational database table. The main purpose of this system is an after-registration email verification system that has a time limit.

Let's suppose there are three tables that contain the users that are pending verification, already verified, and out of time for verification. These tables will contain similar attributes of the users. One user (represented by a unique ID) should exist in only one of these tables.

The first endpoint is the verification endpoint, which will be triggered by the user through a verification link (e.g., www.hello.com/verify?token=XXXX). The to-be-verified user will be searched through the pending table. If not found, it means that the token is expired and nothing will be done after. Otherwise, it will be moved to the verified table. Moving, in this case, means that the selected row will be removed from the first table, and then will be inserted into the second table. Therefore, at least 3 queries will be executed as below, with the last two could be on a single transaction.

SELECT * FROM pending WHERE pending.id = id;

DELETE FROM pending WHERE pending.id = id;
INSERT INTO verified VALUES (what we get from SELECT);

The second endpoint is the expired users cleaning endpoint, which will be triggered by some kind of scheduler. Let's assume it will be triggered exactly when the user's verification token just expired. The overall task will be similar to the first endpoint, but the data row will be moved into the out of time table instead, and we assume that the user is already verified when we could not find the specified user when using SELECT.

SELECT * FROM pending WHERE pending.id = id;

DELETE FROM pending WHERE pending.id = id;
INSERT INTO outoftime VALUES (what we get from SELECT);

I believe the problem may arise if these two endpoints are unfortunately triggered at the same time (i.e., the user verify themselves right at the expiration time) by two concurrent processes. Both processes might manage to successfully find the user from SELECT before running DELETE. Therefore, they both will also run INSERT, causing the user data to be inserted into two tables, violating our rule (one user should exist in only one of these tables).

An ideal solution for me would be to find a way to detect and "fail" one of the two processes, which will produce a similar result to the more common situation where that process starts after another process has already done its job (i.e., the second process will terminate when it fails to retrieve a user from SELECT). The choice of the process to be failed is not significant in this case; either of the two would work.

I am aware that using locks is one of the possible solutions theoretically, by covering each critical section with a lock acquisition and release. However, I am not sure whether it is a good practice or not in this problem.

Are there any common design patterns or ideas that could solve this problem? Please note that no specific technology/database stacks have been chosen yet.

Thanks!

Edit: There are multiple tables in this case since I found that the frequency of access in each type of user may not equal, so we could use different system specifications for each table. For example, the out-of-time table is more like an archive--just a big pile of data with minimal access, while the active table will be accessed every time there are changes to the user; so they might require better hardware, etc. Using a status column seems to be one solution though. However, will there be a similar situation in system design where this kind of problem is inevitable? How it will be dealt with?

Why are you using multiple tables rather than a single table with a status column? — NickW, Jan 16 '22 at 09:45
This seems like a very messy solution and creates many unecessary issues, maintain a single table with a status column. — Stu, Jan 16 '22 at 11:42
There are multiple tables in this case since I found that the frequency of access in each type of user may not equal, so we could use different system specifications for each table. For example, the out-of-time table is more like an archive--just a big pile of data with minimal access, while the active table will be accessed every time there are changes to the user; so they might require better hardware, etc. Using a status column seems to be one solution though. However, will there be a similar situation in system design where this kind of problem is inevitable? How it will be dealt with? — comtalyst, Jan 16 '22 at 21:09
“will there be a similar situation in system design where this kind of problem is inevitable?” : no — NickW, Jan 16 '22 at 21:23

How to allow just one out of two simultaneous popping (SELECT + DELETE) on the same row of data in a database relation?

0 Answers0