In DynamoDB, if you want to enforce uniqueness on a field other than the primary key (for example, you have a users table and want unique email addresses for users, while the primary key is a userid which is a number), is there a way other than scanning the table to see if the email is already in use?
5 Answers
Short answer: No.
DynamoDB is a key:value store. It is very good at quickly retrieving/saving items because it makes a couple of compromises. This is a constraint you have to handle yourself.
Nonetheless, depending on your actual model, it might be a good idea to use this field as your hash_key, or consider using a range_key.
If this is not possible, I advise you to de-normalize your data. You currently have something like:
UserTable
- hash_key: user_id
- e-mail
- ...
To ensure uniqueness, add a new table with this schema:
EmailUser
- hash_key: e-mail
- user_id
To make sure an e-mail is unique, just issue a GetItem to EmailUser before inserting.
This kind of de-normalization is quite common with No-SQL databases.
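A minimal boto3 sketch of that pattern, assuming the table names UserTable (hash_key `user_id`) and EmailUser (hash_key `email`) from the schemas above; the conditional put is not part of the original answer, it simply closes the read/write race discussed in the comments below:

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
users = dynamodb.Table("UserTable")   # hash_key: user_id
emails = dynamodb.Table("EmailUser")  # hash_key: email

def create_user(user_id, email):
    # 1. GetItem on the "custom index" table to see whether the e-mail is taken.
    if "Item" in emails.get_item(Key={"email": email}):
        raise ValueError("e-mail already in use")
    # 2. Reserve the e-mail; the condition makes a concurrent duplicate fail.
    try:
        emails.put_item(
            Item={"email": email, "user_id": user_id},
            ConditionExpression="attribute_not_exists(email)",
        )
    except ClientError as exc:
        if exc.response["Error"]["Code"] == "ConditionalCheckFailedException":
            raise ValueError("e-mail already in use")
        raise
    # 3. Write the main user record.
    users.put_item(Item={"user_id": user_id, "email": email})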

- So the question would be, does it make sense to make "email" a range key (or range attribute, I think, is the correct term) and keep the userid, which is a number, as the primary key? – Ali Oct 16 '12 at 20:40
- If you set ``e-mail`` as the range_key, you will only allow a single user_id to have multiple unique e-mails. This is not what you want to do. If you cannot use the e-mail as the user_id itself, you will need to de-normalize your data. I updated my answer with a possible way to achieve your goal. – yadutaf Oct 16 '12 at 21:14
- What do I achieve this way? I still need to check the uniqueness of the email against the second table before adding it to the first (main user table). Does it give me fewer API calls, or faster responses? And I also need to keep the two tables in sync myself, right? – Ali Oct 17 '12 at 00:05
- You're right. You need to maintain this index and manually check the e-mail against it. Using this table as a "custom index" will spare you a Scan and replace it with 2 GETs. This is not only much faster but also much cheaper. – yadutaf Oct 17 '12 at 14:45
- Thanks, I sure miss my beloved PostgreSQL. – Ali Oct 17 '12 at 15:25
- Keeping the EmailUser information up to date can now be handled with a Global Secondary Index. However, this answer doesn't really explain how to deal with eventual consistency. What if two servers check for the email address, don't find it, and then both insert a user with that email? – Jeff Walker Code Ranger May 12 '14 at 13:06
- @JeffWalkerCodeRanger: So what do you do in that instance? What is the industry-standard practice for dealing with eventual consistency in these scenarios, and is there a standard way of dealing with this via Dynamo specifically? Perhaps you do not know yourself, but I thought I'd ask. – JayPrime2012 Jul 09 '15 at 18:21
- @JayPrime2012: I know that Riak deals with this in the way that it's possible to get duplicate entries, and that you then deal with the situation on read. See allow_mult at http://docs.basho.com/riak/latest/dev/using/conflict-resolution/ – Alexander Torstling Sep 14 '15 at 07:14
- @JeffWalkerCodeRanger if you use the EmailUser table instead of the Global Secondary Index, one of the servers won't be able to insert the record. Yes, you need to maintain that table, but just updating the email if the user wants to change it shouldn't be a big deal. – Jesús Carrera May 11 '16 at 15:27
- It seems like to deal with eventual consistency, you can make a "strongly consistent read" to ensure the read incorporates all the latest updates. Is that true? If so, it would be handy info to put in this answer. @JayPrime2012 – wprl Apr 14 '17 at 20:51
- This is a really important consideration to understand when building a user table in modern software design. Convention tells us that a randomly generated GUID is good for both the hash/design space and thus query speed when using Dynamo; however, nowadays almost everything a user does is tied to their email address. When using a randomly generated hash key instead of their email address as the primary hash key, many issues arise. Personally, I think the answer is to design the user table or login system to use the email address as the primary key, which can then be used to force uniqueness. – fIwJlxSzApHEZIl May 08 '17 at 19:10
- @anon58192932 The email address would be great for a hash key as long as you do not need a foreign reference to the data. In any case, enforcing a unique constraint should be done via a hash key. If you need to reference the record in a foreign table, you should not have your primary hash key be anything that is subject to change. Allowing a user to change their email address is a common practice, which means the GUID strategy is the better option, leaving the implementation to have a separate table enforce uniqueness. – Steve Buzonas Jul 27 '17 at 00:48
- @JeffWalkerCodeRanger Global Secondary Indexes are not unique. – Steve Buzonas Jul 27 '17 at 00:53
- @SteveBuzonas indeed they aren't. After spending a good amount of time thinking about this issue I went with one table `User` with hash `GUID` and one table `Email` with hash `emailAddress`. Otherwise, as you said, a global secondary index on an email address as a secondary field in a `User` table would not enforce uniqueness. – fIwJlxSzApHEZIl Jul 28 '17 at 15:57
- EmailUser only has email as the hash key, so it is guaranteed to blow up if you try to add the same email for another user. So the GetItem is unneeded except as a convenience to get the user_id. Yes, there is a race between 2 or more writers, but only one is guaranteed to succeed if adding to EmailUser is done first. – Samantha Atkins Sep 18 '19 at 23:20
- This is actually an example of normalization, not denormalization... – david_adler Jul 05 '21 at 12:58
There is no built-in way to enforce uniqueness on another attribute, but using Python's boto3 library you can use the client to perform a transactional write: model the unique value as an extra item with a composite key and insert all of the items in a single call to transact_write_items(), so a duplicate entry makes the whole write fail. There is very nice documentation by AWS on this: https://aws.amazon.com/fr/blogs/database/simulating-amazon-dynamodb-unique-constraints-using-transactions/
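A hedged sketch of that idea with the boto3 client; the single table name "users" and the pk layout with USER#/EMAIL# prefixes are assumptions for illustration, not taken from the linked article:

import boto3
from botocore.exceptions import ClientError

client = boto3.client("dynamodb")

def create_user(user_id, email):
    # Both puts succeed or fail together; each is conditional on its key not
    # existing yet, so a duplicated user id or e-mail cancels the whole write.
    try:
        client.transact_write_items(
            TransactItems=[
                {"Put": {
                    "TableName": "users",
                    "Item": {"pk": {"S": f"USER#{user_id}"}, "email": {"S": email}},
                    "ConditionExpression": "attribute_not_exists(pk)",
                }},
                {"Put": {
                    "TableName": "users",
                    "Item": {"pk": {"S": f"EMAIL#{email}"}},
                    "ConditionExpression": "attribute_not_exists(pk)",
                }},
            ]
        )
    except ClientError as exc:
        if exc.response["Error"]["Code"] == "TransactionCanceledException":
            raise ValueError("user id or e-mail already in use")
        raise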

I don't think the approach below has been discussed yet, so I'm posting the relevant link for ensuring a unique email (or any other attribute). It does not require creating another table; instead you add additional items to the User table, with the attribute value in the primary-key column.
Alternative approach to ensure unique attributes in DynamoDB

DynamoDB per se does not support unique constraints, but you can ensure uniqueness by using an atomic counter and incorporating the counter value into your data.
In my case I have to make sure both `username` and `userId` have no duplicates. `username` is my partition key, so I won't have any problems using `attribute_not_exists(username)` to prevent overwrites.
For `userId` I first retrieve a new value from the atomic counter and then put it as my `userId` value. It may not be completely sequential, but it guarantees uniqueness in this sense.
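A rough boto3 illustration of the idea; the table names ("counters", "users") and attribute names are placeholders, not part of the original answer:

import boto3

dynamodb = boto3.resource("dynamodb")
counters = dynamodb.Table("counters")  # hypothetical counter table, key: counter_name
users = dynamodb.Table("users")        # partition key: username

def next_user_id():
    # UpdateItem with ADD is atomic: every caller gets a distinct value,
    # even under concurrency (unique, though not necessarily gapless).
    resp = counters.update_item(
        Key={"counter_name": "userId"},
        UpdateExpression="ADD current_value :inc",
        ExpressionAttributeValues={":inc": 1},
        ReturnValues="UPDATED_NEW",
    )
    return int(resp["Attributes"]["current_value"])

def create_user(username, email):
    user_id = next_user_id()
    # attribute_not_exists(username) prevents overwriting an existing user.
    users.put_item(
        Item={"username": username, "userId": user_id, "email": email},
        ConditionExpression="attribute_not_exists(username)",
    )
    return user_id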

- Seems great for things that can be randomly generated. Less great for things that are not random but still need to be unique. – Ellesedil Dec 13 '19 at 23:37
- This to me is a similar answer to just using a time-based GUID, which can be converted back and forth between a datetime and a GUID. The problem comes when you need to do things in a distributed fashion. And since it's a non-standard GUID per the specs, it's missing metadata. – Urasquirrel Apr 22 '21 at 21:47
You could implement a unique index yourself.
In short, you need to create two records for each user: one holds the user data and the second one the unique-index value. You also need to check a condition expression each time you insert a new record or update an existing one.
Here is an example of a single user's records:
{
  pk: 'some-user-id',
  record_type: 'record', // you need it to find the record with your user props
  email: 'some@some.com'
}
{
  pk: 'unique-index#some@some.com',
  record_type: 'unique-index'
}
Each time you insert a new user into the db, you need to insert both records using a DynamoDB write transaction and check that there is no other record with the same `pk`.
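For illustration, the transactional insert of those two records could look roughly like this with boto3 (the table name "users-table" is a placeholder):

import boto3

client = boto3.client("dynamodb")

def create_user(user_id, email):
    # Both records are written atomically; each ConditionExpression rejects
    # the transaction if an item with the same pk already exists.
    client.transact_write_items(
        TransactItems=[
            {"Put": {
                "TableName": "users-table",
                "Item": {
                    "pk": {"S": user_id},
                    "record_type": {"S": "record"},
                    "email": {"S": email},
                },
                "ConditionExpression": "attribute_not_exists(pk)",
            }},
            {"Put": {
                "TableName": "users-table",
                "Item": {
                    "pk": {"S": f"unique-index#{email}"},
                    "record_type": {"S": "unique-index"},
                },
                "ConditionExpression": "attribute_not_exists(pk)",
            }},
        ]
    )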
