I have a requirement to make a column of varchar(500) unique.
Putting unique constraint would not work as it crosses the size limit of unique.
Then, what would be the best strategy to make column unique?
-
1Are you stuck with the database structure or can you alter it? – six8 Oct 03 '11 at 18:24
-
Yes I can alter the structure. – Oct 03 '11 at 18:25
-
What is the size limitation on unique indexes? – James Johnson Oct 03 '11 at 18:27
-
By 500 I mean size larger than that, like if we put unique index on varchar(1000) field then it would give following warning: "Warning! The maximum key length is 900 bytes. The index 'UQ__tmp__43D2EAF819DFD96B' has maximum length of 1000 bytes. For some combination of large values, the insert/update operation will fail." – Oct 03 '11 at 18:32
-
2Size limit is 900 bytes for the key of an index. So varchar(900) or nvarchar(450). How many rows do you expect? – gbn Oct 03 '11 at 18:59
2 Answers
Create another field with the HASH of that varchar(500)
, and put a UNIQUE CONSTRAINT
on the hash field: HASHBYTES('md5', MyLongVarcharField)
This will cause poor performance but if you have a varchar(500)
where you need to enforce uniqueness I'm assuming performance isn't on the forefront of your considerations anyways.
EDIT:
To clarify, the chance of having two strings output the same 128 bit hash value is 1 in 340,282,366,920,938,000,000,000,000,000,000,000,000
. It's unlikely but not categorically impossible that you could have a collision.
If you are still concerned you can use SHA
or SHA1
algorithms which are 160 bits.

- 63,321
- 15
- 122
- 138
-
Won't hash collisions be a real issue with this solution, assuming there are enough rows of data? It would only be in really rare circumstances. – Matthew Vines Oct 03 '11 at 18:31
-
1Chances of a hash collision with md5 are 1/2^128. It's more likely the OP will be struck by lightning at the same time that a meteor hits him. – JNK Oct 03 '11 at 18:32
-
2^28 is 268,435,456 which is big, but not that big in certain systems. I agree it's not a deal breaker, but it's something to consider. – Matthew Vines Oct 03 '11 at 18:35
-
-
@JNK can you please provide some info or link on why to create HASH field and how it actually hits the performance? – Oct 03 '11 at 18:40
-
1@DarshanJoshi - **No.** For performance you will need to test, but it will negatively impact it since you are running a function against a long string for each row. For a reference, this is a very common technique. I'm not going to give you a citation on how a `WHERE` clause works either, but if you don't trust me you can check MSDN. – JNK Oct 03 '11 at 18:43
-
1One could use SHA1 to reduce collisions, both are subject to the birthday paradox anyway http://en.wikipedia.org/wiki/Birthday_paradox – gbn Oct 03 '11 at 19:18
You can use a DML trigger on INSERT and UPDATE. This gets around having to use a hash. I chose an AFTER trigger for simplicity, but this can be easily done with an INSTEAD OF trigger as well.
create trigger dml_EnforceUniqueVal
on dbo.UniqueBigColumn
after insert, update
as
declare @CountOfRepeats int
select @CountOfRepeats = COUNT(*)
from UniqueBigColumn
where somestring in
(
select somestring
from inserted
)
if @CountOfRepeats > 1
rollback
go