3

Normally, the INTEGER data type would suffice, but being in South Africa the ID numbers have a length of 13 and the INTEGER data type only goes up to 10. I am not fond of using characters like VARCHAR since it would not restrict the input ID number to integer values only. I only solution I see (other to using VARCHAR) is to use DECIMAL. Only problems that I see are that I can't restrict the max size like in VARCHAR and the data input could have ',' and '.' Any comments?

Frapie
  • 225
  • 1
  • 6
  • 14

4 Answers4

5

Assuming that you're referring to South African national ID numbers, which according to Wikipedia always have 13 digits, then I would go for CHAR(13) with a CHECK constraint (a CLR user-defined data type might also be an option).

The main reason is that the 'number' is not a number, it's an ID. You can't add, subtract, multiply etc. the values so there is no benefit in using a numeric data type. Furthermore, the ID is composed of components that have their own meaning, so being able to parse them out is presumably important (and easier when using character data types).

In fact, depending on how you use this data, you could also add columns that store the individual components of the ID (DOB, sequence, citizenship), either as computed columns or real columns. This could be convenient for querying and reporting (and indexing), especially if you converted the DOB to a date or datetime column.

Pondlife
  • 15,992
  • 6
  • 37
  • 51
  • 2
    Yes, you have three ways to go with IDs like this. 1) Assume every id you get is good. 2) Assume every id you get might be bad, and verify the checksum. 3) Assume every id you get might be bad, and verify the component parts. If I have my choice, I never assume every id is good. I'd verify the checksum at the very least. – Mike Sherrill 'Cat Recall' May 06 '13 at 20:05
4

Just use BIGINT, it ranges from -9223372036854775808 to 9223372036854775807 which should be enough for your application.

Luke Walsh
  • 281
  • 1
  • 7
4

I would indeed use VARCHAR with a CHECK that matches the format. You can even be more sophisticated if there is internal validation, e.g. a check digit. Now you are all set for other countries that have an alphabetic character, or if you need to handle a leading zero.

I wouldn't use an integer unless it makes sense to do some sort of arithmetic on the field, which is almost certainly not true here.

Andrew Lazarus
  • 18,205
  • 3
  • 35
  • 53
  • 1
    **Never**, ever store numbers in a varchar column. Not even with a check constraint that controls the format. –  May 06 '13 at 17:18
  • 1
    @a_horse_with_no_name, Would you care to supply a reason? What, except a few bytes, do you _gain_ by using numbers here? The ID isn't really a "number", you don't add or average it, it's a string with only numbers in it. And it's highly likely to _still_ need validation. – Andrew Lazarus May 06 '13 at 17:24
  • 1
    A lot of things can go wrong when storing numbers in character columns. Sorting comes to mind (10 will sort before 2) you cannot reliably compare them using `<` or `>` (or even `=` - think about `001` vs. `1`). You will find yourself casting the character values back and forth. And without a check constraint this will become an absolute nightmare anyway. *If* this is rather a "code" than a numeric ID, then it might make sense to use varchar but we don't know that I do agree with you there. –  May 06 '13 at 17:28
  • 2
    This has appeared with [zip codes](http://stackoverflow.com/questions/893454/is-it-a-good-idea-to-use-an-integer-column-for-storing-us-zip-codes-in-a-databas). – Andrew Lazarus May 06 '13 at 17:33
  • 1
    A ZIP code is *not* a "number" and should definitely be stored in a varchar column. –  May 06 '13 at 19:22
  • 1
    Well, a USA postal code is numeric. On the other hand, a California state ID card has an alphabetic prefix. I would view string equality and string sorting of both zip codes and ID fields as a feature, not a bug. – Andrew Lazarus May 06 '13 at 19:51
  • I'd guess there are much more countries where the ZIP code is **not** numeric. –  May 06 '13 at 20:05
  • A zip code isn't even numeric in the US: the [ZIP+4](http://en.wikipedia.org/wiki/ZIP_code#ZIP.2B4) format is 12345-6789, which definitely isn't a number – Pondlife May 06 '13 at 20:29
  • 1
    @Pondlife But its values are numeric and how people think about them is "number" in general. Andrew's point is that the data here is likely not _number_ and thus no-name-horse's concern is irrelevant/invalid. – TylerH Apr 14 '23 at 15:03
-2

You could use money as well, although it appears you only get 4 digits after the decimal place. The money type is 8 bytes, giving you a range from -922,337,203,685,477.5808 to 922,337,203,685,477.5807.

declare @num as money
select @num = '1,300,000.45'

select @num

Results in:

1300000.45

The parsing of commas and periods might be dependent on your specific culture settings, although I don't know that for sure.

chue x
  • 18,573
  • 7
  • 56
  • 70