-1

Users of my application can authenticate via Facebook, so I store facebook uids in my database and when user logs in I need to query my database like this:

SELECT * FROM users WHERE uid = _SOME_UID_;

Now uid column is VARCHAR, but I think that I need to convert it to some numeric type BIGINT.

Why I think I need to do this:

  1. processor time: in general operations with a numeric (filtering/indexing) are always faster than same operations with string

  2. storage: numeric is smaller than corresponding string

  3. ammm... oauth-argument Facebook authentication is the only type of authentication I'm going to use (actualy, this is a canvas application) - so I don't need to care for UIDs that are non-numerics

And questions are:

  1. am I right?
  2. Can Facebook someday start using non-numeric uids?
Nikita Hismatov
  • 1,546
  • 1
  • 13
  • 30
  • I don't think they'll evere start using non-numeric IDs that would cause alot of trouble in their system (since, I guess, they use some kind of format-check on the userIDs aswel). Bigint will do just fine. – Joshua - Pendo Mar 29 '13 at 18:57
  • varchar can be used as a map to find the machine on which data is stored and other unfamiliar tricks can be used ( you could do the same with numbers but would be more complicated to get the right digit out) . I am guessing that the storage systems facebook use are more complicated than traditional databases and could potentially be using varchars to their advantage. Finally any software engineer thinks of using a number as a key as the initial thought. I am sure Facebook has thought along these lines and chosen to use varchar for good reason – Osama Javed Mar 29 '13 at 19:23

2 Answers2

1

I would stick with VARCHAR.

While it certainly can be stored as one, a Facebook UID in my opinion isn't really a number. It's a collection of characters that just so happen to all be numbers --but it's an identifier, not a number to be manipulated.

I wonder what the performance difference would really be. Find out that, and it will be easier to make a decision.

Jason Macgowan
  • 524
  • 7
  • 15
  • Well, roughly BIGINT is 8 bytes. UID in VARCHAR is 10 bytes. So, the storage is not the best argument. Maybe processor time in working with index is the key... – Nikita Hismatov Mar 29 '13 at 19:19
1

It depends on a couple of factors. First and foremost how big is your data collection likely to get are we talking thousands of UIDs or Millions? If you are not worried about the scale if your project then trying to find the smallest container (a large integer) then you are much more worried about being able to scale with changes.

So now you need to choose VarChar of a reasonable size which will take any format of UID and be happy- this includes leading zeros and bigInt which is easier to sanitise for but unfriendly to leading zeros and letters.

If you know that the UID is always a non-leading zero or zero padded number then you can get slightly better security by sanitising by way of casting to type. That and size are all the benefits of bigInt.

VarChar has the downfall of size - you need to know the biggest UID length before you start but leading zeros, letters and other unexpected content can be handled with the small exception that you are now cleaning a string.

From a programming point of view it makes very little odds as a UID is just a label that is unique. With an index on it I should not imagine that there is any speed difference in MySQL at least not with the common engines it might be using.

So best is simply a product of what you need most from the data.