I am creating a real-time stock trading system and would like to provide the user with a human-readable, user-friendly way to refer to their orders. For example, the ID should be about 8 characters long and contain only upper-case characters, e.g. Z9CFL8BA. For obvious reasons the ID needs to be unique in the system.

I am using MongoDB as the backend database and have evaluated the following projects which do not meet my requirements.

hashids.org - this looks good but it generates ids which are too long:

var mongoId = '507f191e810c19729de860ea';
var id = hashids.encodeHex(mongoId);
console.log(id)

which results in: 1E6Y3Y4D7RGYHQ7Z3XVM4NNM

github.com/dylang/shortid - this requires that you specify a 64 character alphabet, and as mentioned I only want to use uppercase characters.

I understand that the only way to achieve what I am looking for may well be by generating random codes that meet my requirements and then checking the database for collisions. If this is the case, what would be the most efficient way to do this in a nodejs / mongodb environment?

Sylvain Leroux
user3391835

2 Answers

You're attempting to convert a base-16 (hexadecimal) value to base-36 (the 26 letters of the alphabet plus 10 digits). A simple way is to use parseInt's radix parameter to parse the hexadecimal id, and then call .toString(36) to convert it to base-36. Because the 24-digit id is too large for parseInt to represent exactly, the function below converts each half separately and concatenates the results. This turns "507f191e810c19729de860ea" into "VDFGUZEA49X1V50356", reducing the length from 24 to 18 characters.

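// Convert a hex string to upper-case base-36.
// The input is split in two because parseInt cannot represent a full
// 24-digit (96-bit) ObjectId exactly; each 12-digit half fits in a double.
function toBase36(id) {
  var half = Math.floor(id.length / 2);
  var first = id.slice(0, half);
  var second = id.slice(half);
  return parseInt(first, 16).toString(36).toUpperCase()
       + parseInt(second, 16).toString(36).toUpperCase();
}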

// Ignore everything below (for demo only)
function convert(e){ if (e.target.value.length % 2 === 0) base36.value = toBase36(e.target.value) }
var base36 = document.getElementById('base36');
var hex = document.getElementById('hex');
document.getElementById('hex').addEventListener('input', convert, false);
convert({ target: { value: hex.value } });
input { font-family: monospace; width: 15em; }
<input id="hex" value="507f191e810c19729de860ea">
<input id="base36" readonly>
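
As a hedged aside: on a runtime with BigInt support (Node 10.4 or later, which is an assumption about your environment), the whole id can be converted in one step instead of in two halves. The helper name hexToBase36 is made up for illustration and is not part of any library.

```javascript
// Sketch: convert an arbitrary-length hex string to upper-case base-36 using BigInt.
// hexToBase36 is an illustrative name, not a library function.
function hexToBase36(hex) {
  if (!/^[0-9a-fA-F]+$/.test(hex)) {
    throw new TypeError('expected a non-empty hexadecimal string');
  }
  // BigInt parses the full value exactly, so no splitting is needed.
  return BigInt('0x' + hex).toString(36).toUpperCase();
}

console.log(hexToBase36('507f191e810c19729de860ea'));
```

A 24-digit hex id can still need up to 19 base-36 characters this way, so an exact conversion does not get anywhere near 8 characters either.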
idbehold
  • Thank you for your reply and code. However, 18 characters is still quite a bit too long. I need something ideally around 8 characters long. – user3391835 May 21 '15 at 10:02

I understand that the only way to achieve what I am looking for may well be by generating random codes that meet my requirements and then checking the database for collisions. If this is the case, what would be the most efficient way to do this in a nodejs / mongodb environment?

Given your description, you use 8 chars in the range [0-9A-Z] as "id". That is 36⁸ combinations (≈ 2.8211099E12). Assuming your trading system does not gain insanely huge popularity in the short to mid term, the chances of collision are rather low.

So you can take an optimistic approach, generating a random id with something along the lines of the code below. (As noted by @idbehold in a comment, be warned that Math.random is probably not random enough and so might increase the chances of collision. If you go that way, you should investigate a better random generator [1].)

> rid = Math.floor(Math.random()*Math.pow(36, 8))
> rid.toString(36).toUpperCase()
30W13SW

Then, using a proper unique index on that field, you only have to loop, regenerating a new random ID until there is no collision when trying to insert a new transaction. As the chances of collision are relatively small, this should terminate. And most of the time this will insert the new document on first iteration as there was no collision.
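
The loop described above could look roughly like the sketch below. It rests on several assumptions: insertOrder is a hypothetical async function that inserts into a collection with a unique index on a field called orderId, generateId is whatever id generator you choose, and a collision surfaces as an error whose code is 11000 (MongoDB's duplicate-key error code).

```javascript
// Sketch of the optimistic insert-and-retry loop.
// Assumptions: `insertOrder(doc)` is a hypothetical async function backed by a
// collection with a unique index on `orderId`; on collision it rejects with an
// error whose `code` is 11000 (MongoDB duplicate key).
const DUPLICATE_KEY = 11000;

async function insertWithUniqueId(insertOrder, generateId, order, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = { ...order, orderId: generateId() };
    try {
      await insertOrder(doc);
      return doc; // inserted without collision
    } catch (err) {
      // Only retry on duplicate-key errors; anything else is a real failure.
      if (err.code !== DUPLICATE_KEY) throw err;
    }
  }
  throw new Error('could not generate a unique order id after ' + maxAttempts + ' attempts');
}
```

With the official Node.js driver, insertOrder would typically wrap collection.insertOne(doc), with the index created once via collection.createIndex({ orderId: 1 }, { unique: true }). Because the unique index is enforced by the server on insert, concurrent writers do not need a separate "check, then insert" step.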

If I'm not mistaken, even with 10 billion transactions already stored, you have only about a 0.35% chance of collision on the first attempt, and a little more than 0.001% chance of colliding on both the first and second attempts.
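
The arithmetic behind those figures can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes ids are drawn uniformly from the 36⁸ possibilities and that 10 billion of them are already taken; the variable names are illustrative.

```javascript
// Rough collision estimate, assuming uniformly random 8-character base-36 ids.
const idSpace = Math.pow(36, 8);   // number of possible ids
const existing = 1e10;             // assumed number of ids already stored

// Chance that one freshly generated id is already taken.
const pFirst = existing / idSpace;
// Chance that two independent attempts in a row both collide.
const pTwoInARow = pFirst * pFirst;

console.log('first attempt collides: ' + (pFirst * 100).toFixed(2) + '%');
console.log('first and second both collide: ' + (pTwoInARow * 100).toFixed(4) + '%');
```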


[1] On node, you might prefer using crypto.pseudoRandomBytes to generate your random id. You might build something around that, maybe:

> b = crypto.pseudoRandomBytes(6)
<SlowBuffer d3 9a 19 fe 08 e2>
> rid = b.readUInt32BE(0)*65536 + b.readUInt16BE(4)
232658814503138
> rid.toString(36).substr(0,8).toUpperCase()
'2AGXZF2Z'
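
One detail worth noting: toString(36) drops leading zeros, so the result can come out shorter than 8 characters (as in the 7-character example earlier). A possible alternative, assuming Node 14.10 or later where crypto.randomInt is available, is to draw a uniform integer below 36⁸ and left-pad it. The names randomOrderId, ID_LENGTH and ID_SPACE are made up for this sketch.

```javascript
const crypto = require('crypto');

const ID_LENGTH = 8;
const ID_SPACE = Math.pow(36, ID_LENGTH); // 36^8, well below randomInt's 2^48 range limit

// Sketch: fixed-length, upper-case base-36 id from a uniformly distributed integer.
// randomOrderId is an illustrative name, not a library function.
function randomOrderId() {
  return crypto.randomInt(ID_SPACE)   // uniform integer in [0, 36^8)
    .toString(36)
    .toUpperCase()
    .padStart(ID_LENGTH, '0');        // restore leading zeros dropped by toString
}

console.log(randomOrderId());
```

Uniqueness still has to come from the unique index; this only makes the ids fixed-length and evenly distributed.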
Sylvain Leroux
  • Be wary of `Math.random()` as it is not sufficiently random and will certainly increase your chances of collision compared to a better random number generator. – idbehold May 20 '15 at 18:57
  • @idbehold Yes, that is correct. Do you know of a better random generator available in JavaScript? – Sylvain Leroux May 20 '15 at 19:02
  • On node you can use [`crypto.randomBytes()`](https://nodejs.org/api/crypto.html#crypto_crypto_randombytes_size_callback). – idbehold May 20 '15 at 19:18
  • @idbehold Thank you for suggesting that. In this case, [crypto.pseudoRandomBytes](https://nodejs.org/api/crypto.html#crypto_crypto_pseudorandombytes_size_callback) is probably a better choice as we don't need cryptographically strong pseudo-random data -- and the OP will probably *not* want to block if the entropy pool is exhausted. – Sylvain Leroux May 20 '15 at 20:42
  • Thank you all for the replies. Do you have any suggestions for how to handle the collision resolution process in a multi-threaded environment where many threads may check the table of used ids? I understand that this is going to become less and less performant as time goes on and the table of already used ids grows larger and larger. What is the best way to implement this in MongoDB? – user3391835 May 21 '15 at 10:10