Given a long URL, I want to generate a short URL like those used on Twitter. Is there a way to implement this in Ruby?
4 Answers
The easiest way is to:
- keep a database of all URLs
- when you insert a new URL into the database, find out the id of the auto-incrementing integer primary key.
- encode that integer into base 36 or 62 (digits + lowercase alpha or digits + mixed-case alpha). Voila! You have a short url!
Encoding to base 36/decoding from base 36 is simple in Ruby:
12341235.to_s(36)
#=> "7cik3"
"7cik3".to_i(36)
#=> 12341235
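Combined with the three steps listed earlier, this gives a complete round trip. The following is only a minimal sketch: `ShortUrlStore` is a hypothetical class, and an in-memory Array stands in for the database table, with the array position plus one playing the role of the auto-incrementing primary key.

```ruby
# Hypothetical in-memory stand-in for a database table of URLs.
class ShortUrlStore
  def initialize
    @urls = []   # the row with id n lives at index n - 1
  end

  # Insert the URL, take its new id, and return that id in base 36.
  def shorten(long_url)
    @urls << long_url
    id = @urls.length          # mimics an auto-incrementing primary key
    id.to_s(36)
  end

  # Decode the base-36 code back to the id and look the URL up.
  def expand(code)
    id = code.to_i(36)
    return nil if id < 1       # avoids wrapping around with a negative index
    @urls[id - 1]
  end
end

store = ShortUrlStore.new
code  = store.shorten("http://example.com/some/very/long/path?with=params")
puts code
puts store.expand(code)
```

With a real database you would replace the Array with an insert that returns the new row's id and a lookup by primary key; the encoding and decoding steps stay the same.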
Encoding to base 62 is a bit trickier. Here's one way to do it:
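module AnyBase
  # Cache, per alphabet string, a lookup table from digit value to character.
  ENCODER = Hash.new do |h,k|
    h[k] = Hash[ k.chars.map.with_index.to_a.map(&:reverse) ]
  end
  # Cache, per alphabet string, a lookup table from character to digit value.
  DECODER = Hash.new do |h,k|
    h[k] = Hash[ k.chars.map.with_index.to_a ]
  end
  def self.encode( value, keys )
    ring = ENCODER[keys]
    base = keys.length
    return ring[0] if value == 0   # otherwise 0 would encode to ""
    result = []
    until value == 0
      result << ring[ value % base ]
      value /= base
    end
    result.reverse.join
  end
  def self.decode( string, keys )
    ring = DECODER[keys]
    base = keys.length
    # chars.map.with_index works whether chars returns an Enumerator (1.9) or an Array (2.0+)
    string.reverse.chars.map.with_index.inject(0) do |sum,(char,i)|
      sum + ring[char] * base**i
    end
  end
end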
...and here it is in action:
base36 = "0123456789abcdefghijklmnopqrstuvwxyz"
db_id = 12341235
p AnyBase.encode( db_id, base36 )
#=> "7cik3"
p AnyBase.decode( "7cik3", base36 )
#=> 12341235
base62 = [ *0..9, *'a'..'z', *'A'..'Z' ].join
p AnyBase.encode( db_id, base62 )
#=> "PMwb"
p AnyBase.decode( "PMwb", base62 )
#=> 12341235
Edit
If you want to avoid URLs that happen to be English words (for example, four-letter swear words) you can use a set of characters that does not include vowels:
base31 = ([*0..9,*'a'..'z'] - %w[a e i o u]).join
base52 = ([*0..9,*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U]).join
However, with this you still have problems like AnyBase.encode(328059,base31), AnyBase.encode(345055,base31), or AnyBase.encode(450324,base31), where digits stand in for the missing vowels. You may thus want to avoid vowel-like numbers as well:
base28 = ([*'0'..'9',*'a'..'z'] - %w[a e i o u 0 1 3]).join
base49 = ([*'0'..'9',*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U 0 1 3]).join
This will also avoid the problem of "Is that a 0 or an O?" and "Is that a 1 or an I?".
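Dropping characters shrinks the alphabet, so codes of a given length cover fewer ids. A rough way to see the cost, assuming the base28 and base49 sets defined above (`capacity` is a hypothetical helper, not part of AnyBase):

```ruby
base28 = ([*'0'..'9', *'a'..'z'] - %w[a e i o u 0 1 3]).join
base49 = ([*'0'..'9', *'a'..'z', *'A'..'Z'] - %w[a e i o u A E I O U 0 1 3]).join

# Number of distinct non-negative ids that fit in at most `length` characters.
def capacity(alphabet, length)
  alphabet.length**length
end

[base28, base49].each do |alphabet|
  puts "base#{alphabet.length}: #{capacity(alphabet, 5)} ids fit in 5 characters"
end
```

If the reduced alphabet runs out too quickly for your id range, the codes simply grow by one character; nothing else in the scheme changes.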

- You have helped me many times and I really appreciate it. Every answer you offer is great. Thank you very much. – ywenbo Jun 14 '11 at 05:47
- Be careful with this implementation, as it will generate human-readable words that you might want to avoid. E.g. 645860.to_s(36) = 'duck'. 739172 might be considered a little worse. I recommend going with base20 (0-9a-j) or similar. To help decide, you can even use an anagram finder like 'wordplay' like so: `wordplay 'aaabbbcccdddeeefffggghhhiiijjj'` - it shows you which words can be generated. – user569825 Jun 10 '12 at 11:32
- @user569825 Thanks for the note (though this seems like a poor reason to downvote). I'd suggest perhaps a base set of characters without any vowels as a simpler and less-impactful way. – Phrogz Jun 10 '12 at 13:24
- @Phrogz Sorry, I was a little too quick with that while 'in the flow'. I'll try to upvote once edited. – user569825 Jun 11 '12 at 08:50
- Upvoted :) Regarding case, one might want to consider this Google suggestion (https://code.google.com/p/shortlink/wiki/Specification), which states that URIs should be case-insensitive. That is also a good basic "counter-confusion" tactic. – user569825 Jun 12 '12 at 17:32
- For Ruby 2.0, replace the 3rd line of the decode method with: `string.reverse.chars.map.with_index.inject(0) do |sum,(char,i)|` – Camille Dec 29 '13 at 19:45
- The base28 and base49 sets still include 0, 1 and 3 because `*0..9` yields an integer array while `%w[0 1 3]` yields a string array (tested with ruby 2.1.5). The correct sets should be: `base28 = ([*'0'..'9',*'a'..'z'] - %w[a e i o u 0 1 3]).join` and `base49 = ([*'0'..'9',*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U 0 1 3]).join` (note the quotes around numbers). – Fatih Jun 09 '15 at 08:25
- Thanks for the awesome code BTW, it makes me look cool to my project manager :) – Fatih Jun 09 '15 at 08:33
- Thanks for the correction and compliment, Fatih. Updated code above to correct. – Phrogz Jun 09 '15 at 20:24
On Ruby 2.0, String#chars returns an Array, which has no with_index method, so the decode method needs a map before with_index:
def self.decode( string, keys )
  ring = DECODER[keys]
  base = keys.length
  string.reverse.chars.map.with_index.inject(0) do |sum,(char,i)|
    sum + ring[char] * base**i
  end
end

You could generate short URLs through the API of an existing URL-shortening service. Almost all of these services provide an API you can call to shorten URLs, which is exactly what Twitter clients do. Check the particular service's website for details.
If you want to create such a service on your own, that can also be quite simple: all you need is an internal mapping (in a database) between the original long URL and a short URL that you generate. When you receive a request for a particular short URL, look up the original long URL in the database and redirect the user to it.
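The lookup-and-redirect half of such a service can be sketched as below. This is an illustration under assumptions, not a finished service: the `urls` Hash is a hypothetical stand-in for the database mapping, and the lambda only follows the Rack calling convention (request env Hash in, `[status, headers, body]` out) without requiring the rack gem.

```ruby
# Hypothetical mapping of short codes to long URLs; a real service would
# query its database here instead.
urls = {
  "7cik3" => "http://example.com/some/very/long/path"
}

# Rack-style handler: takes a request env Hash, returns [status, headers, body].
redirector = lambda do |env|
  code = env["PATH_INFO"].to_s.sub(%r{\A/}, "")   # "/7cik3" -> "7cik3"
  if (long_url = urls[code])
    [301, { "location" => long_url }, []]
  else
    [404, { "content-type" => "text/plain" }, ["Unknown short URL\n"]]
  end
end

status, headers, _body = redirector.call("PATH_INFO" => "/7cik3")
puts status
puts headers["location"]
```

A 301 tells clients the redirect is permanent; a 302 may be preferable if you want every visit to reach your server.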
