1

Wikipedia page for rainbow tables says:

"this use of multiple reduction functions approximately doubles the speed of lookups."

Assuming the hash sits at the "average" position in the chain, we take a hash and run it through a 9-iteration chain...

The original (single-reduction) table runs it through 4 reductions and 4 hashes to reach the end of the chain, then regenerates the chain from its start for another 5 hashes and 5 reductions. Total: 9 hashes and 9 reductions.

The rainbow table has to try R_{k-1}, then R_{k-2}, R_{k-3}, and R_{k-4}, each time recomputing the rest of the chain (1 + 2 + 3 + 4 = 10 hashes and reductions) to find the end of the chain, then another 5 hashes and 5 reductions to get the plaintext. Total: 15 hashes and 15 reductions...

What am I missing here? By my math the only time a rainbow lookup is even the same speed as a normal table is when the hash just happens to be at the very end of the chain... In fact the RT should be incrementally slower the further towards the beginning the hash lies...

For a chain of length 5,000 with the hash at the beginning, the rainbow table should be approximately 2,500 times slower than a normal hash chain table...
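The counting above can be reproduced with a small sketch. The function names and the `steps_to_end` parameter are made up for illustration, and one "step" means one hash plus one reduction. It only models a single chain in a single table, which is the comparison I am making here and not necessarily the one Wikipedia intends.

```python
def single_reduction_cost(chain_len, steps_to_end):
    """Steps for a table that uses one reduction function everywhere."""
    walk = steps_to_end                  # walk forward once to the endpoint
    regenerate = chain_len - steps_to_end  # replay from the chain start
    return walk + regenerate


def rainbow_cost(chain_len, steps_to_end):
    """Steps for a rainbow table with a different reduction per column."""
    # Guess the last column (1 step), then the one before (2 steps), ...
    walk = steps_to_end * (steps_to_end + 1) // 2
    regenerate = chain_len - steps_to_end
    return walk + regenerate


print(single_reduction_cost(9, 4), rainbow_cost(9, 4))   # 9 15
print(rainbow_cost(5000, 5000) / single_reduction_cost(5000, 5000))  # 2500.5
```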

Am I missing something or did Wikipedia make a mistake? (The paper referenced on that page would also be wrong, see its page 13, so I'm leaning towards the former.)

J V

2 Answers

2

The purpose of rainbow tables isn't necessarily to be faster but to reduce space. Rainbow tables trade speed for size.

Storing hashes for all possible 10-digit passwords, for example, would be prohibitively expensive in terms of disk space. You also need to consider that, since the dictionary space is so large, it will require significant paging (a very slow operation).

Rainbow tables are more CPU intensive, but they are much smaller, requiring less disk space and allowing more of the potential dictionary space to be held in memory at one time. Keep in mind that in the real world this means higher potential performance on large dictionary spaces due to less paging (disk reads are prohibitively slow).

Here is a better illustrated example: http://kestas.kuliukas.com/RainbowTables/

Of course this is all academic. Rainbow tables provide no value against well-designed security systems. 1) Use a cryptographically secure algorithm (no "roll your own"). 2) Use a key derivation function (with thousands of iterations) to slow the attacker's hash throughput. 3) Use a large (32 to 64 bit) random salt. Rainbow tables can no longer be precomputed, nor can that computation be used for any other system (unless they happen to share the same salt). 4) If possible, use a different salt per record, thus making a rainbow table completely invalid.
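A minimal sketch of points 2 to 4, using Python's standard `hashlib.pbkdf2_hmac`. The helper names, the iteration count, and the choice of SHA-256 are my own illustrative assumptions, not a recommendation for production parameters; the 8-byte salt just matches the upper end of the range mentioned above.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # illustrative "thousands of iterations"
SALT_BYTES = 8       # 64-bit random salt, generated per record


def hash_password(password):
    """Return (salt, digest) for storage; a fresh salt is drawn each call."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(digest, expected)
```

Because every record gets its own salt, the same password produces a different stored digest each time, so a table precomputed for one salt is useless for the next record.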

Gerald Davis
  • I am referring to the Wikipedia quote that says that rainbow tables are faster than *other* lookup tables, specifically those which use ordinary single-reduction-function hash chains. – J V Nov 15 '10 at 15:31
  • 1
    The statement is misleading and ambiguous. A rainbow table takes multiple steps to determine a match, so given an infinite amount of memory it would be orders of magnitude slower. However, memory constraints are real and access to the disk is prohibitively slow. Constrained by realistic limits on memory and realistic disk speeds, a rainbow table **CAN** end up being faster despite requiring multiple lookups per key. It depends on available memory vs the size of the dictionary space. Smaller the space (like looking up short common passwords) the less useful a rainbow – Gerald Davis Nov 15 '10 at 15:41
  • Could you explain this to me? What factor changes with a rainbow table that makes it faster? As I said in the OP, by my math the only time an RT will even equal a comparable lookup table is if its chain length is **1**. What makes up the difference? Chain count? Chain length? – J V Nov 15 '10 at 16:07
  • 1
    In terms of N, a rainbow table is **always slower**. However, that ignores the real world, where data must be read from disk (horribly slow). For large dictionary spaces it is simply not possible to have even a significant fraction of the data in memory. For example, <=8 char permutations of upper + lower + number along with a 256-bit hash would require ~40 Petabytes of storage. As the size of the dictionary space rises, the fraction of time spent paging approaches 100%. A rainbow table is a compromise between speed and memory. While one could make a ~40PB array, you won't have it in memory. – Gerald Davis Nov 15 '10 at 16:24
  • No, I'm not referring to a complete list of all hashes and their accompanying passwords. I'm referring to the difference in speed between a rainbow table and a table that uses a constant reduction function (and is hence open to merges). – J V Nov 15 '10 at 16:32
  • 2
    A table which uses a constant reduction function (a simple hash chain table) is subject to merges. Merged chains become ambiguous. To avoid this, one needs to limit the length of the chains, and that requires multiple lookup tables. Having to look up across multiple tables increases the cost. This is specifically the issue that rainbow tables were created to resolve. By making each reduction unique, one can eliminate most of the potential for merges, use a single unified table, and use longer chains. – Gerald Davis Nov 15 '10 at 16:41
  • So the increased length of the chains is what makes rainbow tables faster then? – J V Nov 15 '10 at 19:47
  • 2
    More precisely: having a separate reduction for each step is what **allows** longer chains without collision/merge. While a longer chain does result in more steps, it is preferable to needing multiple tables (simple hash chains). Since the R at each step is unique, the only way for a pair of chains to merge is if both chains resolve to the same final value. This is very rare but does happen. To compensate, during the creation of a rainbow table the chains are sorted and duplicate final values are replaced with new chains (by selecting a slightly different starting key). – Gerald Davis Nov 15 '10 at 20:18
  • So the exponential increase in reduction function calls is offset by needing fewer tables, sounds good to me! Answered :) – J V Nov 16 '10 at 15:32
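To make the thread above concrete, here is a toy sketch of a single rainbow table with a different reduction per column. Everything in it is an assumption for illustration: the 3-letter lowercase password space, MD5 as the hash, the `R(digest, column)` reduction, and the names `build_table` and `lookup`. It keeps one chain per endpoint, loosely mirroring the duplicate-endpoint handling described in the comments, but does not generate replacement chains.

```python
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PW_LEN = 3       # toy password space: 26**3 candidates
CHAIN_LEN = 50   # hash+reduce steps per chain


def H(pw):
    return hashlib.md5(pw.encode()).digest()


def R(digest, column):
    """Column-dependent reduction: map a digest back into the password space."""
    n = (int.from_bytes(digest[:8], "big") + column) % (len(ALPHABET) ** PW_LEN)
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def build_chain(start):
    pw = start
    for col in range(CHAIN_LEN):
        pw = R(H(pw), col)
    return pw  # only the endpoint is stored


def build_table(starts):
    table = {}
    for s in starts:
        table.setdefault(build_chain(s), s)  # keep one chain per endpoint
    return table


def lookup(table, target):
    # Guess which column the target hash sits in, last column first.
    for col in range(CHAIN_LEN - 1, -1, -1):
        pw = R(target, col)
        for c in range(col + 1, CHAIN_LEN):
            pw = R(H(pw), c)
        start = table.get(pw)
        if start is None:
            continue
        # Replay the chain from its start; this also rejects false alarms.
        cand = start
        for c in range(CHAIN_LEN):
            if H(cand) == target:
                return cand
            cand = R(H(cand), c)
    return None
```

The nested loop in `lookup` is where the 1 + 2 + ... + t cost from the question comes from: the later the guess, the shorter the walk to the endpoint.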
0

All the answers are in the original paper. First of all, you must compare a single rainbow table with t classical tables, t being the number of elements in a chain. Indeed, each column in the rainbow table acts like a single classical table (e.g. if you have two identical elements in a column of a rainbow table, you will have a merge; if you have two identical elements in a classical table, you also have a merge). Then you see that searching in t classical tables needs t^2 operations if you have to go through all the tables (t tables with chains of length t). Searching in the single rainbow table needs 1+2+3+...+t operations, which is approximately t^2/2. So in the worst case, where you don't find the password, you will be two times faster. Now if the password shows up on average after you have gone through half of the tables or columns, then it will be 4 times faster. If you want a high probability of success (e.g. 99%), then on average a password would already show up after 10% of the table, making rainbow tables 20x faster.
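The arithmetic behind those factors can be checked with a short sketch. The function names and the `fraction_searched` parameter are hypothetical, and the model is the simplified one used in this answer: each classical table searched costs t operations, and searching c columns of the rainbow table costs 1 + 2 + ... + c.

```python
def classical_ops(t, fraction_searched):
    """t tables with chains of length t; each table searched costs t operations."""
    tables = round(t * fraction_searched)
    return tables * t


def rainbow_ops(t, fraction_searched):
    """One table; trying the last c columns costs 1 + 2 + ... + c operations."""
    columns = round(t * fraction_searched)
    return columns * (columns + 1) // 2


def speedup(t, fraction_searched):
    return classical_ops(t, fraction_searched) / rainbow_ops(t, fraction_searched)


t = 10_000
for fraction in (1.0, 0.5, 0.1):
    print(fraction, speedup(t, fraction))
```

For large t the three ratios come out at roughly 2, 4, and 20, matching the worst case, the half-way case, and the 10% case.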

phish