
How to implement an LRU cache in Erlang?

LRU Cache Wiki

The top-starred GitHub project was fogfish/cache, but its segmented table was not quite a fit for my data.

barrel-db/erlang-lru uses a list; after testing, it turned out to be slow when there was too much data.

I guess the problem is here:

move_front(List, Key) -> [Key | lists:delete(Key, List)].
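That guess is correct: `lists:delete/2` scans the list until it finds `Key`, so every cache touch costs O(n) in the number of cached entries. A small illustration (the `bench/0` helper is mine, purely to show the cost):

```erlang
%% Rebuilds the whole list on every access: lists:delete/2 traverses
%% the list to find Key, then the cons copies a new front cell.
move_front(List, Key) -> [Key | lists:delete(Key, List)].

%% Illustrative helper: touching the least-recently-used key of a
%% 100k-entry cache forces a full traversal; timer:tc/1 reports
%% the elapsed time in microseconds.
bench() ->
    List = lists:seq(1, 100000),
    {Micros, _} = timer:tc(fun() -> move_front(List, 100000) end),
    Micros.
```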

In Java, a better implementation uses a hashmap and a linked list, like this.

I tried to build a linked list, and then realized that a linked list is not a good idea in Erlang, as discussed in this thread.

So the question is: how do you implement an LRU cache in Erlang?

user3644708
  • I think Erlang is too high-level for doing a low-level cache, and currently Erlang has some similar features in core (like ETS, http://erlang.org/doc/man/ets.html), so have you tested some of these features before using external projects? – Mathieu Kerjouan Sep 26 '16 at 06:00
  • @MathieuK. Thanks for your comments. Yes, I tried. The key problem is the LRU ordering: I tried to use a table to save the access_time, but for every read/update I need to update (delete then insert) the table. I wonder if this could be done in a better way? – user3644708 Sep 27 '16 at 11:15
  • I don't have an answer to your question. If you want to implement a performant LRU cache in Erlang, I guess one of the best approaches is to use external code interconnected with [ports](http://erlang.org/doc/reference_manual/ports.html) or a [NIF](http://erlang.org/doc/tutorial/nif.html). C programming isn't my favorite domain, but if you want some examples of implementing C code for Erlang, you can check the [beam source code](https://github.com/erlang/otp/tree/maint/erts/emulator/beam). – Mathieu Kerjouan Sep 27 '16 at 12:14

2 Answers


The first implementation of the cache was based on ETS with two indexes. One ETS table holds the TTL -> Key relation; another maps Key -> Object. You can see the implementation at

https://github.com/fogfish/cache/commit/8cc50bffb4178ad9ad716703507c3290e1f94821
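A minimal sketch of that two-index scheme (module and function names are mine, not from the linked commit): an `ordered_set` ETS table keyed on `{ExpireTime, Key}` keeps the soonest-to-expire entries at the front, while a plain `set` table maps `Key` to the object.

```erlang
-module(ttl_cache).
-export([new/0, put/4, evict_expired/2]).

%% Two ETS indexes: TTLIdx is an ordered_set over {ExpireTime, Key},
%% Store maps Key -> Object.
new() ->
    TTLIdx = ets:new(ttl_idx, [ordered_set, public]),
    Store  = ets:new(store, [set, public]),
    {TTLIdx, Store}.

put({TTLIdx, Store}, Key, Object, TTLSeconds) ->
    Expire = erlang:monotonic_time(second) + TTLSeconds,
    ets:insert(TTLIdx, {{Expire, Key}, true}),
    ets:insert(Store, {Key, Object}),
    ok.

%% Walk the ordered index from the front; every {Expire, Key} with
%% Expire =< Now has timed out. (Re-inserting a key leaves a stale
%% index entry behind; the real library also handles that case.)
evict_expired({TTLIdx, Store} = Cache, Now) ->
    case ets:first(TTLIdx) of
        {Expire, Key} when Expire =< Now ->
            ets:delete(TTLIdx, {Expire, Key}),
            ets:delete(Store, Key),
            evict_expired(Cache, Now);
        _ ->
            ok
    end.
```

This shows why maintaining two indexes is costly: every write and every eviction must touch both tables, which is exactly the overhead the segmented design avoids.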

Maintaining two indexes was not efficient, so the segmented cache outperforms the original implementation. I would not recommend implementing per-object TTL with Erlang data structures unless you can model your data with actors and accept the overhead. There is an implementation that addresses this; it uses a process-per-object concept:

https://github.com/fogfish/pts

Otherwise, you need to implement a NIF.

fogfish

I've implemented an LRU cache using a pseudo-time approach (the full implementation is available here: https://github.com/poroh/erl_lru).

I have two data structures:

  1. Unordered map for lookup: #{key() => {order(), value()}}
  2. Ordered map for item ordering: gb_tree(order(), key())

Where order() is a pseudo-time:

  • It is incremented each time a new element is added or an existing element is updated
  • Each element in the LRU carries its own update time

Operations:

All operations have O(log(N)) complexity because of the gb_tree.

Add element (Key, Value):

  • Increment time (result is T)
  • Put Key => {T, Value} to unordered map
  • Put {T, Key} to ordered map
  • Check overflow

Update element (Key):

  • Find element in unordered map Key => {T0, Value}
  • Increment time (result is T)
  • Remove element Key from unordered map
  • Remove element T0 from ordered map
  • Add element (Key, Value) as above

Check overflow:

  • If the number of elements in the cache exceeds the allowed maximum
    • take the smallest element {T, Key} from the ordered map (gb_trees:take_smallest)
    • remove element T from the ordered map
    • remove element Key from the unordered map
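Condensed into code, the steps above can be sketched like this (a compact version of my own; the linked repository is the complete implementation):

```erlang
-module(lru).
-export([new/1, put/3, get/2]).

%% State: {Capacity, NextOrder, UnorderedMap, OrderedTree} where
%% UnorderedMap :: #{key() => {order(), value()}} and
%% OrderedTree  :: gb_trees mapping order() => key().
new(Capacity) -> {Capacity, 0, #{}, gb_trees:empty()}.

%% Add (or update) an element: stamp it with the next pseudo-time,
%% drop any stale order entry, then check for overflow.
put({Cap, T, Map, Tree}, Key, Value) ->
    {Map1, Tree1} =
        case maps:find(Key, Map) of
            {ok, {T0, _}} -> {maps:remove(Key, Map), gb_trees:delete(T0, Tree)};
            error         -> {Map, Tree}
        end,
    Map2  = Map1#{Key => {T, Value}},
    Tree2 = gb_trees:insert(T, Key, Tree1),
    check_overflow({Cap, T + 1, Map2, Tree2}).

%% Lookup also "touches" the element by re-stamping its pseudo-time.
get({Cap, T, Map, Tree}, Key) ->
    case maps:find(Key, Map) of
        {ok, {T0, Value}} ->
            Tree1 = gb_trees:insert(T, Key, gb_trees:delete(T0, Tree)),
            {Value, {Cap, T + 1, Map#{Key => {T, Value}}, Tree1}};
        error ->
            not_found
    end.

%% gb_trees:take_smallest/1 returns the oldest {Order, Key} along with
%% the tree with that entry already removed, so eviction is one call.
check_overflow({Cap, T, Map, Tree}) when map_size(Map) > Cap ->
    {_T0, Key, Tree1} = gb_trees:take_smallest(Tree),
    {Cap, T, maps:remove(Key, Map), Tree1};
check_overflow(State) -> State.
```

Note that `gb_trees:take_smallest/1` both reads and removes the oldest entry, so the last two eviction steps collapse into a single call.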
Dmitry Poroh