5

When I create a list in Erlang, such as in the Erlang shell:

1> [1, 2].

From what I understand, in the vm this list will be represented as a singly linked list.

How is this structure created by the Erlang runtime? For example, is it constructed something like this:

  1. create a structure in memory to hold an list that terminates the list
  2. create a structure in memory to hold the item '2', and a reference to the empty list.
  3. create a structure in memory to hold the item '1', and a reference to item '2'.

Am I correct in thinking the following c and erlang code is where the bulk of the work is done?

erl_term.h contains a macro make_list but I haven't been able to find the implementation yet...

2240
  • 1,547
  • 2
  • 12
  • 30
Chris Snow
  • 23,813
  • 35
  • 144
  • 309

1 Answers1

4

The Erlang VM implementation BEAM uses a technique which dates back to first Lisp implementations back to the 60s or early 70s. It is sometimes referred as tagged or typed pointers. (Tags) This technique does not store type of a target in a target object (lists CONS in this case) but in the pointer itself or save a scalar value in the place where is usually a pointer. It allows save a quite good amount of memory especially in dynamically typed languages as LISP or Erlang. (It was interesting in old days when memory was very expensive and become important again nowadays when CPU become much faster than memory and cache miss/hit determines a speed of algorithms.) As a drawback, it also leads to a little bit confusing code. The whole part which deals with list construction starts at line 216 of erl_term.h. You can note there is macro

#define _unchecked_make_list(x) ((Uint) COMPRESS_POINTER(x) + TAG_PRIMARY_LIST)

which is macro you are looking for. It is your make_list. The line

_ET_DECLARE_CHECKED(Eterm,make_list,const Eterm*)

Would make a checked version of it when compiled with ET_DEBUG. (See more details.) Macro make_list

#define make_list(x)        _ET_APPLY(make_list,(x))

would just call the checked or unchecked version of make_list. The macros which really constructing list are

#define CONS(hp, car, cdr) \
        (CAR(hp)=(car), CDR(hp)=(cdr), make_list(hp))

#define CAR(x)  ((x)[0])
#define CDR(x)  ((x)[1])

The list cell structure is simply two consecutive Uint values on the heap (hp) which address is compressed and tagged (See _unchecked_make_list). I hope this description helps you.

Hynek -Pichi- Vychodil
  • 26,174
  • 5
  • 52
  • 73
  • Many thanks for this excellent answer. I'd like to dig deeper in understanding the underlying concepts behind the erlang implementation as you have described above. Would studying this minimal lisp implementation help me understand those concepts: https://gist.github.com/sanxiyn/523967 – Chris Snow Aug 06 '15 at 08:08
  • @ChrisSnow: This minimal implementation does not use the tagging technique. It uses normal structures with type stored inside the value object https://gist.github.com/sanxiyn/523967#file-lisp-c-L46 – Hynek -Pichi- Vychodil Aug 06 '15 at 08:47
  • Is there source code available that you are aware of for a lisp implementation that uses the tagging technique? – Chris Snow Aug 06 '15 at 10:24
  • 2
    @CrisSnow: I am not aware of LISP implementation which are using it currently. Keep in mind that there are more differences between Erlang and LISP which makes this technique more beneficial for Erlang. The most important is immutability which causes Erlang can't make cyclic structures which then leads to different GC which doesn't require any surplus data stored in values. You can found article about [Tagged pointer](https://en.wikipedia.org/wiki/Tagged_pointer) in Wikipedia with some links to LISP machines using [Tagged architecture](https://en.wikipedia.org/wiki/Tagged_architecture). – Hynek -Pichi- Vychodil Aug 06 '15 at 13:28