7

How does lua handle a table's growth?

Is it equivalent to the ArrayList in Java? I.e. one that needs continuous memory space, and as it grows bigger than the already allocated space, the internal array is copied to another memory space.

Is there a clever way to led with that?

My question is, how is a table stored in the memory? I'm not asking how to implement arrays in Lua.

Al.G.
  • 4,327
  • 6
  • 31
  • 56
Johnny Willer
  • 3,717
  • 3
  • 27
  • 51
  • 1
    *We implement arrays in Lua simply by indexing tables with integers. Therefore, arrays do not have a fixed size, but grow as we need.* http://www.lua.org/pil/11.1.html – Robert Harvey Apr 28 '15 at 19:40
  • 1
    @RobertHarvey as I have edited my question, my doubt is how Lua handles this is "memory vision". – Johnny Willer Apr 28 '15 at 19:57
  • "array" is an "under the hood" term. A Lua table is a set of key-value pairs where neither the key nor the value can be `nil`. If a table has positive integer keys that are contiguous and start at 1, the table is said to "have a sequence." The length operator `#` only applies to sequences (or "empty sequences") and `ipairs` ends before the first missing positive integer key. The implementation possibilities follow from that. – Tom Blodget Apr 28 '15 at 20:45

2 Answers2

8

(Assuming you're referring to recent versions of Lua; describing the behavior of 5.3 which should be (nearly?) the same for 5.0-5.2.)

Under the hood, a table contains an array and a hash part. Both (independently) grow and shrink in power-of-two steps, and both may be absent if they aren't needed.

Most key-value pairs will be stored in the hash part. However, all positive integer keys (starting from 1) are candidates for storing in the array part. The array part stores only the values and doesn't store the keys (because they are equivalent to an element's position in the array). Up to half of the allocated space is allowed to be empty (i.e. contain nils – either as gaps or as trailing free slots). (Array candidates that would leave too many empty slots will instead be put into the hash part. If the array part is full but there's leftover space in the hash part, any entries will just go to the hash part.)

For both array and hash part, insertions can trigger a resize, either up to the next larger power of two or down to any smaller power of two if sufficiently many entries have been removed previously. (Actually triggering a down-resize is non-trivial: rehash is the only place where a table is resized (and both parts are resized at the same time), and it is only called from luaH_newkey if there wasn't enough space in either of the two parts1.)

For more information, you can look at chapter 4 of The Implementation of Lua 5.0, or inspect the source: Basically everything of relevance happens in ltable.c, interesting starting points for reading are rehash (in ltable.c) (the resizing function), and the main interpreter loop luaV_execute (in lvm.c) or more specifically luaV_settable (also there) (what happens when storing a key-value pair in a table).


1As an example, in order to shrink a table that contained a large array part and no hash, you'd have to clear all array entries and then add an entry to the hash part (i.e. using a non-integer key, the value may be anything including nil), to end up with a table that contains no array part and a 1-element hash part.
If both parts contained entries, you'd have to first clear the hash part, then add enough entries to the array part to fill both array and hash combined (to trigger a resize which will leave you with a table with a large array part and no hash), and subsequently clear the array as above.2 (First clearing the array and then the hash won't work because after clearing both parts you'll have no array and a huge hash part, and you cannot trigger a resize because any entries will just go to the hash.)

2Actually, it's much easier to just throw away the table and make a new one. To ensure that a table will be shrunk, you'd need to know the actual allocated capacity (which is not the current number of entries, and which Lua won't tell you, at least not directly), then get all the steps and all the sizes just right – mix up the order of the steps or fail to trigger the resize and you'll end up with a huge table that may even perform slower if you're using it as an array… (Array candidates stored in the hash also store their keys, for half the amount of useful data in e.g. a cache line.)

nobody
  • 4,074
  • 1
  • 23
  • 33
  • 1
    I believe only insertions can trigger a resize. This is why it is allowed to clear existing fields during an iteration using [next()](http://www.lua.org/manual/5.1/manual.html#pdf-next), but it is undefined if you add fields. – siffiejoe Apr 28 '15 at 22:03
  • First, thanks for your help, but sorry, i'm dummy in C, I have read that file Itable.c but i'm didn't understand some things: what happens if a key hash colide? if the hash part grows (or shrink) is it copied to another part of memory (to preserve continuous allocation)? I can't find a detailed explaination about that – Johnny Willer Apr 29 '15 at 13:01
  • @JohnnyWiller: The hashing is (briefly) explained in chapter 4 of the implementation paper (see page 8, last paragraph of that chapter) and more information can be found in [the paper referenced there](http://maths-people.anu.edu.au/~brent/pd/rpb013.pdf). As for resizing: Did you read the implementation paper? It's described there (page 7, 2nd paragraph) – yes copied, yes contiguous. – nobody Apr 30 '15 at 03:05
  • @siffiejoe Right, completely skipped thinking on that part – thanks for noticing! Should be fixed now, and after more reading & testing I'm pretty sure that what I'm describing now is really what happens. – nobody Apr 30 '15 at 03:12
1

Since Lua 5.0, tables are an hybrid of hash table and array. From The Implementation of Lua 5.0:

New algorithm for optimizing tables used as arrays:

Unlike other scripting languages, Lua does not offer an array type. Instead, Lua programmers use regular tables with integer indices to implement arrays. Lua 5.0 uses a new algorithm that detects whether tables are being used as arrays and automatically stores the values associated to numeric indices in an actual array, instead of adding them to the hash table. This algorithm is discussed in Section 4.

Prior versions had only the hash table.

Diego
  • 1,789
  • 10
  • 19
  • @JohnnyWiller What about it? The GC scans both the array part and the hash part of the table. – Colonel Thirty Two Apr 28 '15 at 21:03
  • @ColonelThirtyTwo yes, I know that. I was asking about if table hash part grows, is it copied to another part of the memory, to preserve continuous allocation (like java do)? – Johnny Willer Apr 29 '15 at 13:06
  • @JohnnyWiller `After computing the new sizes, Lua creates the new parts and re-inserts the elements from the old parts into the new ones` So I guess they remain contiguous after growth. Read Section 4 of the file I linked, they explain it better than I can :-) – Diego Apr 29 '15 at 13:16