
I'm adding code to the ws2812 module to provide a reusable buffer in which LED values can be stored.

The current version is there.

I have two problems.

First, I wanted an "OO-style" interface, so I wrote:

local buffer = ws2812.newBuffer(300);
for j = 0,299 do
   buffer:set(j, 255, 255, 255)
end
buffer:write(pin);

The problem here is that buffer:set is resolved on every loop iteration, which is costly (this loop takes ~20.2 ms):

8       [2]     FORPREP         1 6     ; to 15
9       [3]     SELF            5 0 -7  ; "set"
10      [3]     MOVE            7 4
11      [3]     LOADK           8 -8    ; 255
12      [3]     LOADK           9 -8    ; 255
13      [3]     LOADK           10 -8   ; 255
14      [3]     CALL            5 6 1
15      [2]     FORLOOP         1 -7    ; to 9

I found a workaround for this problem, but it doesn't look "nice":

local buffer = ws2812.newBuffer(300);
local set = getmetatable(buffer).set;
for j = 0,299 do
   set(buffer, j, 255, 255, 255)
end
buffer:write(pin);

It works well (4.3 ms for the loop, more than 4 times faster), but it feels like a hack. :/ Is there a better way to "cache" the buffer:set resolution?

Second question: in my C code, I use:

ws2812_buffer * buffer = (ws2812_buffer*)luaL_checkudata(L, 1, "ws2812.buffer");

This returns my buffer pointer and checks that it really is a ws2812.buffer. But this call is sloooooow: ~50 µs on my ESP8266. If it's done on every call (for example, my 300 buffer:set calls), that adds up to ~15 ms!

Is there a better way to fetch userdata and check its type, or should I add a "canary" at the beginning of my structure and do my own check (which would be almost "free" compared to 50 µs)?
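A minimal sketch of the canary idea, written as plain C without the Lua headers so it stands alone. The magic value, the field layout, and the helper names `ws2812_check_buffer` and `ws2812_buffer_set` are assumptions for illustration, not the module's actual code. In the module itself the pointer would presumably come from `lua_touserdata(L, 1)`, which returns NULL for values that are not userdata.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value ("WS28" in ASCII); any unlikely constant works. */
#define WS2812_BUFFER_CANARY 0x57533238u
#define WS2812_DEMO_LEDS 4 /* fixed size to keep the sketch short */

/* Assumed layout: the real ws2812_buffer in the module may differ. */
typedef struct {
    uint32_t canary;                       /* first field, checked on every call */
    size_t   size;                         /* number of LEDs */
    uint8_t  values[3 * WS2812_DEMO_LEDS]; /* 3 bytes per LED */
} ws2812_buffer;

/* Cheap type check: one load and one compare instead of a metatable lookup.
 * Returns NULL when the pointer does not look like a ws2812_buffer. */
static ws2812_buffer *ws2812_check_buffer(void *ud) {
    ws2812_buffer *buffer = (ws2812_buffer *)ud;
    if (buffer == NULL || buffer->canary != WS2812_BUFFER_CANARY) {
        return NULL;
    }
    return buffer;
}

/* Stores one LED value; returns 0 on success, -1 on a bad buffer or index. */
static int ws2812_buffer_set(void *ud, size_t led,
                             uint8_t a, uint8_t b, uint8_t c) {
    ws2812_buffer *buffer = ws2812_check_buffer(ud);
    if (buffer == NULL || led >= buffer->size) {
        return -1;
    }
    buffer->values[3 * led + 0] = a;
    buffer->values[3 * led + 1] = b;
    buffer->values[3 * led + 2] = c;
    return 0;
}
```

Note that this check is weaker than luaL_checkudata: a userdata of another type whose first bytes happen to match the canary would pass, and a userdata smaller than the canary field would be read out of bounds.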

mpromonet
Alkorin
  • Please report how much faster the second version is compared to the first. – lhf Dec 26 '15 at 16:04
  • Added in post, 4.3ms instead of 20.2ms ! – Alkorin Dec 27 '15 at 20:23
  • The best way is to vectorize the API function e.g. `setvec(jstart, jend, rstart, rend, gstart, gend, bstart, bend)`, then you've got **one** C library call: `buffer:set(0, 299, 255, 255, 255, 255, 255, 255)` and the overhead of the one table lookup is largely irrelevant. – TerryE Dec 28 '15 at 00:38
  • Yes and no @TerryE, I can't predict the next color: it mainly comes from a lookup table and/or some math function. That's why I'm trying to optimise the simple call ;) But for some cases, I plan to handle a table of tables so that stored patterns can be reused: `set(start, {{g, r, b},{g, r, b},...})` – Alkorin Dec 28 '15 at 18:38

1 Answer


To make it look less like a hack, you could try using:

local set = buffer.set

This is essentially the same code, but without the getmetatable call, as the metatable is used implicitly through the __index metamethod. You still call it as set(buffer, j, 255, 255, 255).

On our project we made our own implementation of luaL_checkudata. One option, similar to what you suggested, was to use a wrapper object that holds the type. Since all userdata was assumed to be wrapped in the wrapper, we could use it to get and confirm the type of the userdata. However, no benchmarking was done, and in the end we tested metatables instead.

I would say testing the metatables is slower than wrapping, since luaL_checkudata does a lot of work to get and compare the metatables, whereas with wrapping we have direct access to the type. However, only benchmarking will tell for sure.
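As a rough illustration of the wrapper idea, here is a standalone C sketch in which every object handed to Lua is assumed to sit behind a small wrapper carrying a type id. The names `object_type`, `object_wrapper`, and `unwrap` are hypothetical and are not taken from our project or from the Lua API.

```c
#include <stddef.h>

/* Hypothetical type ids; one per userdata type exposed to Lua. */
typedef enum {
    OBJECT_TYPE_BUFFER = 1,
    OBJECT_TYPE_TIMER  = 2
} object_type;

/* Every userdata is assumed to be one of these wrappers. */
typedef struct {
    object_type type;    /* which kind of object the payload is */
    void       *payload; /* the wrapped object itself */
} object_wrapper;

/* Returns the payload if the wrapper has the expected type, else NULL. */
static void *unwrap(const object_wrapper *wrapper, object_type expected) {
    if (wrapper == NULL || wrapper->type != expected) {
        return NULL;
    }
    return wrapper->payload;
}
```

The check is a single integer comparison, but it is only safe if every userdata the C code can receive really is a wrapper, which is the assumption stated above.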

Rochet2
  • Indeed it works... dunno why I was stuck with the colon syntax :/ Thanks ! – Alkorin Dec 27 '15 at 20:25
  • @Alkorin I did some **very** rough testing and it clearly shows a speed-up when using a wrapper instead of the metatable check. Here is the code I messed around with: https://gist.github.com/Rochet2/c61cc0ecb05ecbea2291 – Rochet2 Dec 27 '15 at 22:21