I just had a look in efficient table based Huffman decoding algorithms. A short description can be found here: https://stackoverflow.com/a/29902248/1421332 (using two-level tables). Main goal is to reduce decoding time by spending a limited amount of memory. I think I understood how these algorithms work.
What I don't understand: Every real bit stream is finite. This means at some point (at the end of the stream) it will not have 9bit left (which are needed as index for the LUT in case of a 9bit LUT) but possibly only a few bits for a shorter code. Using these bits as an index will not work. I could check the number of remaining bits every time I get bits from the stream and append enough 0-bits to always get a full index. However this adds a comparison operation to each code look-up which is a waste since it only has an effect for the last few bits of a potentially large stream.
Is there a more efficient way to handle that issue?