I'm writing a simple lexer for a general-purpose programming language, and one of the token types is a 'keyword', which covers predefined control-flow tokens such as 'if', 'else', 'while' and 'return'.

I want to know the fastest way to check if some keyword is inside my list using x86 Standard C.

My idea was to use a jump table, but comparing C strings is problematic since C strings are arrays of char.

Matheus Lacerda
h0m3
  • Create an enum? – Steephen Jun 05 '18 at 21:57
  • @h0m3 There is no such thing as "x86 Standard C (gcc)". There is standard C, and there is GCC, which lets you compile standard C code for x86 architectures. To answer your question, the easiest way is to keep a sorted static array of keywords and do a binary search. For something as trivial as this, that's the easiest and fastest way. – Unmanned Player Jun 05 '18 at 22:33
  • @UnmannedPlayer Yes, that is what I meant: (x86) (Standard C) (gcc). Even when you stick to the standard, the compiler and architecture can influence performance. – h0m3 Jun 05 '18 at 22:50
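
The sorted-array suggestion in the comments could look roughly like the sketch below. It assumes the four keywords from the question and NUL-terminated input; the names `keywords`, `compare_keyword` and `is_keyword_bsearch` are made up for illustration. Only the standard library's `bsearch` and `strcmp` are used.

```c
#include <stdlib.h>
#include <string.h>

/* Keywords in strcmp() order; bsearch() requires a sorted array. */
static const char *const keywords[] = { "else", "if", "return", "while" };

/* bsearch() passes the key first and a pointer to an array element second. */
static int compare_keyword(const void *key, const void *element)
{
    return strcmp((const char *)key, *(const char *const *)element);
}

/* Returns 1 if the NUL-terminated word is a keyword, 0 otherwise. */
int is_keyword_bsearch(const char *word)
{
    return bsearch(word, keywords,
                   sizeof keywords / sizeof keywords[0],
                   sizeof keywords[0], compare_keyword) != NULL;
}
```

Adding a keyword means inserting it at its sorted position; an out-of-order table would make the search miss entries silently.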

2 Answers

The fastest way is to hand-build a trie, or equivalently a state machine. Flex (or any other lex variant) would do that for you.
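
As a rough illustration, a hand-built version for the four keywords in the question might look like the sketch below. Because those keywords all start with different letters, the trie collapses to a single `switch` on the first character followed by a check of the remaining suffix; a larger keyword set would need nested levels. The enum, the helper `rest_is` and the function `classify_keyword` are hypothetical names, and the token is assumed to arrive as a pointer plus length, as it typically would inside a lexer.

```c
#include <stddef.h>
#include <string.h>

enum keyword { KW_NONE, KW_ELSE, KW_IF, KW_RETURN, KW_WHILE };

/* True if the remaining len characters at s are exactly the suffix rest. */
static int rest_is(const char *s, size_t len, const char *rest)
{
    size_t n = strlen(rest);
    return len == n && memcmp(s, rest, n) == 0;
}

/* Classifies a token of the given length; it need not be NUL-terminated. */
enum keyword classify_keyword(const char *s, size_t len)
{
    if (len == 0)
        return KW_NONE;

    switch (s[0]) {  /* first trie level: branch on the first character */
    case 'e': return rest_is(s + 1, len - 1, "lse")   ? KW_ELSE   : KW_NONE;
    case 'i': return rest_is(s + 1, len - 1, "f")     ? KW_IF     : KW_NONE;
    case 'r': return rest_is(s + 1, len - 1, "eturn") ? KW_RETURN : KW_NONE;
    case 'w': return rest_is(s + 1, len - 1, "hile")  ? KW_WHILE  : KW_NONE;
    default:  return KW_NONE;
    }
}
```

Returning an enum value rather than a boolean lets the lexer emit the specific keyword token without a second lookup.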

rici

In theory, a hash table provides O(1) lookup. However, I would implement a static lookup table: assuming the number of tokens you are searching for is small, a linear search of the table shouldn't prove too costly.
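
A minimal sketch of such a table, assuming the four keywords from the question and NUL-terminated input (`keyword_table` and `is_keyword_linear` are illustrative names):

```c
#include <stddef.h>
#include <string.h>

/* Static lookup table; order does not matter for a linear scan. */
static const char *const keyword_table[] = { "if", "else", "while", "return" };

/* Returns 1 if the NUL-terminated word is in the table, 0 otherwise. */
int is_keyword_linear(const char *word)
{
    size_t count = sizeof keyword_table / sizeof keyword_table[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(word, keyword_table[i]) == 0)
            return 1;
    }
    return 0;
}
```

Since the scan stops at the first match, putting the most frequent keywords first may shave a few comparisons, though that is worth measuring rather than assuming.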