I am trying to build a symbol table from my grammar (written with ANTLR) in Eclipse. However, I don't know where to begin. I think I read somewhere that you need the parser and lexer generated by ANTLR to do that. Does anyone know of a simple example that would help me understand how it works?
-
Go read any standard text about compiler symbol tables. After that, it's just (a bunch of) sweat. ANTLR offers you nothing specific to help; it is a *parser generator*, and does a really good job there, and then it stops. – Ira Baxter Mar 23 '13 at 20:27
-
Yes, I am currently checking them out. It seems like you can call getTree on the lexer generated by ANTLR and work on the result. – Exia0890 Mar 27 '13 at 21:15
1 Answer
A symbol table is just a versioned map of ids to values. This is one solution, which uses pushing and popping of scopes as the versioning mechanism: push a scope on entry to a scope-defining rule and pop it on exit.
// Scope.java
package net.certiv.metal.symbol;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

import net.certiv.metal.types.ScopeType;
import net.certiv.metal.util.Strings;

public class Scope {

    public final int genId;
    public ScopeType type;
    public Scope enclosingScope;
    protected Map<String, Symbol> symbolMap = new LinkedHashMap<String, Symbol>();

    public Scope(ScopeType type, final int genId, Scope enclosingScope) {
        this.type = type;
        this.genId = genId;
        this.enclosingScope = enclosingScope;
    }

    /**
     * Define a new variable in the current scope.
     * This is the entry point for adding new variables.
     */
    public void define(String name, ArrayList<String> parameters) {
        String params = Strings.asString(parameters, true, ".");
        Symbol symbol = new Symbol(null, name + params, null);
        define(symbol);
    }

    /** Define a symbol in the current scope. */
    private void define(Symbol symbol) {
        symbol.setScope(this);
        symbolMap.put(symbol.name, symbol);
    }

    /**
     * Look up the symbol name in this scope and, if not found,
     * progressively search the enclosing scopes.
     * Return null if not found in any applicable scope.
     */
    private Symbol resolve(String name) {
        Symbol symbol = symbolMap.get(name);
        if (symbol != null) return symbol;
        if (enclosingScope != null) return enclosingScope.resolve(name);
        return null; // not found
    }

    /**
     * Look up a variable starting in the current scope.
     * This is the entry point for lookups.
     */
    public Symbol resolve(String name, ArrayList<String> parameters) {
        String params = Strings.asString(parameters, true, ".");
        return resolve(name + params);
    }

    /** Where to look next for symbols. */
    public Scope enclosingScope() {
        return enclosingScope;
    }

    @Override
    public String toString() {
        return symbolMap.keySet().toString();
    }
}
// ScopeType.java
package net.certiv.metal.types;

public enum ScopeType {
    GLOBAL,
    LOCAL;
}
// Symbol.java
package net.certiv.metal.symbol;

import net.certiv.metal.converter.BaseDescriptor;
import net.certiv.metal.types.ValueType;

public class Symbol {

    protected Scope scope; // the owning scope
    protected BaseDescriptor descriptor;
    protected String name;
    protected ValueType type;

    public Symbol(BaseDescriptor descriptor, String name, ValueType type) {
        this.descriptor = descriptor;
        this.name = name;
        this.type = type;
    }

    public BaseDescriptor getDescriptor() {
        return descriptor;
    }

    public String getName() {
        return name;
    }

    public ValueType getType() {
        return type;
    }

    public void setScope(Scope scope) {
        this.scope = scope;
    }

    public Scope getScope() {
        return scope;
    }

    /** Id of the owning scope; only valid once the symbol has been defined in a scope. */
    public int genId() {
        return scope.genId;
    }

    @Override
    public String toString() {
        if (type != null) return '<' + getName() + ":" + type + '>';
        return getName();
    }
}
// SymbolTable.java
package net.certiv.metal.symbol;

import java.util.ArrayList;
import java.util.Stack;

import net.certiv.metal.types.ScopeType;
import net.certiv.metal.util.Log;

public class SymbolTable {

    protected Stack<Scope> scopeStack;     // scopes currently open, innermost on top
    protected ArrayList<Scope> allScopes;  // every scope created, including popped ones
    protected int genId;

    public SymbolTable() {
        init();
    }

    protected void init() {
        scopeStack = new Stack<>();
        allScopes = new ArrayList<>();
        genId = 0;
        Scope globals = new Scope(ScopeType.GLOBAL, nextGenId(), null);
        scopeStack.push(globals);
        allScopes.add(globals);
    }

    /** Open a new local scope nested in the current one. */
    public Scope pushScope() {
        Scope enclosingScope = scopeStack.peek();
        Scope scope = new Scope(ScopeType.LOCAL, nextGenId(), enclosingScope);
        scopeStack.push(scope);
        allScopes.add(scope);
        return scope;
    }

    /** Close the current scope; it remains reachable through allScopes. */
    public void popScope() {
        scopeStack.pop();
    }

    public Scope currentScope() {
        if (scopeStack.size() > 0) {
            return scopeStack.peek();
        }
        Log.error(this, "Unbalanced scope stack.");
        return allScopes.get(0);
    }

    /**
     * Find a scope by its generation id. This searches allScopes rather than
     * scopeStack so that scopes which have already been popped can still be found.
     */
    public Scope getScope(int genId) {
        for (Scope scope : allScopes) {
            if (scope.genId == genId) return scope;
        }
        return null;
    }

    private int nextGenId() {
        genId++;
        return genId;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        // Note: subList(0, size - 1) leaves out the innermost (current) scope;
        // iterate over scopeStack itself if it should be included.
        for (Scope scope : scopeStack.subList(0, scopeStack.size() - 1)) {
            sb.append(scope.toString());
        }
        return sb.toString();
    }
}
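
To see the push/pop mechanism in isolation, here is a minimal, self-contained sketch. It does not use the classes above or ANTLR: `MiniScope` and `MiniTable` are hypothetical, cut-down stand-ins that map names to plain strings, and the `main` method calls `pushScope`/`popScope` by hand at the points where a tree walker would presumably call them on entering and leaving a scope-defining rule.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

class ScopeDemo {

    /** Hypothetical stand-in for Scope: a name-to-value map with a parent link. */
    static final class MiniScope {
        final MiniScope enclosing;
        final Map<String, String> symbols = new LinkedHashMap<>();

        MiniScope(MiniScope enclosing) {
            this.enclosing = enclosing;
        }

        void define(String name, String value) {
            symbols.put(name, value);
        }

        /** Search this scope first, then walk outward; null if never defined. */
        String resolve(String name) {
            for (MiniScope s = this; s != null; s = s.enclosing) {
                String value = s.symbols.get(name);
                if (value != null) return value;
            }
            return null;
        }
    }

    /** Hypothetical stand-in for SymbolTable: a stack of open scopes. */
    static final class MiniTable {
        private final Deque<MiniScope> stack = new ArrayDeque<>();

        MiniTable() {
            stack.push(new MiniScope(null)); // the global scope
        }

        void pushScope() {
            stack.push(new MiniScope(stack.peek()));
        }

        void popScope() {
            if (stack.size() > 1) stack.pop(); // never pop the global scope
        }

        MiniScope current() {
            return stack.peek();
        }
    }

    public static void main(String[] args) {
        MiniTable table = new MiniTable();
        table.current().define("x", "global x");

        table.pushScope();                                 // entering a block
        table.current().define("x", "local x");            // shadows the global x
        System.out.println(table.current().resolve("x"));  // prints "local x"

        table.popScope();                                  // leaving the block
        System.out.println(table.current().resolve("x"));  // prints "global x"
        System.out.println(table.current().resolve("y"));  // prints "null"
    }
}
```

The inner definition of `x` is visible only while its scope is on the stack; after the pop, lookups fall back to the global definition, which is the "versioning" the answer describes.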

GRosenberg
-
A symbol table is really a map of every identifier instance to a descriptor of the type instance for that identifier. Scopes are a way to organize how symbols in one part of the code map to known symbol declarations in the scope. This answer is right in a simple way: many older languages could be processed by a simple stack of scopes. More modern languages, with namespaces and generics, aren't nearly as simple, although the map idea remains. – Ira Baxter Mar 24 '13 at 19:40
-
Not sure if you read too quickly or what, but the answer is quite correct as presented. The code is given as "one solution" to illustrate the mechanisms of working with a symbol table, which is in direct answer to the OP's question. Any existential qualifications can be implemented on the mapping of ids to values, but that was not an element of the OP's question. Besides, there are languages that are typeless or singly typed, which, judging from the OP's grammar, may be the case here. Namespaces and generics, both handled by forms of versioning, are not implicated in the OP's grammar. – GRosenberg Mar 24 '13 at 20:59
-
I wasn't complaining about your solution as a sketch (note: no ding!), just observing that building symbol tables for real languages will likely require more than this. Maybe his language is simple. He had better decide before he starts implementing, which was my original point about reading about symbol tables in detail before rushing off to do something. – Ira Baxter Mar 24 '13 at 21:50
-
Sorry for the late reply, and thanks to both of you for your answers. @GRosenberg thanks for your code; I am still working through it so that I can understand it and see whether it would work with my grammar. However, I could not work out where a few of the imported classes come from (util.Strings, util.Log, ValueType). – Exia0890 Mar 27 '13 at 21:12
-
Strings.asString applies a rule (whatever you want) to normalize the presentation of the var name. For example, converting a complex variable naming scheme to a simpler property (a.b.c) encoded string representation with appropriate uniqueness. ValueType can be a simple enum identifying the formal type of the var. BaseDescriptor can be a class holding whatever extended data you want to associate with a var. – GRosenberg Mar 29 '13 at 04:44
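
For anyone else missing those classes, here is one possible set of stand-ins, written only from the descriptions in this thread. None of this is the original `net.certiv.metal` code: the enum constants, the `declaredAtLine` field, and the behaviour of `Log.error` are invented for illustration, and the meaning of the boolean flag in `asString` is a guess (here it means "prefix the result with the separator", so that `name + params` yields an `a.b.c` style key).

```java
import java.util.Arrays;
import java.util.List;

class MissingTypesSketch {

    /** Hypothetical stand-in for ValueType: the formal type of a variable. */
    enum ValueType { INT, STRING, BOOLEAN }

    /** Hypothetical stand-in for BaseDescriptor: any extra data tied to a variable. */
    static class BaseDescriptor {
        final int declaredAtLine; // invented example field

        BaseDescriptor(int declaredAtLine) {
            this.declaredAtLine = declaredAtLine;
        }
    }

    /** Hypothetical stand-in for util.Strings. */
    static final class Strings {
        private Strings() {}

        /**
         * Joins the parts with the separator. ASSUMPTION: when leading is true,
         * the result also starts with the separator; an empty or null list
         * gives an empty string, so a plain name stays unchanged.
         */
        static String asString(List<String> parts, boolean leading, String sep) {
            if (parts == null || parts.isEmpty()) return "";
            String joined = String.join(sep, parts);
            return leading ? sep + joined : joined;
        }
    }

    /** Hypothetical stand-in for util.Log: reports the class that raised the error. */
    static final class Log {
        private Log() {}

        static void error(Object source, String message) {
            System.err.println(source.getClass().getSimpleName() + ": " + message);
        }
    }

    public static void main(String[] args) {
        String key = "a" + Strings.asString(Arrays.asList("b", "c"), true, ".");
        System.out.println(key); // prints "a.b.c"
    }
}
```

Because `asString` takes a `List<String>`, the `ArrayList<String>` arguments passed by `Scope.define` and `Scope.resolve` would be accepted unchanged.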