Statically Typing a Scripting Language in Java

Question

I'm building a scripting language in Java for a game, and I'm currently working on the parser. The language is to be utilized by players/modders/myself to create custom spells and effects. However, I'm having difficulty imagining how to smoothly implement static typing in the current system (a painful necessity driven by performance needs). I don't care so much if compilation is fast, but actual execution needs to be as fast as I can get it (within reason, at least. I'm hoping to get this done pretty soon.)

So the parser has next() and peek() methods to iterate through the stream of tokens. It's currently built of a hierarchy of methods that call each other in a fashion that preserves operator precedence (the "bottom-most" method returning a constant, variable, etc). Each method returns an IResolve that has a generic type <T> it "resolves" to. For example, here's a method that handles "or" expressions, with "and" binding more tightly:

protected final IResolve checkGrammar_Or() throws ParseException
{
    IResolve left = checkGrammar_And();

    if (left == null)
        return null;

    if (peek().type != TokenType.IDENTIFIER || !"or".equals((String)peek().value))
        return left;

    next();

    IResolve right = checkGrammar_Or();

    if (right == null)
        throwExpressionException();

    return new BinaryOperation(left, right, new LogicOr());
}

The problem is when I need to implement a function that depends on the type. As you probably noticed, the generic type isn't being specified by the parser, and is part of the design problem. In this function, I was hoping to do something like the following (though this wouldn't work due to generic types' erasure...)

protected final IResolve checkGrammar_Comparison() throws ParseException
{
    IResolve left = checkGrammar_Term();

    if (left == null)
        return null;

    IBinaryOperationType op;

    switch (peek().type)
    {
    default:
        return left;

    case LOGIC_LT:

        //This ain't gonna work because of erasure
        if (left instanceof IResolve<Double>)
            op = new LogicLessThanDouble();

        break;

    //And the same for these
    case LOGIC_LT_OR_EQUAL:
    case LOGIC_GT:
    case LOGIC_GT_OR_EQUAL:
    }

    next();

    IResolve right = checkGrammar_Comparison();

    if (right == null)
        throwExpressionException();

    return new BinaryOperation(left, right, op);
}

The problem spot, where I'm wishing I could make the connection, is in the switch statement. I'm already certain I'll need to make IResolve non-generic and give it a "getType()" method that returns an int or something, especially if I want to support user-defined classes in the future.

The question is:

What's the best way to achieve static typing given my current structure and the desire for mixed inheritance (user-defined classes and interfaces, like Java and C#)? If there is no good way, how can I alter or even rebuild my structure to achieve it?

Note: I don't claim to have any idea what I've gotten myself into, constructive criticism is more than welcome. If I need to clarify anything, let me know!

Another note: I know you're thinking "Why static typing?", and normally I'd agree with you-- however, the game world is composed of voxels (it's a Minecraft mod to be precise) and working with them needs to be fast. Imagine a script that's an O(n^2) algorithm iterating over 100 blocks twenty times a second, for 30+ players on a cheap server that's already barely squeaking by... or, a single, massive explosion affecting thousands of blocks, inevitably causing a horrendous lag spike. Hence, backend type checking or any form of duck-typing ain't gonna cut it (though I'm desperately aching for it atm.) The low level benefits are a necessity in this particular case, painful though it is.

Good lord, it feels like rewriting Java itself. Why would you do this to yourself? I could see wanting a game DSL, but redoing all that object-oriented stuff with interfaces, classes, etc.? I don't see why. — duffymo, Jun 26 '12 at 04:37
Well, the decision was motivated by the desire for security (the existing user base consists of hundreds of thousands of kids), flexibility, ease of use, and uh, me having fun. I looked into just using Javascript, but there didn't seem to be a way to limit the script's ability... The object-oriented-ness could be spared, really, though it'd be nice. The static typing is the one thing giving me grief, and I need it for efficiency's sake (the gameworld works with voxels.) — Philip Guin, Jun 26 '12 at 04:43
I think static typing is overrated. JavaScript does just fine without it; so do other languages. I wouldn't want to invent a new language, unless you're having fun and just want to give it a try. — duffymo, Jun 26 '12 at 04:47
Any reason you are hand writing a parser, rather than using something like Antlr? — xxpor, Jun 26 '12 at 04:55
Ehh, several. The game is a popular Minecraft mod, so including libraries and such makes installation difficult (the clientele consists of children, and installation == literally altering jar files, though legally in this case; hard enough for grownups to do). Also, I had a lot of random questions I felt couldn't be answered without spending loads of time sifting through documentation and messing with Antlr and such, and I already had a *dynamically* typed interpreted language in something other than Java, so I just thought "why not port that over?" And voilà, problem formed. — Philip Guin, Jun 26 '12 at 05:06
I think you are going the wrong way. Dynamic typing and its benefits are discussed more and more often. We have Groovy and other lightweight scripting languages tightly integrated with Java and IDEs. Maybe it's better to take your time to build some toolkit for your game engine rather than reinvent things the rest of the world already has in place? — Viktor Stolbin, Jun 26 '12 at 05:09
@ViktorStolbin No one's gonna be building entire programs in the scripting language, only algorithms iterating through potentially thousands of voxels per twentieth of a second. Efficiency is of utmost importance in this particular case, though I'd normally agree with you. — Philip Guin, Jun 26 '12 at 05:27
I'd suggest choosing an existing scripting language that has a mature IDE; eventually you will need a debugger, and syntactic sugar to make your iteration algorithms painless. I don't want to be annoying, but take a look at [Groovy](http://groovy.codehaus.org/Looping) — Viktor Stolbin, Jun 26 '12 at 05:40
Parsing has nothing to do with typing! Do not ever mix the compilation stages, it is a very bad idea. — SK-logic, Jun 29 '12 at 07:59
As for implementing the typing: such a trivial language would be just fine with a simple *type propagation*. I cannot think of a simpler way to do the typing. Any decent dynamic type system would be more complicated. — SK-logic, Jun 29 '12 at 08:02

score 1 · Answer 1 · answered Jun 26 '12 at 05:25

You can get the best of both worlds by adding a method Class<T> getType() to IResolve; its implementers should simply return the appropriate Class object. (If the implementers themselves are generic, you need to get a reference to that object in the constructor or something.)

You can then do left.getType().equals(Double.class), etc.
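
A minimal sketch of that idea, under some assumptions: the `IResolve` below is a stand-in for the interface in the question (its real methods aren't shown there), and `Constant`, `TypeTokenDemo`, and `lessThanOperatorFor` are hypothetical names used only for illustration. The point is just that a `Class<T>` token passed to the constructor is still available at runtime, where `instanceof IResolve<Double>` is not.

```java
// Hypothetical stand-in for the question's IResolve interface.
interface IResolve<T> {
    T resolve();

    // Runtime type token; unlike the generic parameter, it survives erasure.
    Class<T> getType();
}

// Hypothetical generic implementer: it receives the Class object in its constructor.
final class Constant<T> implements IResolve<T> {
    private final T value;
    private final Class<T> type;

    Constant(T value, Class<T> type) {
        this.value = value;
        this.type = type;
    }

    @Override
    public T resolve() {
        return value;
    }

    @Override
    public Class<T> getType() {
        return type;
    }
}

final class TypeTokenDemo {
    // Chooses an operator (represented here only by its name) from the operand's type token.
    static String lessThanOperatorFor(IResolve<?> left) {
        if (left.getType().equals(Double.class)) {
            return "LogicLessThanDouble";
        }
        throw new IllegalArgumentException(
                "'<' is not defined for " + left.getType().getSimpleName());
    }

    public static void main(String[] args) {
        IResolve<Double> d = new Constant<>(1.5, Double.class);
        System.out.println(lessThanOperatorFor(d));
    }
}
```

In the question's checkGrammar_Comparison(), the same check would replace the `instanceof IResolve<Double>` line, with the string swapped for the real operator object.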

This is entirely separate from the question of whether you should build your own parser with static typing, which is very much worth asking.

Taymon

I appreciate the answer :) It wouldn't support an inheritance system (unless you could instantiate classes via reflection, which I'm pretty sure Java disallows), but it'll work if that solution happens to be too nutty to implement. I understand the cynicism, but I gotta have the static typing. As for parser generation, I haven't been able to find one that supports aforementioned static typing, along with no external libraries and such (unless I don't know what I'm looking for, which is always a possibility) – Philip Guin Jun 26 '12 at 06:10
You wouldn't normally put the type checker in the parser. The parser generates an abstract syntax tree; you then check the AST for type correctness with a type-checking algorithm. (The only such algorithm I'm familiar with is [Hindley–Milner](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner); I don't know what Java uses.) – Taymon Jun 26 '12 at 18:17
Also, just to be clear, by "implementers" I meant "your classes that implement the `IResolve` interface". And Java [does, in fact, allow you to instantiate classes via reflection](http://docs.oracle.com/javase/6/docs/api/java/lang/Class.html#newInstance%28%29). – Taymon Jun 26 '12 at 18:21
Er, let me rephrase: Java allows you to instantiate *instances* of classes via reflection, but not create actual classes themselves, as far as I've been able to discern. – Philip Guin Jun 26 '12 at 19:24

score 0 · Accepted Answer · answered Jul 05 '12 at 21:11

The solution I'm going with, as some have suggested in the comments, was to separate parsing and typing into separate phases, along with using an enum to represent type as I originally felt I should.
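
Roughly, the shape of it looks like the sketch below. This is not my actual code: every name here (`ScriptType`, `Expr`, `Binary`, `TypeChecker`, and so on) is made up for illustration, and it only covers two literal kinds and two operators. The parser would build the untyped tree, and a second pass would walk it, propagating types upward from the leaves and rejecting mismatches before anything runs.

```java
// Hypothetical enum of the language's built-in types.
enum ScriptType { NUMBER, BOOLEAN }

// Untyped AST nodes, as a parser might produce them.
interface Expr {}

final class NumberLiteral implements Expr {
    final double value;
    NumberLiteral(double value) { this.value = value; }
}

final class BoolLiteral implements Expr {
    final boolean value;
    BoolLiteral(boolean value) { this.value = value; }
}

final class Binary implements Expr {
    final String op;
    final Expr left;
    final Expr right;
    Binary(String op, Expr left, Expr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
}

final class TypeCheckException extends RuntimeException {
    TypeCheckException(String message) { super(message); }
}

// Separate phase: computes the type of each expression from its children.
final class TypeChecker {
    static ScriptType check(Expr e) {
        if (e instanceof NumberLiteral) return ScriptType.NUMBER;
        if (e instanceof BoolLiteral) return ScriptType.BOOLEAN;
        if (e instanceof Binary) {
            Binary b = (Binary) e;
            ScriptType l = check(b.left);
            ScriptType r = check(b.right);
            switch (b.op) {
                case "<":   // numeric comparison yields a boolean
                    require(b.op, ScriptType.NUMBER, l, r);
                    return ScriptType.BOOLEAN;
                case "or":  // logical or needs boolean operands
                    require(b.op, ScriptType.BOOLEAN, l, r);
                    return ScriptType.BOOLEAN;
                default:
                    throw new TypeCheckException("unknown operator: " + b.op);
            }
        }
        throw new TypeCheckException("unknown node: " + e.getClass().getSimpleName());
    }

    private static void require(String op, ScriptType expected, ScriptType l, ScriptType r) {
        if (l != expected || r != expected) {
            throw new TypeCheckException(
                    "'" + op + "' expects " + expected + " operands but got " + l + " and " + r);
        }
    }
}

final class TypeCheckDemo {
    public static void main(String[] args) {
        Expr lt = new Binary("<", new NumberLiteral(1), new NumberLiteral(2));
        Expr or = new Binary("or", lt, new BoolLiteral(false));
        System.out.println(TypeChecker.check(or));
    }
}
```

Once the checker knows both operands of "<" are NUMBER, a later step can pick a specialised operator such as LogicLessThanDouble with no type tests left at execution time.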

While I appreciate Taymon's answer, I can't use it if I hope to support user defined classes in the future.

If someone has a better solution, I'd be more than happy to accept it!
