
I have created a custom DSL in Xtext with LSP support. The grammar looks something like this:

grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals

generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

Model:
    (intParameters+=IntParameter ("," intParameters+=IntParameter)*)?
    stringParameters+=StringParameter*
    elements+=Element*
    otherElements+=AnotherElement*
    ;

Element returns Element:
    'Element'
    '{'
        'name' name=StringValue
    '}';
    
ParameterElement returns ParameterElement:
    {ParameterElement} ref=([StringParameter|STRING])?
;
    
AnotherElement returns AnotherElement:
    'AnotherElement'
    '{'
        'name' name=StringValue
        'value' value=[Element]
    '}';

StringValue:
    {StringValue} ref=([StringParameter|STRING])?("+" STRING)? | value=ID
;

StringElement:
     Element | StringParameter
;

StringParameter returns StringParameter:
    {StringParameter}
    'StringParameter'
    name=ID
    '{'
        ('value' value=STRING)?
    '}';

IntValue:
    ref=[IntParameter] | value=DECINT
;

IntParameter returns IntParameter:
    {IntParameter}
    'IntParameter'
    name=ID
    '{'
        ('value' value=DECINT)?
    '}';
    
terminal fragment DIGIT: '0'..'9';
terminal DECINT: '0' | ('1'..'9' DIGIT*) | ('-''0'..'9' DIGIT*) ;

I was able to create a VS Code extension that gives me code completion and keyword highlighting. I have also implemented some basic validation in Xtext, which works in VS Code as well.

Now my question is: how can I parse my DSL file from the extension? I have access to the current file and can print its text:

const document = vscode.window.activeTextEditor?.document;
console.log(document?.getText());

For XML files, I saw examples that used

xml2js.parseStringPromise(document.getText(), {
  mergeAttrs: true,
  explicitArray: false
})

How can I do something like this for my custom language? Since I used Xtext with LSP support, I should be able to use the underlying Xtext parser, right? I am able to get all the symbols in the file with

let symbols = await vscode.commands.executeCommand('vscode.executeDocumentSymbolProvider', uri);
console.log(symbols);

I don't want just the symbols, though; I want the entire text to be parsed.

Edit: I found https://github.com/tunnelvisionlabs/antlr4ts, which generates parsers in TypeScript, which is exactly what I wanted! The only problem is that it needs a .g4 grammar, but Xtext generates a .g file (I guess this is ANTLR v3). There is another nice tool, https://github.com/kaby76/Domemtech.Trash, which converts .g2/.g3 grammars to .g4.

But now I get these errors:

line 1556:29 extraneous input '(' expecting {TOKEN_REF, LEXER_CHAR_SET, STRING_LITERAL}
line 1556:62 extraneous input '(' expecting {TOKEN_REF, LEXER_CHAR_SET, STRING_LITERAL}
line 1556:74 mismatched input ')' expecting SEMI
line 1560:25 extraneous input '(' expecting {TOKEN_REF, LEXER_CHAR_SET, STRING_LITERAL}
line 1560:36 mismatched input ')' expecting SEMI
Error in parse of /home/parser/InternalMyDsl.g4

The rules around those lines look like this:

1548 fragment RULE_DIGIT : '0'..'9';
1549
1550 RULE_DECINT : ('0'|'1'..'9' RULE_DIGIT*|'-' '0'..'9' RULE_DIGIT*);
1551
1552 RULE_ID : '^'? ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
1553
1554 RULE_INT : ('0'..'9')+;
1555 
1556 RULE_STRING : ('"' ('\\' .|~(('\\'|'"')))* '"'|'\'' ('\\' .|~(('\\'|'\'')))* '\'');
1557
1558 RULE_ML_COMMENT : '/*' ( . ) * ?'*/';
1559 
1560 RULE_SL_COMMENT : '//' ~(('\n'|'\r'))* ('\r'? '\n')?;
1561 
1562 RULE_WS : (' '|'\t'|'\r'|'\n')+;
1563
1564 RULE_ANY_OTHER : .;

The file is auto-generated by Xtext, so I am not sure whether I should be modifying it.

harsh
  • It seems that LSP does not expose the parser. Will I have to create a parser in typescript from scratch myself? – harsh Feb 22 '22 at 13:46
  • You may want to look at what LSP calls a command: https://microsoft.github.io/language-server-protocol/specifications/specification-current/#command. Maybe you can give some more hints about your actual use case. – Christian Dietrich Feb 22 '22 at 14:24
  • My grammar contains fields that point to mesh files, which I want to visualize using three.js. To extract these fields, I want to parse the model. – harsh Feb 22 '22 at 14:45
  • Based on the answer here https://stackoverflow.com/questions/38570300/simple-xtext-example-generates-grammar-that-antlr4-doesnt-like-whos-to-blame, I just removed the extra parentheses, and I was able to use the .g file generated by Xtext directly with antlr4ts. No need to convert to .g4. – harsh Feb 22 '22 at 18:09
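
For the LSP command route suggested in the comments, the client only has to send a `workspace/executeCommand` request and let the server (which already has the Xtext parser) do the extraction. The following is a minimal sketch of building that request, not something the thread confirms: the command id `mydsl.listMeshFiles` and the example URI are made up, and the Xtext language server would have to register and handle such a command itself.

```typescript
// Shape of the params for `workspace/executeCommand` as described in the LSP spec.
interface ExecuteCommandParams {
  command: string;
  arguments?: unknown[];
}

// Hypothetical command id; the language server must register and implement it.
const LIST_MESH_FILES = "mydsl.listMeshFiles";

// Build the request payload for one DSL document, identified by its URI.
function meshFilesRequest(documentUri: string): ExecuteCommandParams {
  return { command: LIST_MESH_FILES, arguments: [documentUri] };
}

// Example URI is illustrative only.
const params = meshFilesRequest("file:///tmp/example.mydsl");
console.log(JSON.stringify(params));
```

In the extension, assuming a `LanguageClient` instance from `vscode-languageclient` named `client`, the payload could be sent with `client.sendRequest("workspace/executeCommand", params)`, and the server's reply would carry whatever the command handler returns.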

1 Answer

I found https://github.com/tunnelvisionlabs/antlr4ts, which generates parsers in TypeScript, which is exactly what I wanted! When I ran the command

antlr4ts -visitor src/InternalKinematics.g

I got some errors:

line 1556:29 extraneous input '(' expecting {TOKEN_REF, LEXER_CHAR_SET, STRING_LITERAL}
line 1556:62 extraneous input '(' expecting {TOKEN_REF, LEXER_CHAR_SET, STRING_LITERAL}
line 1556:74 mismatched input ')' expecting SEMI
line 1560:25 extraneous input '(' expecting {TOKEN_REF, LEXER_CHAR_SET, STRING_LITERAL}
line 1560:36 mismatched input ')' expecting SEMI
Error in parse of /home/parser/InternalMyDsl.g4

Based on the answer to "Simple Xtext example generates grammar that Antlr4 doesn't like - who's to blame?", I just removed the extra parentheses, and it worked!
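
The reported columns (1556:29, 1556:62 and 1560:25) line up with the second `(` after each `~` in RULE_STRING and RULE_SL_COMMENT, so the parentheses in question appear to be the doubled ones around negated sets. Because the grammar is regenerated by Xtext, the edit can be scripted instead of done by hand. This is a rough sketch, assuming the negated set never contains a parenthesis itself (true for the two rules quoted in the question); the function name is made up.

```typescript
// Collapse `~(( ... ))` into `~( ... )`, the doubled parentheses that
// antlr4ts complains about after the negation operator.
// Assumption: the negated set contains no parentheses of its own.
function collapseNegatedSetParens(grammar: string): string {
  return grammar.replace(/~\(\(([^()]*)\)\)/g, "~($1)");
}

// String.raw keeps the backslashes exactly as they appear in the grammar file.
const before = String.raw`RULE_SL_COMMENT : '//' ~(('\n'|'\r'))* ('\r'? '\n')?;`;
const after = collapseNegatedSetParens(before);
console.log(after);
```

Running this over the generated grammar text before calling `antlr4ts` would avoid re-editing the file after every Xtext build.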

harsh