I'm putting together a lexer/parser for a simple programming language using a Prolog DCG that builds up the list of tokens/syntax tree using DCG arguments, e.g.

symbol(semicolon) --> ";".
symbol(if) --> "if".

and then the syntax tree is built using those arguments to the DCG rules.

However, I've hit a bump in that when it gets to parsing variables and numbers (only integers in this language), I need the DCG arguments to be more dynamic, e.g.

symbol(number(X)) --> X, {integer(N)}.

Essentially, I need the DCG argument to be generated from what it's actually parsing. Is there a way to do this? If not, what would be a good workaround?

EDIT: As a specific example, I've got the rule

symbol(num(N)) --> {number_codes(N,C)}, C.

and I need the output N=7 when querying phrase(symbol(num(N)),"7").

  • It is definitely possible; have you seen [dcg/basics](http://www.swi-prolog.org/pldoc/doc/_SWI_/library/dcg/basics.pl)? (Also, the source code for it [is pretty instructive too](http://www.swi-prolog.org/pldoc/doc/_SWI_/library/dcg/basics.pl?show=src#integer/3).) – Daniel Lyons Nov 07 '17 at 15:33
  • You're close. `symbol(number(X)) --> [X], {integer(X)}.` Although you want to be careful since `number/1` is a standard Prolog predicate. Perhaps pick something else. – lurker Nov 07 '17 at 15:45
  • I'm not so sure though; the specific example I'm facing at the moment is: `symbol(num(N)) --> {number_codes(N,C)}, C`. I essentially need the output `N=7` when inputting `phrase(symbol(num(N)),"7")`. – user2396812 Nov 07 '17 at 16:02

1 Answer

I see three problems here.

  1. phrase/2 operates on lists of codes. Since version 7, SWI-Prolog reads double-quoted text as a native string by default, and a string cannot be passed to phrase/2 directly. So you must first convert to a code list, which is slightly inconvenient:

    atom_codes("if", Codes), phrase(symbol(X), Codes)
    
  2. In general, you want to peel off something from the input and then hand it to some pure-Prolog predicate to do something. In other words, something like this:

    symbol(num(N)) --> [C], { number_codes(N, [C]) }.
    
    ?- atom_codes(9, X), phrase(symbol(S), X).
    X = [57],
    S = num(9).
    

    This will, of course, only work for single-digit numbers, which probably isn't what you want, so...

  3. You should probably use the code from dcg/basics.pl like this:

    :- use_module(library(dcg/basics)).
    
    symbol(num(N)) --> integer(N).
    
    ?- atom_codes(973, X), phrase(symbol(S), X).
    X = [57, 55, 51],
    S = num(973).
    

    Or you could copy the relevant rules from its source code. You'll probably notice that all the DCG rules in there either start with a call to another DCG rule or consume some of the input and then do something else. You probably don't want to generate something and then look for it in the input, which is what `{number_codes(N,C)}, C` attempts: with both N and C unbound, number_codes/2 has nothing to convert.
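If you would rather not depend on the library, the same "consume first, convert afterwards" pattern can be written by hand. This is only a minimal sketch, not the library's actual implementation: `digits//1`, `digits_rest//1` and `digit//1` are hypothetical helper names, and it handles unsigned decimal integers only.

```prolog
% Consume one or more digit codes, then convert them to an integer.
symbol(num(N)) --> digits(Ds), { number_codes(N, Ds) }.

% digits//1 (hypothetical helper): at least one digit, as many as possible.
digits([D|Ds]) --> digit(D), digits_rest(Ds).

% digits_rest//1 (hypothetical helper): zero or more further digits.
% The consuming clause comes first, so the longest match is tried first.
digits_rest([D|Ds]) --> digit(D), digits_rest(Ds).
digits_rest([]) --> [].

% digit//1 (hypothetical helper): a single code in the range 0'0..0'9.
digit(D) --> [D], { 0'0 =< D, D =< 0'9 }.
```

With these rules, `atom_codes(973, Cs), phrase(symbol(S), Cs)` should bind S to `num(973)`. The conversion happens only after the codes have been taken from the input, so number_codes/2 is always called with its second argument bound.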

Daniel Lyons