
I wrote a program in OCaml that, given an infix expression like 1 + 2, outputs the prefix notation: + 1 2

My problem is that I can't find a way to enforce a rule like this: every value, operator, and bracket must be separated from its neighbours by at least one space, so 1+ 1 would be wrong and 1 + 1 would be OK. I would prefer not to use the Camlp4 grammar.

Here is the code:

open Genlex

type tree =
  | Leaf of string
  | Node of tree * string * tree

let my_lexer str =
  let kwds = ["("; ")"; "+"; "-"; "*"; "/"] in
    make_lexer kwds (Stream.of_string str)

let make_tree_from_stream stream =
  let op_parser operator_l higher_perm =
    let rec aux left higher_perm = parser
        [<'Kwd op when List.mem op operator_l; right = higher_perm; s >]
        -> aux (Node (left, op, right)) higher_perm s
      | [< >]
        -> left
    in
      parser [< left = higher_perm; s >]        -> aux left higher_perm s
  in
  let rec high_perm l = op_parser ["*"; "/"] brackets l
  and low_perm l = op_parser ["+"; "-"] high_perm l
  and brackets = parser
    | [< 'Kwd "("; e = low_perm; 'Kwd ")" >]    -> e
    | [< 'Ident n >]                            -> Leaf n
    | [< 'Int n >]                              -> Leaf (string_of_int n)
  in
    low_perm stream

let rec draw_tree = function
  | Leaf n              -> Printf.printf "%s" n
  | Node(fg, r, fd)     -> Printf.printf "(%s " (r);
      draw_tree fg;
      Printf.printf " ";
      draw_tree fd;
      Printf.printf ")"

let () =
  let line = read_line() in
    draw_tree (make_tree_from_stream (my_lexer line)); Printf.printf "\n"

Also, if you have any tips about the code, or if you notice any mistakes in programming style, I would appreciate it if you let me know. Thanks!

axzwl

1 Answer


Genlex provides a ready-made lexer that respects OCaml's lexical conventions and, in particular, ignores the spaces in the positions you mention. I don't think you can implement what you want on top of it (it is not designed as a flexible solution, but as a quick way to get a prototype working).

If you want to keep writing stream parsers, you could write your own lexer for it: define a token type, and lex a char Stream.t into a token Stream.t, which you can then parse as you wish. Otherwise, if you don't want to use Camlp4, you may want to try an LR parser generator, such as menhir (a better ocamlyacc).
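As a rough illustration of the "write your own lexer" route, here is a minimal sketch that treats every whitespace-separated word as exactly one token and rejects anything else, so 1+ 1 fails while 1 + 1 is accepted. The token type, the Lex_error exception, and the helper names (words, token_of_word, tokenize) are all hypothetical, not part of Genlex. It produces a token list rather than a stream to stay independent of the Stream module; assuming the constructor names are kept identical to Genlex's, the result could presumably be fed to the existing parser through Stream.of_list.

```ocaml
(* Hypothetical token type for this sketch; the constructor names mirror
   the ones the question's parser matches on, but this is not Genlex.token. *)
type token =
  | Int of int
  | Ident of string
  | Kwd of string

(* Raised with the offending word when a word is not a single valid token. *)
exception Lex_error of string

(* Split the input on spaces and tabs, dropping the empty pieces produced
   by consecutive separators. *)
let words (s : string) : string list =
  String.split_on_char ' ' s
  |> List.concat_map (String.split_on_char '\t')
  |> List.filter (fun w -> w <> "")

let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'

(* True when every character of [s] satisfies [p]. *)
let all p s =
  let rec go i = i >= String.length s || (p s.[i] && go (i + 1)) in
  go 0

(* A word must be exactly one token: an operator or bracket, an integer
   literal, or an identifier. Glued input such as "1+" matches none of
   these and is rejected. Words are never empty here (see [words]). *)
let token_of_word (w : string) : token =
  match w with
  | "(" | ")" | "+" | "-" | "*" | "/" -> Kwd w
  | _ when all is_digit w -> Int (int_of_string w)
  | _ when is_alpha w.[0] && all (fun c -> is_alpha c || is_digit c) w ->
      Ident w
  | _ -> raise (Lex_error w)

let tokenize (s : string) : token list =
  List.map token_of_word (words s)
```

With this sketch, tokenize "1 + 1" gives [Int 1; Kwd "+"; Int 1], while tokenize "1+ 1" raises Lex_error "1+". The design choice is to let whitespace define the token boundaries first and classify afterwards, which makes the "at least one space" rule fall out for free instead of being checked separately.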

gasche
  • Well, I don't want to use the Camlp4 grammar because it's really scary. I'm really new to OCaml's syntax and philosophy. When you say I could write my own lexer, do you mean I should not use make_lexer and should write my own function instead? Thanks for helping! – axzwl May 06 '13 at 05:13
  • Yes, if you want fine-grained handling of whitespace, you should write your own lexer function. Genlex is only one particular instance of such a function. But have you considered using a parser generator instead of writing a parser by hand? – gasche May 06 '13 at 06:27
  • Yes, I have considered it, but I wanted to get comfortable with Camlp4's streams, so I just wrote my own parser, thinking it would be good training. – axzwl May 06 '13 at 15:59