9

I have an ANTLR grammar and I would like to fuzz my parser.

bcat
Jerome B

3 Answers

1

Are you looking to generate strings from a context-free grammar, i.e. strings that the grammar accepts? This could be a good way to check the grammar for correctness, but keep in mind that the set of accepted strings is most probably infinite. Any really bad bugs should already be apparent in the grammar specification, and hopefully from the check for LL-ness.

I don't know of any tool in the ANTLR world, and a quick Google search on (E)BNF generation did not reveal anything useful either.

It is, however, not very difficult to roll your own generator if performance and the like are not an issue. Prolog springs to mind, and there is plenty of literature available, but if you do not want to leave Java, I suspect homebrewing is the way to go. It's fun anyway.
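As a rough illustration of what such a homebrewed generator could look like, here is a minimal sketch in Python (chosen for brevity; the idea carries over to Java). The `GRAMMAR` table is a hypothetical toy grammar transcribed by hand, not something read from an ANTLR file, and `generate` and `max_depth` are names invented for this example.

```python
import random

# Hypothetical toy grammar, transcribed by hand (not parsed from a .g4 file).
# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols. Anything that is not a key is treated as a terminal.
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term", "-", "expr"], ["term"]],
    "term": [["factor", "*", "term"], ["factor"]],
    "factor": [["NUMBER"], ["(", "expr", ")"]],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Return a random list of tokens derived from `symbol`."""
    if symbol == "NUMBER":
        # Stand-in for a lexer rule: emit a small integer literal.
        return [str(rng.randint(0, 99))]
    if symbol not in GRAMMAR:
        return [symbol]  # literal terminal such as "+" or "("

    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth budget, force the alternative with the fewest
        # nonterminals. This terminates for this toy grammar; it is an
        # assumption that may not hold for an arbitrary grammar.
        alternatives = [
            min(alternatives, key=lambda alt: sum(s in GRAMMAR for s in alt))
        ]

    tokens = []
    for child in rng.choice(alternatives):
        tokens.extend(generate(child, rng, depth + 1, max_depth))
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(" ".join(generate("expr", rng)))
```

The depth limit is what keeps the infinite language from producing unbounded output; a more careful generator would precompute the shortest derivation for each rule instead of relying on the fewest-nonterminals heuristic.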

0

Assume you generated sentences (strings of tokens) from your ANTLR grammar. Why do you think your ANTLR-based parser would object to them?

What you really have to do is to produce not-quite-legal strings. So, what you need is a generator that can produce erroneous strings.
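One way to get such not-quite-legal strings is to take a legal token sequence and damage it slightly. The sketch below is only an assumption about how that might be done: `mutate` and its `vocabulary` parameter are hypothetical names, and the vocabulary shown is a made-up set of tokens for an arithmetic language.

```python
import random


def mutate(tokens, rng, vocabulary=("+", "-", "*", "(", ")", "0")):
    """Return a copy of `tokens` with one small random edit applied.

    The result is usually, but not always, illegal: a mutation can
    by chance produce another valid sentence.
    """
    mutated = list(tokens)
    operation = rng.choice(["delete", "insert", "replace", "swap"])

    if operation == "insert" or not mutated:
        # Insert a random token at a random position (also the only
        # possible edit when the input is empty).
        mutated.insert(rng.randint(0, len(mutated)), rng.choice(vocabulary))
    elif operation == "delete":
        del mutated[rng.randrange(len(mutated))]
    elif operation == "replace":
        mutated[rng.randrange(len(mutated))] = rng.choice(vocabulary)
    elif len(mutated) >= 2:
        # Swap two adjacent tokens; a single-token input is left as is.
        i = rng.randrange(len(mutated) - 1)
        mutated[i], mutated[i + 1] = mutated[i + 1], mutated[i]
    return mutated


if __name__ == "__main__":
    rng = random.Random(1)
    legal = ["(", "1", "+", "2", ")", "*", "3"]
    for _ in range(5):
        print(" ".join(mutate(legal, rng)))
```

Feeding both the legal sequence and its mutations to the parser exercises the accept path and the error-recovery path from the same seed input.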

Given that ANTLR generates a set of procedures from your ANTLR grammar, I think it would be difficult to produce a sentence generator using the generated parser. What you need is an explicit model of the grammar. And this is already available to you: the ANTLR input grammar.

An additional complication I see is generation of legal tokens from the regexes that make up the token definitions. Again, you'd need to process the ANTLR input to do this.
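To make the token side concrete, here is a small sketch that produces lexemes from a hand-written model of two lexer rules. Everything in it is an assumption for illustration: `TOKEN_RULES`, `TOKEN_PATTERNS` and `random_lexeme` are hypothetical names, the rules are not extracted from any ANTLR grammar, and the repetition caps (8 and 6) are arbitrary.

```python
import random
import re
import string

# Hypothetical model of two lexer rules, written by hand. Each rule is a
# sequence of (allowed characters, minimum repeats, maximum repeats).
# Unbounded repetition such as `*` is capped at an arbitrary maximum.
TOKEN_RULES = {
    "ID": [
        (string.ascii_letters + "_", 1, 1),
        (string.ascii_letters + string.digits + "_", 0, 8),
    ],
    "INT": [(string.digits, 1, 6)],
}

# The regular expressions the model above is meant to mirror, kept only
# so that generated lexemes can be checked against them.
TOKEN_PATTERNS = {
    "ID": re.compile(r"[A-Za-z_][A-Za-z0-9_]*"),
    "INT": re.compile(r"[0-9]+"),
}


def random_lexeme(rule_name, rng):
    """Build one random string that matches the named lexer rule."""
    parts = []
    for chars, low, high in TOKEN_RULES[rule_name]:
        count = rng.randint(low, high)
        parts.append("".join(rng.choice(chars) for _ in range(count)))
    return "".join(parts)


if __name__ == "__main__":
    rng = random.Random(2)
    for name in TOKEN_RULES:
        lexeme = random_lexeme(name, rng)
        print(name, lexeme, bool(TOKEN_PATTERNS[name].fullmatch(lexeme)))
```

A real tool would derive the character sets and repetition bounds from the lexer rules in the grammar file rather than from a hand-written table.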

Processing both of these seems technically straightforward. The best engine to use as a foundation is likely the ANTLR front end, which obviously parses ANTLR specs, and so must hold some representation of the ANTLR input.

Ira Baxter
  • I don't believe you are correct. As an ANTLR user, I'm not interested in testing ANTLR's ability to use a grammar to parse an input sequence; its abilities and behaviour are ~well documented. No, the thing I'm interested in testing is my visitor-listener's behaviour for arbitrary object graphs, graphs whose source documents I couldn't think to come up with myself. I would very much like a tool that generates random strings for a supplied grammar. Such a tool would allow for elegant generation of end-to-end tests. – Groostav Jan 30 '16 at 00:52
  • 1
    @Groostav: everybody understands that the problem was to generate random strings from the language, or from close variants. The problem is, how are you going to generate those strings? My point is that you cannot easily do it from generated ANTLR parsers; that's like trying to interpret arbitrary code, which is Turing-hard (maybe it's not, but nobody is making the argument that this is easy to do). The only artifact you have to use to generate those strings is the explicit ANTLR grammar itself, including the lexical aspects. ANTLR itself doesn't appear to provide you any help. What will? – Ira Baxter Jan 30 '16 at 01:44
0

I was looking for something similar and found GramTest, which seems suitable, although it takes a BNF grammar as input rather than an ANTLR grammar.

This tool allows you to generate test cases based on arbitrary user defined grammars. The input grammar is given in BNF notation. Potential applications include fuzzing and automated testing.

For more background info they link to the following blogposts:

spoorcc