7

I need to quickly build a parser for a very simplified version of an HTML-like markup language in Java. In Python, I would use the pyparsing library for this. Is there something similar for Java? Please don't suggest existing HTML parsing libraries: my application is a school assignment that will demonstrate walking a tree of objects and serializing it to text using the visitor pattern, so I'm not thinking in real-world terms here. Basically, all I need is tags, attributes, and text nodes.

tzot
VoY

5 Answers

8

Another good parser generator is ANTLR; that might be what you're looking for.

Jorn
3

This may be overkill for your use, but JavaCC is an excellent, industrial-strength parser generator. I've used it several times; it's reliable and worth learning, particularly if you are going to work with languages and compilers. Here's the description of the program from its website:

Java Compiler Compiler [tm] (JavaCC [tm]) is the most popular parser generator for use with Java [tm] applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other standard capabilities related to parser generation such as tree building (via a tool called JJTree included with JavaCC), actions, debugging, etc.

codefin
3

A quick search for parser generators in Java yields JParsec. I've never used it - but it's inspired by a Haskell library, so by definition it must be good:-)

Torsten Marek
  • Looks very interesting, departing from the code generators... Thanks for the reference. – PhiLho Nov 29 '08 at 19:05
2

I like JParsec (which I just discovered thanks to Torsten) because it doesn't generate code... :-) Perhaps less efficient, but enough for small tasks.
I found a similar library, JTopas.

There is a good list of parsers (generators or not) at Java Source.

PhiLho
1

There are quite a number of choices for string handling in Java. Maybe the very basic java.util.Scanner and java.util.StringTokenizer classes are helpful for you?

Another good choice might be the org.apache.commons.lang.text package: http://commons.apache.org/lang/apidocs/org/apache/commons/lang/text/package-summary.html
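
For what it's worth, here is a rough sketch of how the java.util.Scanner route could look for the tags/attributes/text-nodes case from the question. It is only an illustration under some assumptions: attribute values are always double-quoted, there are no self-closing tags, comments, or entities, and stray `<` characters are silently skipped. All class names (TinyMarkup, Element, Text, NodeVisitor, Serializer) are made up for this example and do not come from any library.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical visitor interface: one method per node type.
interface NodeVisitor {
    void visitElement(Element e);
    void visitText(Text t);
}

interface Node {
    void accept(NodeVisitor v);
}

final class Text implements Node {
    final String value;
    Text(String value) { this.value = value; }
    public void accept(NodeVisitor v) { v.visitText(this); }
}

final class Element implements Node {
    final String name;
    final Map<String, String> attrs = new LinkedHashMap<>(); // keeps attribute order
    final List<Node> children = new ArrayList<>();
    Element(String name) { this.name = name; }
    public void accept(NodeVisitor v) { v.visitElement(this); }
}

// Visitor that writes the tree back out as markup (no escaping is attempted).
final class Serializer implements NodeVisitor {
    final StringBuilder out = new StringBuilder();
    public void visitElement(Element e) {
        out.append('<').append(e.name);
        for (Map.Entry<String, String> a : e.attrs.entrySet()) {
            out.append(' ').append(a.getKey()).append("=\"").append(a.getValue()).append('"');
        }
        out.append('>');
        for (Node child : e.children) child.accept(this);
        out.append("</").append(e.name).append('>');
    }
    public void visitText(Text t) { out.append(t.value); }
}

final class TinyMarkup {
    // Groups: 1 = "/" for a closing tag, 2 = tag name, 3 = raw attributes, 4 = text run.
    private static final Pattern TOKEN =
        Pattern.compile("<(/?)(\\w+)((?:\\s+\\w+=\"[^\"]*\")*)\\s*>|([^<]+)");
    private static final Pattern ATTR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Returns the top-level nodes; throws IllegalArgumentException on bad nesting.
    static List<Node> parse(String input) {
        Element root = new Element("#root"); // artificial container for top-level nodes
        Deque<Element> open = new ArrayDeque<>();
        open.push(root);
        try (Scanner sc = new Scanner(input)) {
            // Horizon 0 means "search without bound"; delimiters are ignored.
            while (sc.findWithinHorizon(TOKEN, 0) != null) {
                MatchResult m = sc.match();
                if (m.group(4) != null) {                 // text node
                    open.peek().children.add(new Text(m.group(4)));
                } else if (m.group(1).isEmpty()) {        // opening tag
                    Element e = new Element(m.group(2));
                    Matcher a = ATTR.matcher(m.group(3));
                    while (a.find()) e.attrs.put(a.group(1), a.group(2));
                    open.peek().children.add(e);
                    open.push(e);
                } else {                                  // closing tag
                    if (open.size() == 1 || !open.peek().name.equals(m.group(2))) {
                        throw new IllegalArgumentException("unexpected </" + m.group(2) + ">");
                    }
                    open.pop();
                }
            }
        }
        if (open.size() > 1) {
            throw new IllegalArgumentException("unclosed <" + open.peek().name + ">");
        }
        return root.children;
    }

    static String serialize(List<Node> nodes) {
        Serializer s = new Serializer();
        for (Node n : nodes) n.accept(s);
        return s.out.toString();
    }

    public static void main(String[] args) {
        List<Node> nodes = parse("<p class=\"intro\">Hello <b>world</b></p>");
        System.out.println(serialize(nodes));
    }
}
```

The stack of open elements is what turns the flat token stream into a tree, and the Serializer shows the visitor part of the assignment. For anything beyond a toy grammar, one of the parser libraries mentioned in the other answers is probably a better fit.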