
I am working on a NetBeans module to parse HTML markup by following this tutorial.

For the keyword `html`, I wrote the following JavaCC file:

options {
  JAVA_UNICODE_ESCAPE = true;
  ERROR_REPORTING = false;
  STATIC = false;
  COMMON_TOKEN_ACTION = false;
  TOKEN_FACTORY = "Token";
  JDK_VERSION = "1.8";
  BUILD_PARSER = false;
}

PARSER_BEGIN(HTMLParser)

package org.html.jcclexer;

import java.io.*;

/**
 * Simple grammar to tokenize the HTML keyword
 * (adapted from the JavaCC Java 1.5 grammar by Sreenivasa Viswanadha)
 */

public class HTMLParser {}

PARSER_END(HTMLParser)

/* WHITE SPACE */

TOKEN :
{
  < WHITESPACE:
  " "
| "\t"
| "\n"
| "\r"
| "\f">
}

TOKEN   : { < HTML : "html" > }

It colors my `html` word perfectly, but then it gives this error:

java.lang.IllegalArgumentException: Token id must not be null. Fix lexer org.html.lexer.HTMLexer@1e6bbd25

test.html contains only the following word:

html

I am not sure whether the error is due to my .jj file or something else.

Volatil3
  • I don't think it has anything to do with the JavaCC end of things. The tokens that come out of JavaCC don't have an `id` attribute. I think the problem is more likely with the token objects of type `org.netbeans.api.lexer.Token`, since those tokens do have an `id`. One thing that would help you answer the question is to take a look at the stack trace for that exception. If you can get that, you can see which method is getting a bad argument and where it is getting it from. Take a look at the `getToken` method in your extension of `LanguageHierarchy`. I bet it can return `null`. – Theodore Norvell Mar 28 '15 at 20:17
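To illustrate the comment above: the token-id mapping in a NetBeans `LanguageHierarchy`/`Lexer` implementation typically switches on the JavaCC-generated token kind, and any kind without a case can fall through to `null`, producing exactly the "Token id must not be null" exception. The sketch below is a hypothetical, standalone simplification (class and method names, and the kind constants, are assumptions, not code from the question); the real method would return `org.netbeans.api.lexer.TokenId` objects rather than strings.

```java
// Hypothetical sketch of the token-kind-to-id mapping that usually lives in
// a NetBeans LanguageHierarchy/Lexer. All names here are illustrative.
public class TokenIdMapping {

    // JavaCC-generated token kind constants (values are illustrative).
    static final int EOF = 0;
    static final int WHITESPACE = 1;
    static final int HTML = 2;

    // Buggy version: any kind without a case (e.g. EOF, or a token the
    // grammar produces that the mapping forgot) yields null, which
    // triggers "IllegalArgumentException: Token id must not be null".
    static String buggyTokenId(int kind) {
        switch (kind) {
            case HTML:       return "html";
            case WHITESPACE: return "whitespace";
            default:         return null; // <- source of the exception
        }
    }

    // Fixed version: every kind maps to some non-null id.
    static String tokenId(int kind) {
        switch (kind) {
            case HTML:       return "html";
            case WHITESPACE: return "whitespace";
            default:         return "other"; // catch-all, never null
        }
    }

    public static void main(String[] args) {
        System.out.println(buggyTokenId(EOF)); // null
        System.out.println(tokenId(EOF));      // other
    }
}
```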
  • @TheodoreNorvell Yes Sir, you were right. It was fetching null between the `html` and `EOF` tokens. I am not sure whether that's the norm or something related to my grammar. All I did was catch the exception, and the error window did not pop up anymore. Did I do right? Will it let me continue to the next token? – Volatil3 Mar 28 '15 at 21:26
  • @TheodoreNorvell It now gives the error: `Chars: "\n\n" - these characters need to be tokenized.` – Volatil3 Mar 28 '15 at 21:30
  • 2
    I think that your latest problem also does not have to do with JavaCC. I would suggest you add a rule at the end of your .jj file that says `TOKEN { }` that will turn all characters that aren't part of some other token into tokens of kind `OTHER`. This rule must remain the final rule; i.e. add all new rules above it. If you do that, your lexer should never throw any errors. All remaining problems will be to do with netbeans, not JavaCC. Unfortunately I can't help you with netbeans. – Theodore Norvell Mar 29 '15 at 00:02
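Concretely, the catch-all rule suggested in that comment would make the tail of the .jj file look like this (`~[]` is JavaCC's match-any-single-character pattern; the `OTHER` name comes from the comment, the `HTML` rule is unchanged from the question):

```
TOKEN : { < HTML : "html" > }

/* Catch-all: turns every otherwise-unmatched character into an OTHER
   token. This must remain the LAST rule in the file. */
TOKEN : { < OTHER : ~[] > }
```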

0 Answers