Questions tagged [lexer]

A program converting a sequence of characters into a sequence of tokens

A lexer is a program that converts a sequence of characters into a sequence of tokens. It is also often called a scanner. A lexer is often implemented as a single function that is called by a parser or another function.
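
As a rough illustration of the description above, here is a minimal sketch of a lexer written as a single generator function in Python. The token names (`NUMBER`, `IDENT`, `OP`, `SKIP`), the `Token` type, and the `tokenize` function are hypothetical choices made for this example; they do not come from any particular tool or library.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # token category, e.g. "NUMBER"
    text: str  # the exact characters that were matched


# Hypothetical token set for a tiny arithmetic language (an assumption
# made for illustration only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()=]"),
    ("SKIP", r"\s+"),
]

# One alternation of named groups; the name of the group that matched
# tells us the token kind.
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from source, skipping whitespace."""
    pos = 0
    while pos < len(source):
        match = MASTER_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

A parser would typically pull tokens from `tokenize(...)` one at a time, for example `for token in tokenize("x = 3 + 42"): ...`. A hand-written character loop or a generated scanner would serve the same role; the regex approach is used here only because it is short.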

1050 questions

5 answers

Is the word "lexer" a synonym for the word "parser"?

The title is the question: Are the words "lexer" and "parser" synonyms, or are they different? It seems that Wikipedia uses the words interchangeably, but English is not my native language so I can't be sure.

asked May 09 '11 at 18:46

Seth Carnegie

3 answers

How do I get an ANTLR parser rule to read from both the default AND hidden channels?

I send whitespace to the hidden channel as usual, but I have one rule where I would like to include any whitespace for later processing. Every example I have found requires some very strange manual coding. Is there no easy option to…

antlr antlr3 lexer parser-generator

asked Apr 21 '11 at 08:59

David Mårtensson

3 answers

Standard format for concrete and abstract syntax trees

I have an idea for a hobby project which performs some code analysis and manipulation. This project will require both the concrete and abstract syntax trees of a given source file. Additionally, bi-directional references between the two trees would…

parsing grammar lexer abstract-syntax-tree

asked Feb 17 '09 at 09:31

Brandon Bloom

2 answers

attribute references not allowed in lexer actions

I found a simple grammar to start learning ANTLR. I put it in the myGrammar.g file. Here is the grammar: grammar myGrammar; /* This will be the entry point of our parser. */ eval : additionExp ; /* Addition and subtraction have the…

java parsing antlr lexer

asked Mar 26 '17 at 15:10

Ali Salehi

2 answers

Unable to compile output of lex

When I attempt to compile the output of this trivial lex program: # lex.l integer printf("found keyword INT"); using: $ gcc lex.yy.c I get: Undefined symbols: "_yywrap", referenced from: _yylex in ccMsRtp7.o _input in ccMsRtp7.o …

gcc lex lexer

asked Apr 10 '10 at 05:40

dstnbrkr

4 answers

Lexers/parsers for (un)structured text documents

There are lots of parsers and lexers for scripts (i.e. structured computer languages). But I'm looking for one which can break an (almost) unstructured text document into larger sections, e.g. chapters, paragraphs, etc. It's relatively easy for a…

parsing document lexer

asked Jan 18 '10 at 16:57

wilson32

2 answers

Is C++ code generation in ANTLR 3.2 ready?

I tried hard to make ANTLR 3.2 generate a parser/lexer in C++, but it was fruitless. Things went well with Java & C though. I was using this tutorial to get started: http://www.ibm.com/developerworks/aix/library/au-c_plusplus_antlr/index.html When I…

c++ code-generation parsing lexer antlr3

asked Dec 02 '09 at 08:46

Viet

1 answer

How do you write a lexer parser where identifiers may begin with keywords?

Suppose you have a language where identifiers might begin with keywords. For example, suppose "case" is a keyword, but "caser" is a valid identifier. Suppose also that the lexer rules can only handle regular expressions. Then it seems that I…

parsing keyword lexer dfa

asked May 03 '13 at 11:28

BenRI

1 answer

ANTLR: Space indentation?

I want to create a very simple grammar with space indentation. Each line consists of one or more words, with Python-like indentation (4 spaces or a tab is one indent) and no closing marker for an indented block. For example: if something cool occurs do…

java antlr lexer

asked Aug 30 '12 at 21:17

Elliot Chance

1 answer

Most efficient way to parse this scripting language

I'm implementing an interpreter for a long-outdated text editor's scripting language, and I'm having some trouble getting a lexer to work properly. Here's an example of the problematic part of the language: T L /LOCATE ME/ C /LOCATE ME/CHANGED ME/ *…

python lexer shlex

asked Jul 19 '12 at 16:51

Robbie Rosati

3 answers

Determining "Mood" of Textual Phrases through Lexical Analysis

I am looking to apply scores (positive, negative or neutral) to short phrases of text. Short of parsing out emoticons and making assumptions based on their usage, I'm unsure of what else to try. Can anyone provide examples, research papers,…

parsing text lexer

asked Jun 15 '09 at 15:46

Michael Wales

2 answers

How can I simplify token prediction DFA?

The lexer DFA results in a "code too large" error. I'm trying to parse Java Server Pages using ANTLR 3. Java has a limit of 64k for the bytecode of a single method, and I keep running into a "code too large" error when compiling the Java source generated…

java antlr antlr3 lexer dfa

asked Sep 22 '11 at 15:34

erickson

2 answers

Lexer that recognizes indented blocks

I want to write a compiler for a language that denotes program blocks with whitespace, like in Python. I prefer to do this in Python, but C++ is also an option. Is there an open-source lexer that can help me do this easily, for example by…

python compiler-construction whitespace lexer

asked Aug 01 '11 at 19:28

Elektito

2 answers

Using Alex in Haskell to make a lexer that parses Dice Rolls

I'm making a parser for a DSL in Haskell using Alex + Happy. My DSL uses dice rolls as part of the possible expressions. Sometimes I have an expression that I want to parse that looks like: [some code...] 3D6 [... rest of the code] Which should…

parsing haskell dsl lexer alex

asked Jul 13 '20 at 04:48

Zeb

4 answers

Recursive Descent Parser for something simple?

I'm writing a parser for a templating language which compiles into JS (if that's relevant). I started out with a few simple regexes, which seemed to work, but regexes are very fragile, so I decided to write a parser instead. I started by writing a…

javascript parsing templates lexer tokenize

asked Apr 03 '11 at 19:35

ltimer
