Questions tagged [lexical-analysis]

In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function that performs lexical analysis is called a lexical analyzer, lexer, tokenizer, or scanner.

The lexical syntax is usually a regular language, whose atoms are individual characters, while the phrase syntax is usually a context-free language, whose atoms are words (tokens produced by the lexer). While this is a common separation, alternatively, a lexer can be combined with the parser in scannerless parsing.
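As a rough illustration of the lexer side of that separation, here is a minimal regex-based sketch in Python. The token names (`INTEGER`, `IDENTIFIER`, `ASSIGN`) and the `tokenize` helper are hypothetical, chosen only for this example; a real language would need a much larger set of rules.

```python
import re

# Hypothetical token names and patterns, chosen for illustration only.
TOKEN_SPEC = [
    ("INTEGER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("SKIP", r"\s+"),      # insignificant whitespace
    ("MISMATCH", r"."),    # any other single character is an error
]

# One alternation of named groups; earlier alternatives take priority.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs for the given text."""
    tokens = []
    for match in MASTER.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens


print(tokenize("foo = 42"))
# [('IDENTIFIER', 'foo'), ('ASSIGN', '='), ('INTEGER', '42')]
```

A parser would then consume these (kind, lexeme) pairs as its atoms instead of looking at individual characters.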

843 questions

2 answers

Arabic lemmatization and Stanford NLP

I am trying to perform lemmatization, i.e. identifying the lemma and possibly the Arabic root of a verb. For example: يتصل ==> lemma (infinitive of the verb) ==> اتصل ==> root (triliteral root / Jidr thoulathi) ==> و ص ل. Do you think Stanford NLP can do…

asked Mar 19 '15 at 17:33

Riadh Belkebir

4 answers

Responsibilities of the Lexer and the Parser

I'm currently implementing a lexer for a simple programming language. So far, I can tokenize identifiers, assignment symbols, and integer literals correctly; in general, whitespace is insignificant. For the input foo = 42, three tokens are…

parsing compiler-construction tokenize lexical-analysis

asked May 13 '14 at 00:47

Marius Schulz

1 answer

Bison-Flex extern FILE *yyin isn't working (C language)

I know that in flex you just have to do yyin = fopen(filename, "r"); to read a file, but how is that possible from bison? I'm trying to combine flex and bison for my purpose (read a file containing 4 + 5 + 7; and print the outcome) but I…

c parsing bison flex-lexer lexical-analysis

asked Feb 20 '14 at 01:34

captain monk

1 answer

Flex/bison syntax error

I am trying to write a grammar which will be able to consume the following input: begin #this is a example x = 56; while x > 0 do begin point 15.6 78.96; end; end; Here is the lexer.l file: %option noyywrap %{ #include…

bison yacc lex flex-lexer lexical-analysis

asked Nov 02 '13 at 15:09

Vardan Hovhannisyan

3 answers

Removing nested comments by lex

How should I write a program in lex (or flex) that removes nested comments from text and prints only the text that is not inside comments? I should probably somehow track the state of being inside a comment and the number of opening "tags" of block comments. Let's…

comments lex flex-lexer lexical-analysis

asked Oct 17 '12 at 20:56

user1097772

3 answers

What is the purpose of a lexer?

I was reading the answer to this question. I can't seem to find the answer to why someone would need a separate lexer. Is it one of the steps a program goes through during compilation? Can someone please explain in simple terms why I would need a…

compiler-construction lexical-analysis

asked Jul 07 '12 at 15:07

Anirudh Ramanathan

5 answers

What are lexical and syntactic analysis during compilation in a C compiler?

What are lexical analysis and syntactic analysis in the compilation process? Does preprocessing happen after lexical and syntactic analysis?

c parsing compilation preprocessor lexical-analysis

asked Jun 23 '12 at 19:18

Raulp

6 answers

Is the C++ compiler really smart enough to distinguish between multiply and dereference?

I have the following line of code: double *resultOfMultiplication = new double(*num1 * *num2); How does the compiler know which * is used for dereferencing and which * is used for multiplication? Also, and probably a more important question is in…

c++ lexical-analysis

asked Feb 12 '12 at 01:01

Nosrettap

2 answers

How to make flex (lexical scanner) read UTF-8 character input?

It seems that flex doesn't support UTF-8 input. Whenever the scanner encounters a non-ASCII char, it stops scanning as if it had reached EOF. Is there a way to force flex to eat my UTF-8 chars? I don't want it to actually match UTF-8 chars, just eat…

utf-8 lexical-analysis flex-lexer

asked May 28 '09 at 15:54

Martin Cote

3 answers

Expression parsing: how to tokenize

I'm looking to tokenize Java/JavaScript-like expressions in JavaScript code. My input will be a string containing the expression, and the output needs to be an array of tokens. What's the best practice for doing something like this? Do I need to…

javascript regex parsing expression lexical-analysis

asked May 22 '09 at 17:30

levik

2 answers

Simple lexical analysis Java program

My little project is a lexical analysis program in which I have to take every word found in an arbitrary .java file and list every line it appears on in the file. I need to have one lookup table dedicated just to the reserved words and another for…

java lexical-analysis lookup-tables

asked Jan 17 '12 at 00:49

user1152918

2 answers

Java library to parse regular expressions into a syntax tree

I'd like a library that can take the string representation of a regexp and convert that into a syntax tree for easy programmatic manipulation. Something that would transform: (\s?)bla[a-z] into something like: PARENTHESIS CHAR:SPACE …

java regex lexical-analysis

asked Jan 02 '12 at 03:41

jp.

3 answers

How to capture a string without quote characters

I'm trying to capture quoted strings without the quotes. I have this terminal %token STRING and this production constant: | QUOTE STRING QUOTE { String($2) } along with these lexer rules | '\'' { QUOTE } | [^ '\'']* { STRING…

parsing f# lexical-analysis fsyacc fslex

asked Nov 21 '11 at 18:11

Daniel

1 answer

Profiling Regex Lexer

I've created a router in PHP which takes a DSL (based on the Rails 3 route) and converts it to a regex. It has optional segments (denoted by (nested) parentheses). The following is the current lexing algorithm: private function…

php lexical-analysis xhprof

asked Aug 18 '11 at 23:15

efritz

4 answers

Is this the job of the lexer?

Let's say I was lexing a Ruby method definition: def print_greeting(greeting = "hi") end Is it the lexer's job to maintain state and emit relevant tokens, or should it be relatively dumb? Notice in the above example the greeting param has a…

parsing compiler-construction tokenize lexical-analysis

asked Jun 15 '11 at 14:34

ryeguy
