Questions tagged [lexical-analysis]

In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function that performs lexical analysis is called a lexical analyzer, lexer, tokenizer, or scanner.

The lexical syntax is usually a regular language, whose atoms are individual characters, while the phrase syntax is usually a context-free language, whose atoms are words (tokens produced by the lexer). This separation is common, but a lexer can also be combined with the parser, as in scannerless parsing.
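
As a rough illustration of the character-to-token step described above, here is a minimal sketch of a regex-driven lexer in Python. The token classes (`NUMBER`, `IDENT`, `OP`, `SKIP`) and the names `Token`, `TOKEN_SPEC`, and `tokenize` are assumptions made up for this example; they describe a toy expression language, not any particular tool's lexical syntax.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # name of the token class, e.g. "NUMBER"
    text: str  # the exact characters that were matched


# Hypothetical token classes for a toy expression language.
# Each class is a regular expression over individual characters.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),   # integer or decimal literal
    ("IDENT", r"[A-Za-z_]\w*"),     # identifier
    ("OP", r"[+\-*/=()]"),          # single-character operators and parentheses
    ("SKIP", r"\s+"),               # whitespace, discarded by the lexer
]

# Combine all classes into one alternation of named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from source, raising SyntaxError on an unknown character."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        # lastgroup is the name of the token class that matched.
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

Under these assumptions, `list(tokenize("x = 3.14"))` produces an `IDENT`, an `OP`, and a `NUMBER` token; a parser would then consume that token stream rather than raw characters.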

843 questions
5
votes
3 answers

Using lex generated source code in another file

I would like to use the code generated by lex in other code that I have, but all the examples I have seen embed the main function inside the lex file, not the other way around. Is it possible to use (include) the C file generated by lex in…
Ahmed Kotb
5
votes
0 answers

Adding alternate syntax to clang

I ran across the SPECS alternate grammar for C++, and while I'm not sure I like some of the more gratuitous syntax changes they made (changing pointers from * to ^, for instance), it turned me on to the idea of tweaking and implementing the new…
matthias
5
votes
2 answers

Does the recognition of numbers belong in the scanner or in the parser?

When you look at the EBNF description of a language, you often see a definition for integers and real numbers: integer ::= digit digit* // Accepts numbers with a 0 prefix real ::= integer "." integer (('e'|'E') integer)? (Definitions were…
gnuvince
5
votes
3 answers

SELECT* vs SELECT *

Yesterday a colleague showed me the following postgres query. We were both surprised that it worked: SELECT* FROM mytable; Since I recently coded a parser for another language, I am trying to understand in more depth why this query "compiles" and…
Justin Ethier
5
votes
2 answers

gppg/gplex equivalent in D?

When I was working in C#, I found the gppg and gplex parser/lexer generators to be perfect for my needs. I'm wondering if there's something similar for the D programming language (i.e. a utility that, given a grammar in BNF or EBNF, outputs D code…
Mark LeMoine
5
votes
3 answers

How do lexical analyzers handle comment and escape sequences?

Comments and escape sequences (such as those in string literals) are quite different from regular symbolic representations. It's hard for me to understand how regular lexical analyzers tokenize them. How do lexical analyzers like lex, flex, etc.…
eonil
5
votes
3 answers

Lexing and Parsing Utilities

I'm looking for lexical analysis and parser-generating utilities that are not Flex or Bison. Requirements: Parser is specified using a context-free LL(*) or GLR grammar. I would also consider PEGs. Integrates tightly with a programming language…
5
votes
0 answers

Passing negative number literals to extension properties/methods in Swift

I wrote a few number extensions for unit conversions, for example: public extension Double { public var dbamp: Double { return pow(10, self/20) } } public extension Int { public var dbamp: Double { return…
PeterT
5
votes
1 answer

How math operators are identified

How does a simple 2 ++ 2 work behind the scenes in the Python language? If we type this in the Python interpreter: >>> 2+++--2 4 >>> 2+++*2 File "<stdin>", line 1 2++*2 ^ SyntaxError: invalid syntax Looking at the syntax errors here, I…
Shivkumar kondi
5
votes
4 answers

Context free grammar for languages with more number of as than bs

The question is to develop a context-free grammar for the language containing all strings that have more As than Bs. I can't think of a logical solution. Is there a way to approach such problems? What can help me approach such problems better?…
nino96
5
votes
3 answers

C/C++/C#/VB based Lexical Analyzers

During the Compiler Design Lab hours, I'm using JLex as the lexical analyzer generator, which produces a Java program from a lexical specification. I'd like to know if there are other tools that can do the same by generating C/C++/C# or VB…
Arjun Vasudevan
5
votes
2 answers

In lex, how do I differentiate between '-' (subtraction) operator and an integer '-3'?

I am writing lex for a specific language where operations are carried out in prefix notation : (+ a b) --> (a + b) An integer is defined as follows : An integer can have a negative sign (–) but no positive sign. It can be with or without space(s)…
ronakshah725
5
votes
7 answers

How are if statements in C syntactically unambiguous?

I don't know a whole lot about C, but I understand the basics and as far as I can tell: int main() { if (1 == 1) printf("Hello World!\n"); return 0; } and int main() { if (1 == 1) printf("Hello World!\n"); return 0; } and int main()…
cat
5
votes
1 answer

What can cause Java compiler to fail while parsing a comment?

The following code is a valid Java program. public class Foo { public static void \u006d\u0061\u0069\u006e(String[] args) { System.out.println("hello, world"); } } The main identifier is written using Unicode escape sequences.…
Susam Pal
5
votes
4 answers

How to implement a language interpreter without regular expressions?

I am attempting to write an interpreted programming language which will read in files and output a bytecode-like format which can then be executed by a virtual machine. My original plan was: Begin by loading the contents of the file into the…
Liam Davis