Questions tagged [lexical-analysis]

Process of converting a sequence of characters into a sequence of tokens.

In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function that performs lexical analysis is called a lexical analyzer, lexer, tokenizer, or scanner.

The lexical syntax is usually a regular language, whose atoms are individual characters, while the phrase syntax is usually a context-free language, whose atoms are words (tokens produced by the lexer). This separation is common, but a lexer can instead be combined with the parser, as in scannerless parsing.
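As a rough illustration of the character-to-token step described above, here is a minimal regex-based lexer sketch in Python. The token names (`NUMBER`, `IDENT`, `LTEQ`, `LT`, `PLUS`) and the `tokenize` helper are assumptions chosen for the example, not part of any particular tool. Python's `re` alternation tries alternatives in order, so the sketch lists `<=` before `<` to keep the longer operator from being split.

```python
import re

# Hypothetical token set for illustration only; the names are assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LTEQ",   r"<="),   # listed before LT so the longer operator is tried first
    ("LT",     r"<"),
    ("PLUS",   r"\+"),
    ("SKIP",   r"\s+"),  # whitespace is matched but not emitted
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x <= 10")` would yield an identifier, a single `LTEQ` token, and a number. A parser for the phrase syntax would then consume these pairs rather than raw characters.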

843 questions
0
votes
2 answers

Lexical Analyzer code in C

I'm using Visual Studio 2010 for C++. I implemented a lexical analyzer for C, but I get an error that says "missing type specifier - int assumed. Note: C++ does not support default-int". Is there something wrong with my code? or i…
Lu Yas
  • 137
  • 2
  • 5
  • 16
0
votes
2 answers

How to analyze the content of textareas to display links and images?

I'm building a simple site where users can post comments about "things". What I want to do is somehow "analyze" each comment they post to detect links (and provide tags), images (maybe display previews), videos, etc. I'm building the site with…
santiagobasulto
  • 11,320
  • 11
  • 64
  • 88
-1
votes
4 answers

Where might I obtain a lexical analyzer capable of reporting for-loop errors in C or C++?

I need a simple lexical analyzer that reports for-loop errors in C/C++.
addis
-1
votes
1 answer

Tilde accent marks alex

I'm building a compiler in Haskell. I have problems parsing characters with tilde accent marks. I'm using alex 2.3.3. I can't find a solution. Please help.
Academia
  • 3,984
  • 6
  • 32
  • 49
-1
votes
1 answer

Avoiding overlap with similar regex patterns during tokenization

Background: I've made a couple of simple compilers before, but I've never properly addressed this issue. Say I have a token LT which matches < and a token LTEQ which matches <=. An LT would match part of <= in this case, and I don't want…
Jam
  • 476
  • 3
  • 9
-1
votes
2 answers

How can I scan a file with no delimiter between tokens in Java?

I have text input which looks like this: !10#6#4!3#4 I have two patterns for the two types of data found in the input above: Pattern totalPattern = Pattern.compile("![0-9]+"); Pattern valuePattern = Pattern.compile("#[0-9]+"); I wanted to get the…
redmoncoreyl
  • 133
  • 9
-1
votes
2 answers

Getting wrong output for a++ +b according to lexical analysis when the program is printed along with a+++b

I wrote the following C program to find the output of a+++b: #include <stdio.h> int main() { int a=5, b=2; printf("%d",a+++b); } I get 7 as the output, which is correct according to lexical analysis. Apart from that I wrote a…
-1
votes
1 answer

Regular expression for email in flex

I am trying to write a regular expression for emails in JFlex. So far I have tried this: L=[a-zA-Z_]+ D=[0-9]+ email=[^(.+)@(\S+)$] %{ public String lexeme; %} %% {L}({L}|{D})* {lexeme=yytext(); return Identi;} ("(-"{D}+")")|{D}+…
coding2
  • 47
  • 6
-1
votes
1 answer

Regex in C++ for matching some patterns

I want a regex for this: add x2, x1, x0 is a valid instruction, and I want to match it. I'm a bit confused about how to do this, as I'm new to regex. Can anyone share such a regex?
Fahmida
  • 1,050
  • 8
  • 19
-1
votes
1 answer

Base word for battling and lemmatization

What is the base form of battling? Lemmatization results in battling, whereas I think it should be battle. Is my understanding of lemmatization wrong? from nltk import download download('wordnet') from nltk.stem.wordnet import…
-1
votes
1 answer

Flex C++ print code in generated file and add color

I'm trying to insert the code that flex reads into my .tex file. This console app is supposed to take a .pascal file, analyze it, and then generate a .tex file, but I'm not able to pass the code to the .tex file, and then I need to add color to each…
AIAM2601
  • 3
  • 2
-1
votes
1 answer

What does [^0-9]+$ mean (regular expression in Flex)?

This is what I know: ^ inside brackets matches a character that isn't one of those included inside the brackets. + matches one or more occurrences of the expression to its left (in my example, [^0-9]). $ If I'm not mistaken, matches to an expression that…
user11541813
-1
votes
2 answers

ANTLR 4 lexer rule RULE: ''; isn't recognized as a token, but is recognized if it is a fragment rule

EDIT: I've been asked if I can provide the full grammar. I cannot and here is the reason why: I cannot provide my full grammar code because it is homework and I am not allowed to disclose my solution, and I will sadly understand if my question…
-1
votes
1 answer

Extract variable names from a Java expression dynamically

I have a dynamic set of string expressions (Java code parts), e.g.: val1.subtract(val2.divide(val3,6,java.math.RoundingMode.HALF_UP) ,new MathContext(6, java.math.RoundingMode.HALF_UP)).setScale(6, BigDecimal.ROUND_HALF_UP) I want to extract variable…
Nilanka Manoj
  • 3,527
  • 4
  • 17
  • 48
-1
votes
1 answer

How to develop a lexical analyzer with JavaScript?

I developed a lexical analyzer function that takes a string and separates its items into an array like this: const lexer = (str) => str .split(" ") .map((s) => s.trim()) .filter((s) => s.length); console.log(lexer("John Doe"))…
Mehdi Faraji
  • 2,574
  • 8
  • 28
  • 76