Questions tagged [regular-language]

A regular language is a language that can be represented by a regular expression; equivalently, it is exactly the set of strings accepted by a corresponding deterministic finite automaton. Note: regular languages should not be confused with regular expressions as implemented by regex engines. For questions regarding pattern matching within strings, use the [regex] tag instead.

Given an alphabet Σ (a finite set of symbols), a language is a set of finite sequences (strings) of symbols from that alphabet. A language is regular exactly when it can be expressed by a (formal) regular expression, or equivalently when the membership of any string can be decided by a finite-state machine.
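As a minimal sketch of what "decided by a finite-state machine" means, the following Python snippet simulates a small deterministic finite automaton. The language chosen (binary strings that begin with 0 and end with 1) and the state names are assumptions made for illustration only.

```python
# Hypothetical DFA for binary strings that begin with 0 and end with 1.
# State names are illustrative; they are not part of any standard.
TRANSITIONS = {
    ("start", "0"): "last0",  # first symbol is 0, last symbol read is 0
    ("start", "1"): "dead",   # first symbol is 1: can never be accepted
    ("last0", "0"): "last0",
    ("last0", "1"): "last1",  # last symbol read is 1
    ("last1", "0"): "last0",
    ("last1", "1"): "last1",
    ("dead", "0"): "dead",
    ("dead", "1"): "dead",
}
ACCEPTING = {"last1"}


def accepts(string: str) -> bool:
    """Run the DFA over the string, one symbol at a time."""
    state = "start"
    for symbol in string:
        if (state, symbol) not in TRANSITIONS:
            return False  # symbol outside the alphabet {0, 1}
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


print(accepts("0101"))  # True
print(accepts("10"))    # False
```

The machine uses a fixed, finite amount of memory (the current state) no matter how long the input is, which is the defining restriction of regular languages.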

Regular languages form the most restrictive class of the Chomsky hierarchy; they are the Type-3 languages, generated by Type-3 (regular) grammars. They are contained in the Type-2 context-free languages, which are recognized by pushdown automata; these are in turn contained in the Type-1 context-sensitive languages, recognized by linear bounded automata, which are contained in the Type-0 recursively enumerable languages, recognized by Turing machines. All regular languages are therefore context-free, context-sensitive, and recursively enumerable. Formal regular expressions can be converted to deterministic and to nondeterministic finite-state machines that still represent the same regular language.
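The equivalence of nondeterministic and deterministic finite-state machines can be illustrated with the subset construction. The sketch below assumes an NFA without ε-transitions; the example NFA (binary strings ending in "01") and all names are hypothetical choices for illustration.

```python
from collections import deque

# Hypothetical NFA (no epsilon moves) for binary strings ending in "01".
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
NFA_START = "q0"
NFA_ACCEPTING = {"q2"}
ALPHABET = ("0", "1")


def subset_construction(nfa, start, accepting, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    dfa_start = frozenset({start})
    dfa_transitions = {}
    dfa_accepting = set()
    queue = deque([dfa_start])
    seen = {dfa_start}
    while queue:
        current = queue.popleft()
        if current & accepting:  # contains at least one accepting NFA state
            dfa_accepting.add(current)
        for symbol in alphabet:
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, symbol), ())
            )
            dfa_transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_start, dfa_transitions, dfa_accepting


def dfa_accepts(dfa, string):
    """Assumes every symbol of the string is in the alphabet."""
    state, transitions, accepting = dfa[0], dfa[1], dfa[2]
    for symbol in string:
        state = transitions[(state, symbol)]
    return state in accepting


dfa = subset_construction(NFA, NFA_START, NFA_ACCEPTING, ALPHABET)
print(dfa_accepts(dfa, "1101"))  # True
print(dfa_accepts(dfa, "0110"))  # False
```

Only the subsets reachable from the start state are generated, so the resulting DFA here has three states rather than all eight possible subsets.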

Please do not confuse this with regex. Most regex engines are far more expressive than formal regular expressions and finite-state machines, and can represent non-regular languages.

Construction of a Regular Language

The set of all regular languages over a given alphabet Σ can be produced exactly by this process:

The empty language {}, which rejects all strings.
The language {ε}, containing only the empty string ε.
Every language {s} containing only the single-symbol string s, for s ∈ Σ.
Every language created by the union, concatenation, or Kleene star of regular languages. Suppose v and w are regular expressions for the regular languages A and B respectively:
- The union (v|w) is also regular. It accepts the strings that are in A or in B.
- The concatenation vw is also regular. It accepts every string formed by a string in A followed by a string in B.
- The Kleene star v* is also regular. It accepts any number of strings in A concatenated together, including zero (which gives the empty string ε).
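The construction steps above can be sketched with Python sets of strings. Because the Kleene star usually produces an infinite language, this sketch truncates every language to strings of at most MAX_LEN symbols; that bound and all function names are assumptions made for illustration.

```python
MAX_LEN = 4  # assumption: truncate so that every set stays finite


def union(a, b):
    return a | b


def concat(a, b):
    # Every string of a followed by every string of b, within the length bound.
    return {v + w for v in a for w in b if len(v + w) <= MAX_LEN}


def star(a):
    result = {""}    # zero copies gives the empty string
    frontier = {""}
    while frontier:  # stops once no new strings fit within MAX_LEN
        frontier = concat(frontier, a) - result
        result |= frontier
    return result


EMPTY = set()           # the empty language {}
EPSILON = {""}          # the language containing only the empty string
ZERO, ONE = {"0"}, {"1"}  # single-symbol languages over {0, 1}

any_binary = star(union(ZERO, ONE))                     # (0|1)*
starts0_ends1 = concat(concat(ZERO, any_binary), ONE)   # 0(0|1)*1
print(sorted(starts0_ends1, key=lambda s: (len(s), s)))
# ['01', '001', '011', '0001', '0011', '0101', '0111']
```

Note that the star of the empty language is {ε}, not the empty language: `star(EMPTY)` returns `{""}`.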

Examples and Nonexamples of Regular Languages

Given a simple alphabet Σ = {0, 1}, where | represents union and * represents Kleene star, these formal regular expressions all represent regular languages:
- The regular expressions "0", "1", "(0|1)", "01", "11", "0*" all represent regular languages.
- The regular expression "(0(0|1)*1)" represents all binary strings beginning with 0 and ending with 1, which form a regular language.
- Given a regular expression R, the expressions "R+" and "R?" both represent regular languages, where + represents one or more and ? represents zero or one. Namely, "R+" is equivalent to "RR*", and "R?" is equivalent to "(R|ε)".
- Given a regular expression R, the expression "R{m,n}" represents a regular language for all natural numbers m ≤ n, where {m,n} represents "from m copies to n copies". This is because it only involves union and concatenation: "R{1,3}" expands to "(R|RR|RRR)".
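The equivalences listed above can be spot-checked by brute force. The sketch below uses Python's `re` module as a stand-in for formal regular expressions (an assumption that is safe only because these patterns use no engine-specific extensions) and compares the matched sets over all binary strings up to a fixed length. It is a bounded check, not a proof, and the sub-expression R is a hypothetical example.

```python
import re
from itertools import product

ALPHABET = "01"
MAX_LEN = 6  # assumption: a bounded check, not a proof


def language(pattern, max_len=MAX_LEN):
    """All strings over ALPHABET up to max_len that the pattern fully matches."""
    compiled = re.compile(pattern)
    return {
        "".join(symbols)
        for n in range(max_len + 1)
        for symbols in product(ALPHABET, repeat=n)
        if compiled.fullmatch("".join(symbols))
    }


R = "(?:01|1)"  # hypothetical sub-expression R

# R+ versus RR*
print(language(R + "+") == language(R + R + "*"))  # True
# R? versus (R|ε), written with an empty alternative
print(language(R + "?") == language("(?:" + R + "|)"))  # True
# R{1,3} versus (R|RR|RRR)
expanded = "(?:" + R + "|" + R + R + "|" + R + R + R + ")"
print(language(R + "{1,3}") == language(expanded))  # True
```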
Given an alphabet used by regex engines, usually the set of all ASCII or all Unicode characters:
- The regex /^.+$/ represents a regular language. It includes all non-empty sequences of any character (in most engines, by default, any character other than a line break).
- The regex /^#[A-Za-z]{1,3}[0-9]{2,4}$/ represents a regular language, consisting of all strings which begin with a hash sign (#), then one to three ASCII letters, followed by two to four decimal digits.
- The regex /^([\d][\w])*$/ represents a regular language. It consists of all strings that alternate digit characters and word characters, starting with a digit and ending with a word character (the empty string is also included). The shorthands \d and \w are examples of union, since each stands for a set of alternative characters.
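As a brief illustration, the last two patterns above can be tried with Python's `re` module. This is only a sketch: it assumes Python's regex syntax, and `re.fullmatch` is used in place of the explicit ^ and $ anchors. The test strings are made up for the example.

```python
import re

# Patterns from the examples above, without the /^...$/ delimiters and anchors;
# fullmatch requires the whole string to match.
HASH_CODE = re.compile(r"#[A-Za-z]{1,3}[0-9]{2,4}")
ALTERNATING = re.compile(r"([\d][\w])*")

for candidate in ["#ab12", "#XYZ2024", "#abcd12", "#a1", "ab12"]:
    print(candidate, bool(HASH_CODE.fullmatch(candidate)))
# #ab12 True
# #XYZ2024 True
# #abcd12 False
# #a1 False
# ab12 False

for candidate in ["1a2b", "", "1a2", "a1"]:
    print(repr(candidate), bool(ALTERNATING.fullmatch(candidate)))
# '1a2b' True
# '' True
# '1a2' False
# 'a1' False
```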
Many regex engines are much more expressive than formal regular expressions. Backreferences can cause a regex to represent a non-regular language, which consequently cannot be decided by a finite-state machine.
- The regex "(.+)\1", matched against the whole string, represents a non-regular language. Using a backreference to the first capture group .+, it accepts every string that consists of some non-empty sequence of characters repeated exactly twice. Such strings are called squares in formal language theory.
  - "ABCABC", "1234.1234." are accepted
  - "ABCAB", "1234567891234567890" are rejected.
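The accepted and rejected strings above can be reproduced with Python's `re` module, shown here as a minimal sketch. It assumes whole-string matching via `re.fullmatch` and the default behaviour where "." does not match a line break; the helper names are hypothetical. A direct comparison of the two halves is included as a cross-check.

```python
import re

SQUARE = re.compile(r"(.+)\1")


def is_square(s: str) -> bool:
    """True when the whole string is some non-empty block repeated twice."""
    return SQUARE.fullmatch(s) is not None


def is_square_direct(s: str) -> bool:
    """Same test without a regex: compare the two halves directly."""
    half, remainder = divmod(len(s), 2)
    return remainder == 0 and half > 0 and s[:half] == s[half:]


for s in ["ABCABC", "1234.1234.", "ABCAB", "1234567891234567890"]:
    print(s, is_square(s), is_square_direct(s))
# ABCABC True True
# 1234.1234. True True
# ABCAB False False
# 1234567891234567890 False False
```

The direct version has to remember the entire first half of the string, an unbounded amount of information, which is the intuition for why no finite-state machine can decide this language.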

Further Reading

regex: Short for regular expression. Many regex engines nowadays are far more expressive than formal regular expressions and finite-state machines.
finite-state-machine: Finite-state machines (FSMs) are equivalent in expressiveness to formal regular expressions. They represent exactly the regular languages.
dfa: Deterministic finite automaton. nfa: Nondeterministic finite automaton.
- Both are equivalent in expressiveness.
context-free-grammar, context-sensitive-grammar: Tags referring to the more expressive Type-2 and Type-1 levels of the Chomsky hierarchy.
Wikipedia: includes an explanation of squares and non-regular regexes.
What is a regular language?

914 questions

2 answers

how to add conditions for regular expression?

I have to get the string which start with =" and ends with next ". But it should contains < symbol. (="([^"])*<*") String: dit niet "dit wel" dit ook niet ="maar

javascript java regex string regular-language

asked Sep 14 '16 at 09:23

Nirmal Srinath

2 answers

What will the '\$' regular expression match?

I have found that, $ : Matches the end of the line \s: Matches whitespace \S: Matches any non-whitespace character But what exactly does \$ do ?

regex regular-language

asked Aug 13 '16 at 17:08

MAX

2 answers

Can Lua patterns represent any regular language?

This question does not ask if Lua patterns are PCRE. That has been asked multiple times and the answer is definitely no. Instead, I am asking if Lua patterns have an analogy to regular languages by the formal language definition. My instinct is no…

lua regular-language formal-languages

asked Jun 05 '16 at 02:25

Ryan

2 answers

Negative pattern matching Reg ex In Python

Tryto use negative forward to replace all string which does not match a pattern: regexPattern = '((?!*' + 'word1|word2|word3' + ').)*$' mytext= 'jsdjsqd word1dsqsqsword2fjsdjswrod3sqdq' return re.sub(regexPattern, "P", mytext) #Expected Correct…

python regex python-3.x regular-language

asked Mar 30 '16 at 12:45

user5497885

1 answer

Show that L ={ ww^R : w ∈ Σ*} is not regular by using Pumping Lemma

If I let string w be a^mb^m then we know that y will consists of only a's because of the rule |xy| <= m. And if I set i=0, then ww^R will have fewer a's on the left side than on the right side. Thus, it proves that this language is not…

regex regular-language pumping-lemma

asked Feb 27 '16 at 23:13

Mint.K

1 answer

Are L1 = {a^n b^n | n < 4 } and L2 = {a^n b^n | n < 10^10^10 }, regular languages?

Is L1 = {a^n b^n | n < 4 }, a regular language ? In my opinion, it is regular, as I could draw an FSA for it, however, in class, my professor had taken an example, L2 = {a^n b^n | n < 10^10^10 } and said, this is not regular... so, my question is,…

computer-science regular-language fsm

asked Jan 21 '16 at 13:51

Aarjavee S. Kamdar

0 answers

How to remember NFA's choice on a certain computation?

I'm working on solving the question answered at this page but with different values at the table, my alphabet is {a,b,c} Words that have the same right- and left-associative product Currently I'm in the stage where I have drawn the DFA of the…

regular-language automata formal-languages non-deterministic

asked Nov 21 '15 at 09:11

CSGuy

3 answers

Regex only 14 numbers

I have the following text: DiretorioXmlImpressao=C:\\Program Files (x86)\\TESTE\\XmlImpressao\\08187168000160\\ I would like to select all but the CNPJ(14-character sentence at the end of the text), and so I tried the following regular…

regex regular-language

asked Nov 15 '15 at 14:07

Thiago Ribeiro

2 answers

Same regex but giving different result with StringTokenizer and Scanner class delimiter

Im trying to separate each word in the sentence using StringTokenizer class. It works fine for me. But I found another solution to my case using Scanner class.I applied same regular expression in both ways but got different result. I would like to…

java regex regular-language

asked Nov 09 '15 at 12:37

Madushan Perera

4 answers

Regex to find string where parenthesis are not closed

I need a regex to find strings where parenthesis are not closed Example: 02 Back for Good (Radio Mix.mp3 Find "(Radio Mix" But if 02 Back for Good (Radio Mix).mp3 Must find nothing

regex preg-replace preg-match regular-language

asked Oct 22 '15 at 08:25

David Zeller

1 answer

Will L={xww^R| w, x belongs to {0,1}^+ } is a regular language or not

I have already seen that wxw^r is regular as explained in this post Why L={wxw^R| w, x belongs to {a,b}^+ } is a regular language if i apply the same logic here that w will eat up everything except the last two symbols which can be either 0 or 1…

automation regular-language

asked Aug 23 '15 at 10:54

SOURAV KABIRAJ

1 answer

How to determine if a context-free grammar describes a regular language?

Given an arbitrary context-free grammar, how can I check whether it describes a regular language? I'm not looking for exam "tricks". I'm looking for a foolproof mechanical test that I can code. If it helps, here's an example of a CFG that I might…

grammar context-free-grammar regular-language finite-automata formal-languages

asked Jul 27 '15 at 00:24

user541686

1 answer

How do you classify languages into regular, context free, and phrase-structure?

If you're given a language, how do you figure out if it's regular, CF but not regular, or phrase-structure but not CF? Is there a good way to attack this problem? I could randomly try to make FAs or PDAs, but I feel like there's a better way to do…

context-free-grammar language-theory regular-language

asked May 11 '10 at 01:03

confused

1 answer

Regular Languages and Concatenation

Regular languages are closed under concatenation - this is demonstrable by having the accepting state(s) of one language with an epsilon transition to the start state of the next language. If we consider the language L = {a^n | n >=0}, this language…

concatenation regular-language finite-automata fsm

asked Oct 25 '14 at 18:13

CharlotteA

1 answer

Converting regex to a regular grammar/right-linear grammar

I would like to verify that I am converting this regex to a right-linear grammar correctly based on the information from this previous question and the wonderful answer by Grijesh: Left-Linear and Right-Linear Grammars Here is the question: "Write a…

regex grammar regular-language formal-languages

asked Oct 07 '14 at 00:30

Programmer
