Questions tagged [regular-language]

A regular language is a language that can be described by a regular expression; equivalently, a deterministic finite automaton can decide whether any given string belongs to the language. Note: regular languages should not be confused with regular expressions (regex). For questions about pattern matching within strings, use the [regex] tag instead.

Given an alphabet Σ (a finite set of symbols), a language is a set of finite sequences (strings) of symbols from that alphabet. A language is regular exactly when it can be described by a (formal) regular expression or, equivalently, when the membership of any string can be decided by a finite-state machine.
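As a minimal sketch of deciding membership with a finite-state machine, the Python code below hand-builds a DFA for the binary strings that begin with 0 and end with 1, and compares it with the formal regular expression 0(0|1)*1 on all short strings. The state names and helper functions are hypothetical and chosen only for illustration.

```python
import re
from itertools import product

# Hypothetical DFA for binary strings that begin with 0 and end with 1.
# "last0"/"last1": the string began with 0 and its last symbol is 0/1.
# "dead": the string began with 1, so it can never be accepted.
TRANSITIONS = {
    ("start", "0"): "last0",
    ("start", "1"): "dead",
    ("last0", "0"): "last0",
    ("last0", "1"): "last1",
    ("last1", "0"): "last0",
    ("last1", "1"): "last1",
    ("dead", "0"): "dead",
    ("dead", "1"): "dead",
}
ACCEPTING = {"last1"}


def dfa_accepts(s):
    """Run the DFA over s, one symbol at a time, using constant memory."""
    state = "start"
    for ch in s:
        state = TRANSITIONS[(state, ch)]
    return state in ACCEPTING


# The same language written as a formal regular expression.
PATTERN = re.compile(r"0(0|1)*1")


def all_binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


def agrees_up_to(max_len):
    """Check that the DFA and the regular expression agree on all short strings."""
    return all(
        dfa_accepts(s) == (PATTERN.fullmatch(s) is not None)
        for s in all_binary_strings(max_len)
    )
```

Calling agrees_up_to(8) compares the two descriptions on every binary string of length at most 8. This is a bounded sanity check, not a proof of equivalence.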

Regular languages form the most restrictive class of the Chomsky hierarchy: they are the Type-3 languages, generated by Type-3 (regular) grammars. They are contained in the Type-2 context-free languages, which are recognized by pushdown automata. These are contained in the Type-1 context-sensitive languages, recognized by linear bounded automata, which in turn are contained in the Type-0 recursively enumerable languages, recognized by Turing machines. Hence all regular languages are also context-free, context-sensitive, and recursively enumerable. A formal regular expression can be converted to a deterministic or a nondeterministic finite-state machine that represents the same regular language.
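To illustrate that deterministic and nondeterministic machines can represent the same language, here is a rough sketch of the standard subset construction in Python. The example NFA (binary strings ending in "01") and all names are assumptions made for illustration, and ε-moves are left out to keep the sketch short.

```python
# Hypothetical NFA (no epsilon moves) for binary strings that end in "01".
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
NFA_START = "q0"
NFA_ACCEPT = {"q2"}
ALPHABET = "01"


def step(states, ch):
    """All NFA states reachable from any state in `states` on symbol ch."""
    return set().union(*(NFA.get((q, ch), set()) for q in states))


def nfa_accepts(s):
    """Simulate the NFA by tracking the set of currently possible states."""
    current = {NFA_START}
    for ch in s:
        current = step(current, ch)
    return bool(current & NFA_ACCEPT)


def subset_construction():
    """Build a DFA whose states are sets of NFA states."""
    start = frozenset({NFA_START})
    dfa = {}
    seen = {start}
    todo = [start]
    while todo:
        state = todo.pop()
        for ch in ALPHABET:
            target = frozenset(step(state, ch))
            dfa[(state, ch)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    accepting = {state for state in seen if state & NFA_ACCEPT}
    return start, dfa, accepting


def dfa_accepts(s, start, dfa, accepting):
    """Run the constructed DFA over s."""
    state = start
    for ch in s:
        state = dfa[(state, ch)]
    return state in accepting
```

For this particular NFA only three subsets are reachable, so the resulting DFA is small. In general the construction can produce exponentially many states.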

Please do not confuse this with regex. Many regex engines are far more expressive than formal regular expressions and finite-state machines, and can represent non-regular languages.

Construction of a Regular Language

The set of all regular languages over a given alphabet Σ is exactly the set produced by these rules:

  • The empty language {}, which rejects all strings, is regular.
  • The language {ε}, containing only the empty string ε, is regular.
  • Every language {s} containing only a single symbol s ∈ Σ is regular.
  • Every language created by the union, concatenation, or Kleene star of regular languages is regular. Suppose A and B are regular languages:
    • The union (A|B) is also regular. It contains every string that is in A or in B.
    • The concatenation AB is also regular. It contains every string vw where v is in A and w is in B.
    • The Kleene star A* is also regular. It contains every concatenation of any number of strings from A, including zero (which gives the empty string ε).
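The construction rules can be sketched in Python by treating a language as a set of strings. Because A* is usually infinite, the sketch truncates the Kleene star at a maximum string length. That cutoff, along with the function and constant names, is an assumption made only for illustration.

```python
# Base cases of the construction, as Python sets of strings.
EMPTY = set()        # the empty language {}
EPSILON = {""}       # the language containing only the empty string
ZERO = {"0"}         # single-symbol languages over the alphabet {0, 1}
ONE = {"1"}


def union(a, b):
    """Strings that are in A or in B."""
    return a | b


def concat(a, b):
    """Strings vw with v in A and w in B."""
    return {v + w for v in a for w in b}


def star(a, max_len):
    """Strings of A* of length at most max_len (a finite truncation of A*)."""
    result = {""}      # zero copies always gives the empty string
    frontier = {""}
    while frontier:
        # Append one more string from A to everything found in the last round.
        frontier = {
            v + w for v in frontier for w in a if len(v + w) <= max_len
        } - result
        result |= frontier
    return result


# Truncated version of the language 0(0|1)*1: a 0, then at most two
# arbitrary binary symbols, then a 1.
language = concat(concat(ZERO, star(union(ZERO, ONE), 2)), ONE)
```

Here language is a finite slice of the regular language of binary strings beginning with 0 and ending with 1. Raising the cutoff passed to star enlarges the slice.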

Examples and Nonexamples of Regular Languages

  • Given a simple alphabet Σ = {0, 1}, where | represents union and * represents the Kleene star, each of these formal regular expressions represents a regular language:

    • The regular expressions "0", "1", "(0|1)", "01", "11", and "0*" all represent regular languages.
    • The regular expression "(0(0|1)*1)" represents the regular language of all binary strings beginning with 0 and ending with 1.
    • Given a regular expression R, the expressions "R+" and "R?" both represent regular languages, where + means one or more copies and ? means zero or one copy. Namely, "R+" is equivalent to "RR*", and "R?" is equivalent to "(R|ε)".
    • Given a regular expression R, the expression "R{m,n}" represents a regular language for all natural numbers m ≤ n, where {m,n} means "from m copies to n copies". This is because it can be expanded using only union and concatenation: "R{1,3}" expands to "(R|RR|RRR)".
  • Given an alphabet used by regex engines, usually an ASCII or Unicode alphabet containing all ASCII or Unicode characters respectively:

    • The regex /^.+$/ represents a regular language. It contains all non-empty sequences of characters (in most engines, . matches any character except a line break by default).
    • The regex /^#[A-Za-z]{1,3}[0-9]{2,4}$/ represents a regular language, consisting of all strings which begin with a hash sign (#), followed by one to three ASCII letters, followed by two to four decimal digits.
    • The regex /^([\d][\w])*$/ represents a regular language. It consists of all strings which alternate digit characters and word characters, starting with a digit character. The shorthands \d and \w are examples of union: each stands for a union of single characters.
  • Many regex engines are much more expressive than formal regular expressions. Backreferences can cause a regex to represent a non-regular language, which consequently cannot be decided by a finite-state machine.

    • The regex "(.+)\1", when required to match the whole string, represents a non-regular language. Using the backreference \1 to the first capture group (.+), it accepts exactly the strings that consist of some non-empty sequence of characters repeated twice. Such strings are called squares in formal language theory.
      • "ABCABC", "1234.1234." are accepted
      • "ABCAB", "1234567891234567890" are rejected.
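The examples above can be spot-checked with Python's re module. The sketch below picks R = (01) as an arbitrary sample expression, which is an assumption for illustration, and compares patterns only on binary strings up to a fixed length. It is a sanity check, not a proof. It uses re.fullmatch so that the whole string must match.

```python
import re
from itertools import product


def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


def same_language(p1, p2, max_len=8):
    """True if both patterns accept exactly the same short binary strings."""
    return all(
        (re.fullmatch(p1, s) is None) == (re.fullmatch(p2, s) is None)
        for s in binary_strings(max_len)
    )


# R+ versus RR*, R? versus (R|ε), and R{1,3} versus (R|RR|RRR), with R = (01).
plus_ok = same_language(r"(01)+", r"(01)(01)*")
optional_ok = same_language(r"(01)?", r"((01)|)")
range_ok = same_language(r"(01){1,3}", r"((01)|(01)(01)|(01)(01)(01))")

# The backreference pattern for squares; not a formal regular expression.
SQUARE = re.compile(r"(.+)\1")


def is_square(s):
    """True if the whole string is some non-empty string repeated twice."""
    return SQUARE.fullmatch(s) is not None
```

Note that is_square("1234567891234567890") is False only because the whole string must match. An unanchored SQUARE.search on the same input does find a match, since the string contains "123456789" twice in a row.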

914 questions
-3
votes
1 answer

Construct a CFG for Regular Expression

I have a problem with my homework; can someone explain the question below to me? Thank you. Construct a CFG for the RE below: a* b* (a│c)* (a│c)* ba
-3
votes
1 answer

Regular expression for phone number with special characters

I need a regular expression for phone numbers. The phone numbers may contain special characters like +, ., /, -, space, (, ), [, ]. Some Examples: (+91)…
-3
votes
1 answer

Replace every occurrence of the regular expression matching with a particular in VI editor?

suppose I have a text file as follows. create table "kevin".tb1 { col1, col2 } create table "jhone".tb2 { col1, col2 } create table "jake".tb3 { col1, col2 } I need to obtain that text file as follows by replacing every table owner name…
Nwn sn
-3
votes
2 answers

How to get a part of string matching a regular expression (c#)?

I have some strings like " 8 2 / 5", "5 5/ 7" and so on, containing math fractions. I need to convert them into doubles. So, how can I get these parts " 8 " "2 / 5", into variables? var src = " 8 2 / 5"; string base = getBase(src); string frac =…
Tomcat
-3
votes
1 answer

Understand regular expressions?

I am trying to understand regular expressions. I am trying to parse data from XML web service using regular expressions. I need your help to understand few regular expressions. Regular expressions that I need to understand. 1:…
WasimSafdar
-3
votes
2 answers

Python regex to remove tuples from the text

I'm working on the text processing and I need to remove all the tuples from the text, tuples can have arbitrary number of elements (e.g. () or (1,2,3)) ,but the elements will always be integers. Can somebody help me to write regex for this, I'm…
starwarrior8809
-3
votes
2 answers

Specific password regular expression

I am having problems creating a regular expression. It needs to fulfill the following: 1) Has 8-12 characters 2) At least 1 uppercase letter 3) At least 3 lowercase letters 4) At least 1 number 5) At least 1 special character 6) Has to start with a…
-3
votes
2 answers

Regular expression (regex) help: Need to delete everything on several lines after an IP address

I have several lines that are 192.168.86.3 0x1 0x2 3cbbaxrad * br-lan 192.168.86.213 0x1 0x2 3cccfargarad * br-lan 192.168.86.51 0x1 0x2 3cccfcvrad * br-lan 192.168.86.11 0x1 0x2 3cccfxxrad * br-lan I need to extract…
-3
votes
2 answers

Plain english description for this regular expression

What would be a description in words of the general string that this RE accepts? (01(10)*11)* I first thought that it was just any string beginning and ending with 01 and 11 respectively, with any number of alternating 1s and 0s in between, but…
KOB
-3
votes
1 answer

RegEx to remove DOT

I want to remove the dot (.) but I don't know how to write this pattern. I have some text like this: E-1-2-3.1-0-0 or E-1-2-0-2.5-0 or E-1-2-0-3.5-0. But in my text some are numbers like 2.5, 56.7. I don't want to remove these dots as they are…
Hus R Kozk
-3
votes
1 answer

How to write a RegExp to match the Nth character

There is a string as below: 2016,07,20,19,20,25 How can I convert this string to a string in this format: 2016-07-20 19:20:25 Thank you very much!
Jacky Kwan
-3
votes
3 answers

Regex for validating path

How do I write a regex that passes for below conditions in C# \segment\segment\ a) each segment starts and ends with a backslash b) segment can be alpha-numeric with dashes, underscore and period allowed (e.g. \some-name\some.other_name\ ) c) the…
Frank Q.
-3
votes
1 answer

what is the regex for matching only #hashtags?

I am working on a social networking application that has a hashtag feature. I want to match all #tags but not #[[123:hashTag:rameez]]. I know the regex for each separately; how can I do it in a single regex?
Rameez Rami
-3
votes
1 answer

Regex: Find words with multiple periods

I am given a task that will require regex to find a string out of a paragraph. I need to find a string that looks something.like.this but is not limited to looking.something.like.this.also. Using the paragraph above as an example, this expression…
Bernie
-3
votes
1 answer

Pickup sentences between two string by regular expression in iOS

I have the following sentence: distributed over a considerable extent; "far-flung trading operations"; "the West's far-flung mountain ranges"; "widespread nuclear fallout" What I want is to pickup the sentence between "**********"; My regular…
Bagusflyer