You are here
Home > Automata Theory > Every regular expression describes regular language

Every regular expression describes regular language

Regular Expression always derives regular language

Every regular expression describes regular language, let R be an arbitrary regular expression over the alphabet Σ. We will prove that the language described by R is a regular language. The proof is by induction on the structure of R.

The first base case of induction: Assume that R = ε.  The describes the language of {ε}. In order to prove that this language is regular, it suffices, by the theorem which says,

Theorem 1:  Let A be a language. Then A is regular if and only if there exists a nondeterministic finite automaton that accepts A.

thus, let construct the NFA M = (Q, Σ, δ, q, F) that accepts this language. This NFA is obtained by defining Q={q}, q is the start state, F = {q}, and δ(q,a) = ε,  for all a ∈ Σε . The figure below gives the state diagram of M:

Show the start and final state of NFA

The second base case:Assume that R= ε. The describes the language of {ε}. In order to prove that this language is regular, we know , by theorem 1, which state that if language is regular then it should be accepted by NFA.

So, let construct the NFA M = (Q, Σ, δ, q, F) that accepts this language. This NFA is obtained by defining Q={q}, q is the start state, F = θ, means final state not exist, and δ(q,a) = θ,  for all a ∈ Σε . The figure below gives the state diagram of M:

Start state of Non Deterministic Finite Automata

The third base case: Let a ∈ Σ and assume that R = a. The describes the language of {a}. In order to prove that this language is regular, we know , by theorem 1, which state that if language is regular then it should be accepted by NFA.

So, let construct the NFA M = (Q, Σ, δ, q1, F) that accepts this language. This NFA is obtained by defining Q={q1, q2}, q1 is the start state, F = {q2},  and

δ(q1,a) ={q2},

δ(q1,b) = θ for all b ∈ Σε \ {a}

δ(q1,b) = θ for all b ∈ Σε

The figure below gives the state diagram of M:

NFA state diagram with input

The first case of the induction step: Assume that R = R1 ∪ R2, where R1 and R2 are regular expressions. Let L1 and L2 be the languages described by R1 and R2, respectively, and assume that L1 and L2 are regular. Then R describes the language L1 ∪ L2, which, by,

Theorem 2: The set of regular languages is closed under the union operation, i.e., if A1 and A2 are regular languages over the same alphabet Σ, then A1 ∪ A2 is also a regular language.

The second case of the induction step: Assume that R = R1 ∪ R2, where R1 and R2 are regular expressions. Let L1 and L2 be the languages described by R1 and R2, respectively, and assume that L1 and L2 are regular. Then R
describes the language L1 ∪ L2, which, by Theorem 3, is regular.

Theorem 3: The set of regular languages is closed under the concatenation operation, i.e., if A1 and A2  are regular languages over the same alphabet Σ , then A1A2 is also a regular language.

The third case of the induction step: Assume that R = (R1)*, where R1 is a regular expression. Let L1 be the language described by R1 and assume that L1 is regular. Then R describes the language (L1)*, which, by Theorem 4, is regular.

Theorem 4: The set of regular languages is closed under the star (Kleene) operation, i.e., if A is a regular language, then A* is  also a regular language.

This concludes the proof of the claim that every regular expression describes a regular language.

Read: Regular Language in Automata Thoery

Top