Every regular expression describes regular language -

Regular Expression always derives regular language

Every regular expression describes regular language, let R be an arbitrary regular expression over the alphabet Σ. We will prove that the language described by R is a regular language. The proof is by induction on the structure of R.

The first base case of induction: Assume that R = ε. The R describes the language of {ε}. In order to prove that this language is regular, it suffices, by the theorem which says,

Theorem 1: Let A be a language. Then A is regular if and only if there exists a nondeterministic finite automaton that accepts A.

thus, let construct the NFA M = (Q, Σ, δ, q, F) that accepts this language. This NFA is obtained by defining Q={q}, q is the start state, F = {q}, and δ(q,a) = ε, for all a ∈ Σ_ε. The figure below gives the state diagram of M:

The second base case:Assume that R= ε. The R describes the language of {ε}. In order to prove that this language is regular, we know , by theorem 1, which state that if language is regular then it should be accepted by NFA.

So, let construct the NFA M = (Q, Σ, δ, q, F) that accepts this language. This NFA is obtained by defining Q={q}, q is the start state, F = θ, means final state not exist, and δ(q,a) = θ, for all a ∈ Σ_ε. The figure below gives the state diagram of M:

The third base case: Let a ∈ Σ and assume that R = a. The R describes the language of {a}. In order to prove that this language is regular, we know , by theorem 1, which state that if language is regular then it should be accepted by NFA.

So, let construct the NFA M = (Q, Σ, δ, q₁, F) that accepts this language. This NFA is obtained by defining Q={q₁, q₂}, q₁ is the start state, F = {q₂}, and

δ(q₁,a) ={q₂},

δ(q₁,b) = θ for all b ∈ Σ_ε\ {a}

δ(q₁,b) = θ for all b ∈ Σ_ε

The figure below gives the state diagram of M:

The first case of the induction step: Assume that R = R1 ∪ R2, where R1 and R2 are regular expressions. Let L1 and L2 be the languages described by R1 and R2, respectively, and assume that L1 and L2 are regular. Then R describes the language L1 ∪ L2, which, by,

Theorem 2: The set of regular languages is closed under the union operation, i.e., if A1 and A2 are regular languages over the same alphabet Σ, then A1 ∪ A2 is also a regular language.

The second case of the induction step: Assume that R = R1 ∪ R2, where R1 and R2 are regular expressions. Let L1 and L2 be the languages described by R1 and R2, respectively, and assume that L1 and L2 are regular. Then R
describes the language L1 ∪ L2, which, by Theorem 3, is regular.

Theorem 3: The set of regular languages is closed under the concatenation operation, i.e., if A1 and A2 are regular languages over the same alphabet Σ , then A1A2 is also a regular language.

The third case of the induction step: Assume that R = (R1)*, where R1 is a regular expression. Let L1 be the language described by R1 and assume that L1 is regular. Then R describes the language (L1)*, which, by Theorem 4, is regular.

Theorem 4: The set of regular languages is closed under the star (Kleene) operation, i.e., if A is a regular language, then A* is also a regular language.

This concludes the proof of the claim that every regular expression describes a regular language.

Read: Regular Language in Automata Thoery

Share this: