The Pumping Lemma for Context-Free Languages

To show that a language is context-free, it is enough to exhibit a context-free grammar (or a pushdown automaton) that describes it. To show that a language is not context-free, a different proof technique is needed, and the one most commonly used is the Pumping Lemma for Context-Free Languages.

Theorem
The pumping lemma for context-free languages states that if a language L is 
context-free, there exists some integer p ≥ 1 (the pumping length) such that every 
string s ∈ L of length |s| ≥ p can be written as s = uvwxy, where 
u, v, w, x and y are substrings of s such that:
    • |vwx| ≤ p
    • |vx| ≥ 1
    • uv^n wx^n y ∈ L for all n ≥ 0
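
The three constraints can be made concrete with a minimal Python sketch. The helper names `is_valid_split`, `pump` and `in_anbn` are hypothetical, and the language { a^n b^n | n ≥ 0 } with p = 2 is chosen here only as an illustration of a context-free language where pumping succeeds; none of this comes from the theorem statement itself.

```python
def is_valid_split(u, v, w, x, y, p):
    """Check the two length constraints: |vwx| <= p and |vx| >= 1."""
    return len(v + w + x) <= p and len(v + x) >= 1


def pump(u, v, w, x, y, n):
    """Build the pumped string u v^n w x^n y."""
    return u + v * n + w + x * n + y


def in_anbn(s):
    """Membership test for the illustrative language { a^n b^n | n >= 0 }."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * half + "b" * half


# Split s = aaabbb around the a/b boundary so v and x grow in lockstep.
u, v, w, x, y = "aa", "a", "", "b", "bb"
assert is_valid_split(u, v, w, x, y, p=2)

pumped = [pump(u, v, w, x, y, n) for n in range(4)]
print(pumped)
print(all(in_anbn(t) for t in pumped))
```

Because v holds one a and x holds one b, every value of n keeps the two counts equal, which is why this particular split stays inside the language.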

All context-free languages are “pumpable”, meaning that the pumping lemma constraints hold for every context-free language. If a language is not pumpable, then it is not a context-free language. The converse fails: a pumpable language is not necessarily context-free. Because the set of regular languages is contained in the set of context-free languages, all regular languages must be pumpable too.

Essentially, the pumping lemma holds that every sufficiently long string in a context-free language L can be pumped (the parts v and x repeated any number of times, or removed) without ever producing a string that is not in L.

To prove that a language L is not context-free, use proof by contradiction and the pumping lemma. Assume that L is context-free, so that it has some pumping length p. Then choose a string s in L with |s| ≥ p and show that every way of writing s = uvwxy that satisfies the first two constraints violates the third for some n. This contradicts the lemma, so L cannot be context-free.

Basically, the idea behind the pumping lemma for context-free languages is that it gives a necessary condition: there are certain constraints a language must satisfy in order to be context-free. If you can show that these constraints fail for a particular language, you have proved by contradiction that the language is not context-free. Showing that they hold, on the other hand, proves nothing.

Example

Use the Pumping Lemma to prove that L = { a^n b^n c^n | n > 0 } is not a context-free language.

Assume, for the sake of contradiction, that L = { a^n b^n c^n | n > 0 } is a context-free
language. By the pumping lemma, there exists an integer pumping length p ≥ 1 for L. 
We need a string s in L whose length is at least p. The string 
s = a^p b^p c^p has length 3p ≥ p, so we choose it as s. This s is in L since 
it has p a's, p b's and p c's, in that order.

Now by the pumping lemma, s = uvwxy with |vwx| ≤ p and |vx| ≥ 1. Because vwx is a 
contiguous substring of length at most p, it cannot contain both an a and a c (the p b's 
lie between them). That leaves five possible places in the string for vwx:
    • vwx = a^j for some 1 ≤ j ≤ p. This means that vwx is contained purely in the a’s section.
    • vwx = a^j b^k for some j, k ≥ 1 where j + k ≤ p. This means that vwx straddles the boundary between the a’s and b’s sections.
    • vwx = b^j for some 1 ≤ j ≤ p. This means that vwx is contained purely in the b’s section.
    • vwx = b^j c^k for some j, k ≥ 1 where j + k ≤ p. This means that vwx straddles the boundary between the b’s and c’s sections.
    • vwx = c^j for some 1 ≤ j ≤ p. This means that vwx is contained purely in the c’s section.

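In each of these five cases, the third constraint of the pumping lemma, that uv^n wx^n y ∈ L for all n ≥ 0, does not hold. Since vwx misses at least one of the three letters, pumping leaves the count of that letter at p, while |vx| ≥ 1 means the count of at least one other letter changes for every n ≠ 1. In other words, for any of these five choices of vwx, the string cannot be pumped in a way that always results in a string with an equal number of a’s, b’s and c’s (the definition of the language L). This contradicts the pumping lemma, so L is not context-free.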
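
The case analysis can be sanity-checked by brute force for small p. The sketch below is an illustration only, not a substitute for the proof: it enumerates every split of a^p b^p c^p that meets the two length constraints and confirms that none of them survives pumping with n = 0 and n = 2. The function names `in_language`, `splits` and `survives_pumping`, and the choice of exponents to test, are assumptions made for this example.

```python
def in_language(s):
    """Membership test for L = { a^n b^n c^n | n > 0 }."""
    n = len(s) // 3
    return n > 0 and s == "a" * n + "b" * n + "c" * n


def splits(s, p):
    """Yield every (u, v, w, x, y) with s = uvwxy, |vwx| <= p and |vx| >= 1."""
    for i in range(len(s) + 1):              # index where v starts
        end = min(i + p, len(s))             # vwx may not extend past here
        for j in range(i, end + 1):          # v = s[i:j]
            for k in range(j, end + 1):      # w = s[j:k]
                for l in range(k, end + 1):  # x = s[k:l]
                    if (j - i) + (l - k) >= 1:
                        yield s[:i], s[i:j], s[j:k], s[k:l], s[l:]


def survives_pumping(s, p, exponents=(0, 2)):
    """True if some valid split keeps all tested pumped strings in L."""
    for u, v, w, x, y in splits(s, p):
        if all(in_language(u + v * n + w + x * n + y) for n in exponents):
            return True
    return False


# Check the chosen string s = a^p b^p c^p for a few small pumping lengths.
for p in range(1, 6):
    s = "a" * p + "b" * p + "c" * p
    assert in_language(s)
    assert not survives_pumping(s, p)
print("no valid split of a^p b^p c^p survives pumping for p = 1..5")
```

A finite check like this only covers the values of p it tries; the argument in the text is what covers every p.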

Let’s take a short example string, a^5 b^5 c^5 = aaaaabbbbbccccc, with p = 3.

In the first case, pumping up (n ≥ 2) produces more a’s than b’s and c’s, so the pumped string is not a member of L. For instance, if v and x together consist of three a’s and we take n = 2, we get the string aaaaaaaabbbbbccccc: a string with 8 a’s, 5 b’s and 5 c’s. Clearly this is not in the language. The third and fifth cases work the same way: pump within the b or c region, respectively, and the results are symmetrical.

The second and fourth cases are similar. If vwx lies in the a and b region only, pumping changes the number of a’s, of b’s, or of both while the number of c’s stays the same (the second case); if it lies in the b and c region only, the number of a’s stays the same while the b’s or c’s change (the fourth case). For the second case, take a^5 b^5 c^5 = aaaaabbbbbccccc and p = 3, let v be the last a and x be the first two b’s, and pump with n = 2. We get this string: aaaaaabbbbbbbccccc, a string with six a’s, seven b’s, and five c’s. If v or x itself contains both a’s and b’s, pumping also puts letters out of order (an a after a b), which again leaves L. The fourth case has a symmetrical example.
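
The worked strings above can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the specific splits (which characters are assigned to u, v, w, x and y) are one possible choice consistent with the text, and the helpers `pump` and `counts` are hypothetical names introduced for illustration.

```python
def pump(u, v, w, x, y, n):
    """Build the pumped string u v^n w x^n y."""
    return u + v * n + w + x * n + y


def counts(s):
    """Return the number of a's, b's and c's in s as a tuple."""
    return s.count("a"), s.count("b"), s.count("c")


# First case: vwx = aaa inside the a's (assumed split: v = aa, w empty, x = a).
case1 = pump("aa", "aa", "", "a", "bbbbbccccc", 2)
print(case1, counts(case1))

# Second case: v is the last a, x is the first two b's, w is empty.
case2 = pump("aaaa", "a", "", "bb", "bbbccccc", 2)
print(case2, counts(case2))

# Second case with v straddling the boundary (v = ab): letters end up out of order.
case2_mixed = pump("aaaa", "ab", "", "", "bbbbccccc", 2)
print(case2_mixed, counts(case2_mixed))
```

In all three runs the number of c's stays at five while at least one other count changes, so none of the pumped strings has equal numbers of a's, b's and c's.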
