Dec 3, 2013

Machine-Based Approach to Pumping

(January 1, 2014: The statement of "Harrison's theorem" in the original post was in error. The error is now corrected. See here.)

(December 4, 2013: Scroll to the end of the post for an update.)

When I was a first-year Ph.D. student at Stanford, I took an automata theory course offered by the computer science department, taught by the late Robert Floyd. The course used a draft version of a textbook he and Richard Beigel were writing, later published as The Language of Machines in 1994. The book, presumably influenced by an old paper by Dana Scott, adopted a rather unconventional approach to automata theory. A clear separation between machines and programs and abstract notions of machines were two of such unconventional features. I quite liked the book, but it would probably be good to have another textbook with more standard notations and terminology, such as Michael Sipser's.

Anyway, one of the interesting things about the draft of Floyd and Beigel's book was that they gave a proof of the pumping lemma for context-free languages using pushdown automata, rather than context-free grammars. I had already seen the standard, grammar-based proof, but I was fascinated by the PDA-based proof in the book. In the grammar-based proof, you look at a long path in a derivation tree that has two occurrences of the same nonterminal on it. In the machine-based proof, you look at a computation in which the stack exceeds a certain height. With the definition of the PDA adopted in the book, where you can push a symbol on the stack regardless of the top symbol of the stack, the height to be exceeded is simply \(|Q|^2\), where \(Q\) is the state set of the PDA.

A tricky thing about this proof is that it won't do to just take a very long string, because a PDA can run indefinitely long without the stack ever exceeding a given height. But this is OK because the set of strings that a PDA accepts without the stack height ever exceeding \(h\) is regular, no matter what \(h\) is. In the margin next to the proof in Floyd and Beigel's draft, there was a note that said, "A hard exercise for a machine-based proof." Alas, the proof was replaced by the standard, grammar-based proof in the published version of the book.

I was always fond of this machine-based proof, and I even mentioned it when I presented my paper on the pumping lemma for well-nested multiple context-free languages (PDF), first in Haifa in 2008, and then in Stuttgart in 2009. Unfortunately, I didn't keep the draft version of the book, but I'm fairly confident my reconstruction of the proof is the same as the one Floyd and Beigel had in the draft.

Recently, I had a chance to think about this proof again, when I was writing my just-finished paper for the 19th Amsterdam Colloquium. The proof of one direction of the main result of this paper uses a pumping lemma for deterministic context-free languages from Michael Harrison's book (Theorems 11.8.2 and 11.8.3). (It seems the result first appeared in Havel and Harrison's 1974 paper.)

Harrison proves his pumping lemma by analyzing derivation trees of strict deterministic grammars. I confess to not having read his proof, but it seems to me that it's much easier to prove it using deterministic PDA. In the case of deterministic PDA, you can't assume the stack is emptied before acceptance, so the bound to be exceeded in the proof is not about the stack height, but the decrease in stack height. Also, if you work with a model of PDA where the stack action is always sensitive to the top symbol of the stack, the bound becomes \((|Q|\cdot|\Gamma|)^2\), rather than \(|Q|^2\), where \(\Gamma\) is the stack alphabet. Other than that, the idea is essentially the same, and you can get a strong pumping property of the DCFLs easily.

I actually wrote up this machine-based proof of Harrison's pumping lemma for my paper, but I had to cut it because the paper otherwise didn't fit within the 8-page limit.

The thing is I've never seen the machine-based proof of the pumping lemma for CFL or DCFL anywhere else, and I don't know if it ever appeared in print. I noticed that Havel and Harrison refer to this 1969 paper by William Ogden. This paper is mostly about stack languages, and the author doesn't say much about PDA in it (no explicit bound to be exceeded is mentioned), but his 1968 dissertation is titled "Intercalation theorems for pushdown store and stack languages", so it probably has a more detailed treatment there. I can't find Ogden's dissertation anywhere on the web, but I hope to obtain a copy at some point. (Incidentally, Ogden's Ph.D. advisor was Dana Scott, and I think Robert Floyd became a professor at Stanford in 1969, so there's the connection.)

(Note: The proof of the pumping lemma for linear indexed languages uses a similar argument. See this paper by Shamir or here (Lecture 4).)

You can stop reading now (if you haven't already), but if you're interested, here's my write-up of the machine-based proof of (a corollary of) Harrison's theorem. (Thanks to MathJax, it's now fairly easy to copy-and-paste from a LaTeX source.)

(January 1, 2014: The statement of the following theorem in the original post was in error.)

Theorem (Harrison). Let $L \subseteq \Sigma^{\ast}$ be a DCFL. There exists a regular set $R \subseteq L$ satisfying the following property: for every $w \in L-R$, there exist $x_1,x_2,x_3,x_4,x_5$ such that
  1. $w = x_1 x_2 x_3 x_4 x_5$;
  2. $x_2 x_4 \neq \epsilon$;
  3. for every $z \in \Sigma^{\ast}$ and $n \in \mathbb{N}$, $x_1 x_2 x_3 x_4 z \in L$ if and only if $x_1 x_2^n x_3 x_4^n z \in L$.
In the following proof, we write $(q,\gamma) \stackrel{a}{\Longrightarrow}_1 (q',\gamma')$ for $(a,q,\gamma) \vdash_M (\epsilon,q',\gamma')$, and $(q,\gamma) \stackrel{w}{\Longrightarrow} (q',\gamma')$ for $(w,q,\gamma) \vdash_M^{\ast} (\epsilon,q',\gamma')$. Note that $(q,\gamma) \stackrel{w}{\Longrightarrow} (q',\gamma') \stackrel{w'}{\Longrightarrow} (q'',\gamma'')$ implies $(q,\gamma) \stackrel{ww'}{\Longrightarrow} (q'',\gamma'')$.

Let $M = (Q,\Sigma,\Gamma,\delta,q_0,Z_0,F)$ be a deterministic PDA that recognizes $L$. We may suppose without loss of generality that that the computation of $M$ never blocks or enters into an infinite loop, and $(p,\gamma) \in \delta(q,a,Y)$ implies $\gamma \in \{ \epsilon \} \cup \Gamma \cup \Gamma Y$. Let $h = (|Q| \cdot |\Gamma|)^2$. We divide $L$ into two sets $L_0, L_1$, where $L_0$ is a regular set, as follows.
\begin{gather*} L_1 = \{\, w \in L \mid w = w_1 w_2 w_3, (q_0,Z_0) \stackrel{w_1}{\Longrightarrow} (q,\alpha\beta) \stackrel{w_2}{\Longrightarrow} (r,\beta), |\alpha| = h, |\beta| \geq 1 \,\},\\ L_0 = L - L_1. \end{gather*} The reason that $L_0$ is regular is that if the stack height never shrinks by more than $h$ symbols, then all but the top $h$ symbols on the stack can be discarded, and the resulting bounded-height stack can be simulated by the finite control.

Suppose $w \in L_1$. Let $m$ be the number of steps that $M$ takes until it accepts $w$. Let $\gamma_0 = Z_0$, the initial stack content, and for each $i=1,\dots,m$, let $q_i$ and $\gamma_i$ be the state and the stack content of $M$ after the $i$th step. We can write $w = a_1 \dots a_m$, with $a_i \in \Sigma \cup \{ \epsilon \}$, so that
\[ (q_{i-1},\gamma_{i-1}) \stackrel{a_i}{\Longrightarrow}_1 (q_i,\gamma_i). \] Since $w \in L_1$, we must have $\gamma_k = \alpha \gamma_l$ and $\gamma_l \neq \epsilon$ for some $k < l \leq m$ and $\alpha \in \Gamma^h$. We may suppose $l$ to be the least number with this property. This implies $|\gamma_i| \leq |\gamma_k|$ for $i=0,\dots,l$. Now for each $i \in [0,h]$, we mark the last moment up to $k$ as well as the first moment beyond $k$ when the stack has height $|\gamma_l|+i$. More precisely, for each $i \in [0,h]$, let $f(i)$ be the greatest number in $[0,k]$ such that $|\gamma_{f(i)}| = |\gamma_l|+i$, and $g(i)$ be the least number in $[k,l]$ such that $|\gamma_{g(i)}| = |\gamma_l|+i$. Because of the assumption about $\delta$, the stack height changes by at most 1 at each step, and $f(i)$ and $g(i)$ are always defined. Because of the way $f$ and $g$ are defined, $f(i) < j < g(i)$ implies $|\gamma_j| > |\gamma_{f(i)}| = |\gamma_{g(i)}| = |\gamma_l| + i$. (Update April 14, 2016: Thanks to Johannes Osterholzer for spotting an error here in the original post, which is now corrected.) For $i \in [0,l]$, let $Z_i$ be the top symbol of $\gamma_i$. Since $h = (|Q| \cdot |\Gamma|)^2$, there must be $i,j \in [0,h]$ such that $i < j$ and \[ (q_{f(i)},Z_{f(i)}) = (q_{f(j)},Z_{f(j)}), \quad (q_{g(j)},Z_{g(j)}) = (q_{g(i)},Z_{g(i)}). \] This implies \[ \gamma_{f(i)} = \gamma_{g(i)},\quad \gamma_{f(j)} = \gamma_{g(j)} = \delta \gamma_{f(i)} \] for some $\delta \in \Gamma^+$. Let \[ x_1 = a_1 \dots a_{f(i)},\quad x_2 = a_{f(i)+1} \dots a_{f(j)},\quad x_3 = a_{f(j)+1} \dots a_{g(j)},\quad x_4 = a_{g(j)+1} \dots a_{g(i)}. \] Then we have \[ (q_{f(i)},\gamma_{f(i)}) \stackrel{x_2^n}{\Longrightarrow} (q_{f(i)},\delta^n\gamma_{f(i)}) \stackrel{x_3}{\Longrightarrow} (q_{g(j)},\delta^n\gamma_{f(i)}) \stackrel{x_4^n}{\Longrightarrow} (q_{g(j)},\gamma_{f(i)}) \] for all $n \geq 0$. Since $M$ never enters into an infinite loop, we must have $x_2 \neq \epsilon$. Let $w'$ be such that $w = a_1 \dots a_l w'$ and let $x_5 = a_{g(i)+1} \dots a_l w'$. It is now easy to verify the conditions 1–3 of the theorem.


Update (December 4, 2013). It has occurred to me that the difficulty with the machine-based approach lies in the proof of Ogden's lemma (what Harrison calls the "iteration theorem"). What I called "Harrison's theorem" above is really a weakened form corresponding to the usual pumping lemma. This paper by King says "Ogden [in his Ph.D. thesis] gave an iteration theorem for the family of deterministic context-free languages" and Harrison and Havel gave "a new proof of Ogden’s result for […] deterministic context-free languages." But this 1985 technical report by Harrison says:

"The key to giving a rigorous proof for deterministic iteration theorems is to use trees, not machines. This was a problem in [Ogden's Ph.D. thesis] where a deterministic iteration theorem was first expressed but not proved rigorously. It turns out to be easy to do with strict deterministic grammars." (page 9)
Problem 5 in Section 6.2 of Harrison's book asks for "a pda proof of the iteration theorem". Intriguingly, the next problem (Problem 6) is posed as a question "Using the result of Problem 5, can you find an iteration theorem for deterministic context-free languages?"

Correction (January 1, 2014). The statement of "Harrison's theorem" in the original post was in error and read as follows:

Theorem (Harrison). Let $L \subseteq \Sigma^{\ast}$ be a DCFL. There exists a positive integer $p$ satisfying the following property: for every $w \in L$ with $|w| \geq p$, there exist $x_1,x_2,x_3,x_4,x_5$ such that

  1. $w = x_1 x_2 x_3 x_4 x_5$;
  2. $x_2 x_4 \neq \epsilon$;
  3. for every $z \in \Sigma^{\ast}$ and $n \in \mathbb{N}$, $x_1 x_2 x_3 x_4 z \in L$ if and only if $x_1 x_2^n x_3 x_4^n z \in L$.
This is not even true; $L = \{\, w \in \{ a,b \}^{\ast} \mid \#_a(w) \geq \#_b(w) \,\}$ and $w = a^p$ constitute a counterexample. It seems I somehow managed to mix up the two versions of Harrison's iteration theorems that apply to two different classes of languages. The main result of my Amsterdam Colloquium paper is not affected, since it suffices to be able to find one $w \in L$ that satisfies the conditions of the theorem.

1 comment :

  1. Hello, this is fastidious post I actually loved reading this.Grammarly

    ReplyDelete