Dec 20, 2010

Incompleteness via Tarski's Theorem

Many textbooks, including the one I'm using, present an "easy" proof of Gödel's First Incompleteness Theorem via Tarski's Theorem on undefinability of truth, before presenting a proof more in line with Gödel's original proof. Tarski's Theorem (or a version thereof) states that the set TR of Gödel numbers of sentences of first-order arithmetic that are true in the standard model is not "arithmetical", i.e., is not definable by any formula of first-order arithmetic. In contrast, the set PV of Gödel numbers of sentences provable in any recursively axiomatized theory is arithmetical and has a Σ1 defining formula. Therefore, this proof goes, the two sets can't be equal, and there must exist a true sentence that is not provable, unless the theory proves some false sentence.

In my logic class, I thought of mentioning the proof via Tarski's Theorem before going into any technicalities about arithmetization. I was going to actually prove Tarski's Theorem and only state that PV is arithmetical, leaving the proof of the latter fact for the next few lectures.

But then I realized the problem with this approach — Tarski's Theorem is not that easy to prove. The proof given in Keisler et al.'s textbook uses (a weak version of) the Fixed Point Theorem, which depends on the fact that the "diagonal function" is arithmetical, which in turn relies on the fact that all computable functions are arithmetical, and ultimately rests on the existence of the Gödel beta function. In other words, this proof involves almost all machinery used in Gödel's original proof.

It is possible to prove the arithmetical definability (or representability) of the diagonal function (or some such devices needed to prove Tarski's Theorem) without using the beta function, as is done in Boolos and Jeffrey's book (at least until the third edition), but this depends on the details of the Gödel numbering of formulas, and does not work with the method used in Keisler et al.'s book.

So, rather than prove Tarski's Theorem, what I did in my class was to prove the arithmetical undefinability of the "satisfaction relation"

SAT = { (#(A(x)), n) | A(n) is true },

which holds of all pairs whose first component is the Gödel number of a formula and whose second component is a number that the formula is true of in the standard model. (Here, A(n) is the result of substituting for x the numeral representing the natural number n.) This has an easy three-line proof, does not require the diagonal function on natural numbers, and, as noted here, makes no assumption on the nature of the Gödel numbering (except injectivity) or on the vocabulary from which formulas are built.
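The three-line proof can be sketched as follows (the notation is mine: S(x,y) is a hypothetical formula defining SAT, and underlining marks numerals):

```latex
% Suppose, for contradiction, that S(x,y) defines SAT.
% Let B(x) be the formula \neg S(x,x), and let k = \#(B(x)). Then
\begin{align*}
B(\underline{k}) \text{ is true}
  &\iff \neg S(\underline{k},\underline{k}) \text{ is true}
     && \text{(definition of } B\text{)} \\
  &\iff (k,k) \notin \mathrm{SAT}
     && \text{(} S \text{ defines SAT)} \\
  &\iff B(\underline{k}) \text{ is not true}
     && \text{(definition of SAT, since } k = \#(B(x))\text{)},
\end{align*}
% which is a contradiction; so SAT is not arithmetical.
```

Note that the diagonalization here happens at the level of formulas, not of an arithmetized diagonal function, which is why no beta-function machinery is needed.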

To prove the incompleteness using the undefinability of SAT, all you need is to prove the arithmetical definability of the "provable satisfaction" relation

PV2 = { (#(A(x)), n) | A(n) is provable },

which, of course, hardly requires any more work than showing the arithmetical definability of the set PV of Gödel numbers of provable sentences. If no false sentence is provable, then PV2 must be a proper subset of SAT, and it follows that there is a true sentence that is not provable.
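Spelled out, the final step runs as follows (assuming the theory proves no false sentence):

```latex
\begin{align*}
&\mathrm{PV}_2 \subseteq \mathrm{SAT}
  && \text{(no false sentence is provable)}\\
&\mathrm{PV}_2 \neq \mathrm{SAT}
  && \text{(PV}_2 \text{ is arithmetical, SAT is not)}\\
&\text{so some } (\#(A(x)), n) \in \mathrm{SAT} \setminus \mathrm{PV}_2 :\quad
  A(\underline{n}) \text{ is true but not provable.}
\end{align*}
```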

This is of course a rather trivial point, but I find it a little curious that the undefinability of SAT does not seem to be mentioned very often, not even in a book like Smullyan's Gödel's Incompleteness Theorems.

May 22, 2010

Labeled Bracketings and the Chomsky-Schützenberger Theorem

I always thought that the Chomsky-Schützenberger Theorem is an easy consequence of the fact that for every CFG G, the strings generated by G are precisely the yields of the parse trees of G. Until recently, I did not have a good understanding of just how easy a consequence it is.

The theorem is usually stated as follows:

Theorem (Chomsky-Schützenberger). For every context-free language L, there exist a positive integer n, a homomorphism h, and a regular set R such that

L = h(Dn ∩ R),

where Dn is the Dyck language over the set Δn of n pairs of parentheses.

In fact, the theorem can be strengthened to require that the set R be local in the sense that there are sets J1, J2 ⊆ Δn and I ⊆ Δn2 such that

R = (J1 Δn* ∩ Δn* J2) − Δn* (Δn2 − I) Δn*.

This strengthening is often not mentioned in textbooks; it is explicitly stated in Chomsky's (1963) "Formal properties of grammars". Another fact mentioned by Chomsky (1963) is that Dn may be replaced with the set D′n of Dyck primes, which are elements of Dn whose first and last symbols form a matching pair of parentheses. (Also, the homomorphism h is in fact alphabetic in the sense that it maps each symbol in Δn either to a symbol or to the empty string, but this is always clear from the proof.)
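In concrete terms, membership in Dn and the Dyck-prime condition can be sketched in Python (the encoding of the brackets of Δn as (index, kind) pairs is mine):

```python
def is_dyck(w, n):
    """Is w (a sequence of (i, kind) tokens, 0 <= i < n, kind in '[]') in D_n?"""
    stack = []
    for i, kind in w:
        if not 0 <= i < n:
            return False              # not a symbol of Delta_n
        if kind == '[':
            stack.append(i)
        elif not stack or stack.pop() != i:
            return False              # unmatched or mismatched closing bracket
    return not stack                  # everything opened has been closed

def is_dyck_prime(w, n):
    """Prime: in D_n, nonempty, and the first bracket matches the last one."""
    if not w or not is_dyck(w, n):
        return False
    depth = 0
    for i, kind in w[:-1]:
        depth += 1 if kind == '[' else -1
        if depth == 0:                # first bracket closed before the end
            return False
    return True
```

For example, with one pair of parentheses, [0 ]0 [0 ]0 is in D1 but is not prime, while [0 [1 ]1 ]0 is a prime of D2.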

I had thought that the set Dn ∩ R in the statement of the theorem can just be taken to be an alternative representation of the set PT(G) of parse trees of any CFG G for L. This is essentially correct, but not literally true. Chomsky expressed the qualification as restrictions on the rules of G (see below).

Proofs of the theorem given in some textbooks (e.g., Salomaa 1972 and Berstel 1979) fail to make any connection between Dn ∩ R and PT(G), thus bypassing a main concern of Chomsky's. The proof in Kozen's (1997) Automata and Computability—perhaps the clearest textbook treatment of the theorem—makes a clear connection between the two sets without explicitly mentioning it. (Another easy line of approach to the theorem, via pushdown automata (e.g., Harrison 1978), only makes an indirect connection between the two sets, through PDA computations.)

Here's how I came to view the connection. Recall that a set L of trees over some finite alphabet Γ of labels is local if there are a set J ⊆ Γ and a finite set I ⊆ Γ+ such that any tree T is in L if and only if it satisfies the following conditions:

  • the label of the root is in J, and
  • for every node, the string obtained by concatenating its label with the labels of its children is in I.
As is well-known, the set PT(G) is local for any CFG G.

Let τ be the obvious translation that maps a tree to its labeled bracketing representation—that is to say, if T is a tree with root label σ and immediate subtrees T1, …, Tn, then τ(T) = [σ τ(T1) … τ(Tn) ]σ. If T is a tree over Γ with |Γ| = n, then τ(T) is an element of Dn, where each pair of parentheses in Δn corresponds to a symbol in Γ. Let us call a set L of trees over Γ super-local if τ(L) = D′n ∩ R for some local set R ⊆ Δn*. It is clear that a set L of trees is super-local if there are sets J1, J2 ⊆ Γ and I1, I2, I3 ⊆ Γ2 such that for every tree T, T is in L if and only if it satisfies the following conditions:

  • the label of the root is in J1,
  • the label of every leaf is in J2,
  • if a node labeled by Y is the first child of a node labeled by X, then XY ∈ I1,
  • if a node labeled by Y is the last child of a node labeled by X, then YX ∈ I2, and
  • if a node labeled by X and a node labeled by Y are adjacent siblings (in this order), then XY ∈ I3.
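A small Python sketch may help fix ideas (the encoding is mine: trees are (label, children) pairs, labels are single characters, and the two-symbol strings of I1, I2, I3 are written as concatenated labels):

```python
def tau(t):
    """Labeled bracketing of a tree t = (label, children), as a symbol list."""
    label, children = t
    return [f'[{label}'] + [s for c in children for s in tau(c)] + [f']{label}']

def in_super_local_set(t, J1, J2, I1, I2, I3):
    """Do t and all its subtrees satisfy the five super-locality conditions?"""
    def ok(node, is_root):
        label, children = node
        if is_root and label not in J1:
            return False                      # root label must lie in J1
        if not children:
            return label in J2                # leaf labels must lie in J2
        labels = [c[0] for c in children]
        return (label + labels[0] in I1       # parent + first child
                and labels[-1] + label in I2  # last child + parent
                and all(x + y in I3 for x, y in zip(labels, labels[1:]))
                and all(ok(c, False) for c in children))
    return ok(t, True)
```

For instance, the one-level tree S(A, B) lies in the super-local set determined by J1 = {S}, J2 = {A, B}, I1 = {SA}, I2 = {BS}, I3 = {AB}.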

Clearly, any super-local set of trees of bounded out-degree must be local, but it is easy to see that there are local sets, including sets of the form PT(G), that are not super-local. The set of "rule trees" of a CFG, where a rule, rather than a nonterminal, labels each node, need not be super-local either.

The strengthened form of the Chomsky-Schützenberger Theorem is an immediate consequence of the following easy lemma:

Lemma. If L is a local set of trees over Γ, then there exist a super-local set L′ over Γ′ and a relabeling g: Γ′ → Γ such that g maps L′ onto L.

The idea is to add to the label of each node information about the labels of all its siblings. This construction can be seen to underlie Kozen's (1997) proof of the Chomsky-Schützenberger Theorem. While a "rule tree" of a CFG labels each node by the rule expanding it, the tree used by Kozen puts information about the rule on the children of the node expanded by the rule. It is in fact sufficient for each child to be labeled by the right-hand side of the rule expanding its parent (plus its position among its siblings).

Proof. Let J ⊆ Γ, I ⊆ Γ+ be the finite sets witnessing the locality of L. Let

Γ′ = { [X]1 | X ∈ J } ∪ { [α]i | X ∈ Γ, Xα ∈ I, 1 ≤ i ≤ |α| },

and set

J1 = { [X]1 | X ∈ J },
J2 = { [X1…Xl]i | Xi ∈ I },
I1 = { [X1…Xl]i [Y1…Ym]1 | Xi Y1…Ym ∈ I, m ≥ 1 },
I2 = { [Y1…Ym]m [X1…Xl]i | Xi Y1…Ym ∈ I, m ≥ 1 },
I3 = { [X1…Xl]i [X1…Xl]i+1 | 1 ≤ i < l }.

Let L′ be the super-local set determined by J1, J2, I1, I2, I3, and let

g([X1…Xl]i) = Xi.

Now it is clear that g gives a one-one correspondence between the trees in L′ and the trees in L.

Note that since the conditions given by I1, I2, I3 in the above proof are not independent, the definition of either I1 or I2 (but not both) may be changed to drop the part that says Xi Y1…Ym ∈ I, without thereby affecting the resulting super-local set L′. If we wish to have τ(L′) = Dn ∩ R, rather than τ(L′) = D′n ∩ R, that's possible too by changing the label of the root of every tree to a special symbol that only appears at the root.
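The construction in the proof can be transcribed directly. Here is a sketch in Python (my own encoding: symbols are single characters, a string in I is the parent's label followed by its children's labels, and the pair (α, i) stands for the new label [α]i):

```python
def superlocalize(J, I):
    """Build the super-locality witnesses J1, J2, I1, I2, I3 from the
    locality witnesses J (allowed root labels) and I (strings consisting
    of a node's label followed by its children's labels)."""
    alphas = {w[1:] for w in I if len(w) > 1}               # sibling strings
    labels = {(a, i) for a in alphas for i in range(1, len(a) + 1)}
    labels |= {(X, 1) for X in J}                           # root labels [X]_1
    J1 = {(X, 1) for X in J}
    J2 = {(a, i) for (a, i) in labels if a[i - 1] in I}     # possible leaves
    I1 = {((a, i), (b, 1)) for (a, i) in labels for b in alphas
          if a[i - 1] + b in I}                             # parent, first child
    I2 = {((b, len(b)), (a, i)) for (a, i) in labels for b in alphas
          if a[i - 1] + b in I}                             # last child, parent
    I3 = {((a, i), (a, i + 1)) for (a, i) in labels if i < len(a)}
    return J1, J2, I1, I2, I3
```

For the local set with J = {S} and I = {SAB, A, B} (a root S with children AB, where A and B are leaves), the construction yields exactly the five sets one would write down by hand.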

In the above proof, each node label in a tree from L is changed to the sequence of original labels of all its siblings, including itself, plus its position among its siblings. Clearly, there are more economical ways of adding information to each node label to create a super-local set. For instance, we can use labels of the form Xi…Xm, consisting of the node's original label followed by the original labels of all its right siblings.

Let us call a tree language super-regular if it can be expressed in the form τ−1(D′m ∩ R) with R regular. Every super-regular tree language is regular, since it can be recognized by a "left-to-right" one-visit tree-walking automaton (cf. Okhotin, Salomaa, and Domaratzki 2002). The converse is of course not true, and the super-regular tree languages don't even contain all the local tree languages. Here is an example of a Chomsky normal form CFG G such that PT(G) is not super-regular. This shows that the set Dn ∩ R in the Chomsky-Schützenberger Theorem, even in its weak form, cannot in general be taken to be τ(PT(G)), unless the grammar G satisfies some special conditions.

S → A S
S → S B
S → a
A → a
B → a
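Reading the rules as S → A S, S → S B, S → a, A → a, B → a (the reading consistent with the membership condition below), the labeled bracketings of parse trees can be enumerated with a small Python sketch (my own encoding: a bracket pair for X is written as the two-character substrings [X and ]X):

```python
from itertools import product

RULES = {'S': [('A', 'S'), ('S', 'B'), ('a',)], 'A': [('a',)], 'B': [('a',)]}

def bracketings(symbol, height):
    """Labeled bracketings of parse trees rooted at symbol, of height <= height."""
    if symbol == 'a':                      # terminal leaf: [a ]a
        return {'[a]a'}
    if height == 0:
        return set()
    out = set()
    for rhs in RULES[symbol]:
        for parts in product(*(bracketings(s, height - 1) for s in rhs)):
            out.add(f'[{symbol}' + ''.join(parts) + f']{symbol}')
    return out
```

For instance, bracketings('S', 2) contains [S[A[a]a]A[S[a]a]S]S (rule S → A S at the root) and [S[S[a]a]S[B[a]a]B]S (rule S → S B at the root).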

That the set of parse trees of this grammar is not super-regular can be seen as follows. Consider strings of the form

(*) [S x1 … [S xk [S [a ]a ]S yk ]S … y1 ]S, with xi ∈ { [A [a ]a ]A, ε } and yi ∈ { [B [a ]a ]B, ε } (i = 1, …, k).

Any string of the form (*) is in τ(PT(G)) if and only if for all i = 1, …, k, either

xi = [A [a ]a ]A and yi = ε

or

xi = ε and yi = [B [a ]a ]B.

This means that the 2^k strings of the form

(**) [S x1 … [S xk, with xi ∈ { [A [a ]a ]A, ε } (i = 1, …, k)

are such that no two of them stand in the Myhill-Nerode right-congruence relation ≡τ(PT(G)) associated with τ(PT(G)).

Now let D′4 be the set of Dyck primes over { [S,]S,[A,]A,[B,]B,[a,]a }, and R be any regular set of strings. Clearly, all strings of the form (**) stand in the relation ≡D′4 to each other. Since R is regular, ≡R has only finitely many equivalence classes. So if we choose large enough k, there are distinct strings w1, w2 of the form (**) that stand in the relation ≡R. Since for any string languages L1 and L2, ≡L1∩L2 is a coarser equivalence relation than ≡L1 ∩ ≡L2, the strings w1, w2 must stand in the relation ≡D′4∩R. Therefore, we cannot have D′4 ∩ R = τ(PT(G)) for any regular set R. Since τ(PT(G)) = D′4 ∩ R is implied by τ(PT(G)) = D4 ∩ R, we cannot have the latter, either, for any regular R.

Update January 18, 2012: A similar proof appears in Libkin 2006.

When is PT(G) super-regular? Chomsky's (1963) proposed sufficient condition was that G be a modified normal grammar in the sense that G is in Chomsky normal form and contains no pair of rules

A → B C
D → C E

for any nonterminals A, B, C, D, E. However, Chomsky's proof that (in our words) PT(G) is super-regular for any modified normal grammar G does not work. Adding a further condition that says that if

A → B C
A → D E
F → B E

are rules of the grammar, then so is

A → B E

guarantees that PT(G) is super-local. I have so far been unable to tell whether there is any modified normal grammar G for which PT(G) is not super-regular.

Mar 26, 2010

Mathematical Logic and Computability

I'll be teaching logic to graduate students in philosophy this coming semester. The textbook I'll use is H. Jerome Keisler et al.'s (1996) Mathematical Logic and Computability. Despite its many typos, I like this book. It uses tableaus for proofs and register machines for computation. It contains a detailed treatment of Gödel's Incompleteness Theorems. The book's level is definitely better suited to my purposes than that of Boolos and Jeffrey's Computability and Logic. I'll see how much of the book I can cover in two semesters.

The book is out of print, but I found a PDF version at the following URL:

http://www.cs.mum.edu/courses/corazza/LogicAndComputability.pdf

This version is different from the older version that shows up in Google Scholar. It's not exactly the same as the printed book, but it's very close. (The title page of the PDF version lists Paul Corazza, H. Jerome Keisler, Kenneth Kunen, Terrence Millar, Arnold Miller, and Joel Robbin as the authors.)

The review by Leon Harkleroad states that the definition of a recursively enumerable set in an exercise in Chapter 5 is incorrect. I see nothing wrong with the definition, which reads:

A subset A ⊆ N is recursively enumerable (or r.e.) if A = ∅ or A is the range of a total computable function.
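The definition is indeed the standard one: any nonempty set with a semi-decision procedure is the range of a total computable function, by the usual dovetailing-and-padding trick. A Python sketch (the function names and the example set are mine):

```python
def decode(n):
    """Inverse of the Cantor pairing function: n -> (x, s)."""
    w = int(((8 * n + 1) ** 0.5 - 1) / 2)
    t = w * (w + 1) // 2
    x = n - t
    return x, w - x

def total_enumerator(semi, default):
    """Given a step-bounded semi-decision procedure semi(x, s) for a set A,
    with default a known element of A, return a total function f whose
    range is exactly A."""
    def f(n):
        x, s = decode(n)
        return x if semi(x, s) else default   # pad with a known element of A
    return f

# Toy example: A = the set of perfect squares, where "running for s steps"
# means trying the candidates 0, 1, ..., s as square roots.
is_square_within = lambda x, s: any(i * i == x for i in range(s + 1))
f = total_enumerator(is_square_within, 0)
```

The point is that f is total regardless of whether the semi-decision procedure ever answers, yet every element of A eventually appears as some f(n).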

The book comes with a software package designed for DOS and Windows. In the words of the authors, "The programs are copyrighted by the authors of this book and are part of the public domain." I can't find the software anywhere on the web, but fortunately I still keep the diskette that came with my copy of the book. I was able to run it on my Mac using CrossOver Mac. With WineBottler I had less success.

Mar 2, 2010

Kit Fine's situation semantics

Kit Fine gave an invited talk at the Third Interdisciplinary Ontology Conference on Sunday, titled "A State-Based Approach to the Frame Problem". He gives a kind of situation semantics to the classical propositional language, and then extends it to a hybrid modal language for reasoning about change.

A model for the propositional language is an ordered triple (S,≤,[ ]), where (S,≤) is a partially ordered set of states (his term for partial situations), subject to certain conditions, and [ ] is a valuation function assigning each propositional variable a pair ([p]+,[p]-) of sets of verifying and falsifying states, subject to certain natural conditions. The recursive clauses for ∧ and ∨ are interesting:

s verifies B ∧ C if for some t and u, t verifies B, u verifies C, and s = t+u (the l.u.b. of {t,u}).
s falsifies B ∧ C if s falsifies B or s falsifies C or for some t and u, t falsifies B, u falsifies C, and s = t+u.

s verifies B ∨ C if s verifies B or s verifies C or for some t and u, t verifies B, u verifies C, and s = t+u.
s falsifies B ∨ C if for some t and u, t falsifies B, u falsifies C, and s = t+u.

Let (S,≤) be the upper semi-lattice freely generated by three states s1, s2, s3. Then (S,≤) is a state space according to Fine's definition. Let p1, p2, p3 be propositional variables, and for each i = 1,2,3, let

[pi]+ = { si }, [pi]- = ∅.

Then (S,≤,[ ]) is a model. The formula (p1 ∧ p2) ∨ p3 is verified by the states s1+s2, s3, and s1+s2+s3, but not by any other states. This means that the recursive clauses do not preserve the following condition on the sets of verifying and falsifying states of a formula:

s verifies A, u verifies A, and s ≤ t ≤ u imply t verifies A.
s falsifies A, u falsifies A, and s ≤ t ≤ u imply t falsifies A.
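The counterexample can be checked mechanically. Here is a Python sketch (my own encoding: states of the free semilattice are nonempty subsets of {1, 2, 3}, + is union, and only the verification clauses are implemented):

```python
def verifiers(formula, val):
    """Exact verifiers of a formula ('var', i), ('and', B, C), or ('or', B, C),
    following the clauses above; falsifiers would be handled dually."""
    if formula[0] == 'var':
        return val[formula[1]]
    _, B, C = formula
    vb, vc = verifiers(B, val), verifiers(C, val)
    fusions = {b | c for b in vb for c in vc}    # s = t+u, t verifies B, u verifies C
    return fusions if formula[0] == 'and' else vb | vc | fusions

val = {i: {frozenset([i])} for i in (1, 2, 3)}   # [p_i]+ = { s_i }
F = ('or', ('and', ('var', 1), ('var', 2)), ('var', 3))
V = verifiers(F, val)
# V contains exactly s1+s2, s3, and s1+s2+s3 -- but not s1+s3,
# even though s3 <= s1+s3 <= s1+s2+s3, so the set of verifiers is not convex.
```

Running this confirms that the verifiers of (p1 ∧ p2) ∨ p3 are exactly the three states listed above, so the intermediate state s1+s3 witnesses the failure of the convexity condition.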