Thursday 20 December 2012

Independence of B\(\Sigma_2\) from I\(\Sigma_1\)


One thing is to prove a certain statement \( A \), another thing completely different is to prove that \( A \) is not provable! How should one go about proving something like this? By the soundness theorem (if \( A \) is provable in a theory T then \( A \) is valid in all model of T), it suffices to produce a model where \( A \) is not valid. And that's what people will normally do.

But can one answer such questions using proof-theoretic methods? In theory one can. You assume a proof of \( A \) exists. Apply a proof translation or proof interpretation to the imaginary proof. The soundness of the translation or interpretation guarantees the existence of some information or result, which, if one can show not to exist, contradicts the fact that a proof of \( A \) existed in the first place.

And that's precisely what Charles Parsons did in [1], in order to show that \( \Sigma_2 \)-Bounded Collection is not provable from \( \Sigma_1 \)-Induction. Parsons' proof uses Herbrand theorem and the recursion theorem from the previous post. Unfortunately, the proof is not only quite complicated, but I think it contains a bug (don't quote me on that though).

As a fan of proof theory I would love to see a nice and simple proof-theoretic argument proving this interesting result, some form of realizability or dialectica, but for now I will present a model theoretic proof given by Paris and Kirby in [2].

For the record, \( \Sigma_2 \)-Bounded Collection is the principle $$ \mbox{B\( \Sigma_2 \)} \quad \colon \quad \forall x < y \exists z A(x, z) \rightarrow \exists t \forall x < y \exists z < t A(x, z) $$ where \( A(x,y) \) is a \( \Sigma_2 \) formula. Because the implication from right to left is trivial, one can view the principle as stating the equivalence $$ \forall x < y \exists z A(x, z) \leftrightarrow \exists t \forall x < y \exists z < t A(x, z) $$ stating that unbounded existential quantifiers can in some sense be ``prenexed" over bounded universal quantifiers. Moreover, as we are in the classical setting the same can be said about unbounded universal quantifiers and bounded existential quantifiers $$ \exists x < y \forall z A(x, z) \leftrightarrow \forall t \exists x < y \forall z < t A(x, z). $$Theorem. There is a model of I\( \Sigma_1 \) which is not a model of B\( \Sigma_2 \).

The proof goes as follows. Start with any model \( M \) of Peano arithmetic containing non-standard \( \Sigma_2 \)-definable elements. They proceed as follows:

(1) Let \( K = \{ x \in M \; | \; \mbox{ \(x\) is \(\Sigma_2\)-definable in } M \} \)

(2) Notice that if \( A \) is \( \Sigma_2 \) then \( M \models A \) implies \( K \models A \).

(3) Clearly \( K \models \mbox{I}\Sigma_1 \), as \( K \) is a model of the least number principle for \( \Pi_1 \) formulas. (Side remark: Show that \( K \not\models \mbox{I}\Sigma_2 \) )

Let us show that \( K \not\models \mbox{B}\Sigma_2 \) by producing a concrete instance of \( {B}\Sigma_2 \) which does not hold in \( K \).

(4) Let \( \exists w A(e, w, x) \) be a universal \(\Sigma_2\)-formula, with \( A \in \Pi_1 \). Moreover, let \( t \in K \) be non-standard.

(5) For any \( a \in K \) we clearly have $$ M \models \exists e < t \exists u ( \underbrace{A(e, u_0, u_1) \wedge u_1 = a \wedge \forall z < u \neg A(e, z_0, z_1)}_{(*)}). $$ That's to say, in the model \( M \) we can find standard codes \( e \) representing \( \Sigma_2\)-formulas that uniquely define each \( a \in K \).

(6) Let \( B(e, u, a) = \exists x \forall y B_0(e, u, a, x, y) \) be a \( \Sigma_2 \) formula provably equivalent in Peano arithmetic to \( (*) \). Hence $$ M \models \exists e < t \exists u B(e, u, a).$$ (7) Note that \( \exists u B(e, u, a_0) \wedge \exists u B(e, u, a_1) \rightarrow a_0 = a_1 \).

(8) Hence, by (2), we also have $$ K \models \exists e < t \exists u B(e, u, a) $$ and in particular $$ K \models \forall a < t+1 \exists e < t \exists u \exists x \forall y B_0(e, u, a, x, y) $$ We show now that $$ K \not\models \exists r \exists s \forall a < t + 1 \exists e < t \exists u < r \exists x < s \forall y B_0(e, u, a, x, y). $$ obtaining an instance of \( B\Sigma_2 \) which does not hold in \( K \).

(9) Assume $$ K \models \exists r \exists s \exists e < t \exists u < r \exists x < s \forall y B_0(e, u, a, x, y) $$ and consequently $$ K \models \exists r \forall p \exists s \exists e < t \exists u < r \exists x < s \forall y < p B_0(e, u, a, x, y). $$ (10)  Hence, by (2) again, $$ M \models \forall a < t+1 \exists r \forall p \exists s \exists e < t \exists u < r \exists x < s \forall y < p B_0(e, u, a, x, y) $$ which is equivalent to $$ M \models \forall a < t+1 \exists e < t \exists u B(e, u, a) $$ which is a contradiction by (7) and the pigeon-hole principle in \( M \).

So that concludes the proof. I must add that the same argument works to show the independence of B\(\Sigma_{n+1}\) from I\(\Sigma_n\), one must simply adjust all the indexes above accordingly (and that's what Paris and Kirby have in [2]).

Anyway, as I said, I would love to see a nice and short proof as the one above using proof theoretic methods instead.

[1] Charles Parsons, On a number theoretic choice schema and its relation to induction, In: Intuitionism and Proof Theory (Proc. Conf., Buffalo, N.Y.,1968) (1970), pp. 459-473.

[2] J. B. Paris and L.A.S. Kirby, \( \Sigma_n \) collection schemas in arithmetic. Logic Colloquium, pp. 199-209, 1977.

Tuesday 18 December 2012

Kleene's second recursion theorem

You will need to know the basics of partial recursive functions for this one, but believe me, it's worth the effort. If \( e \) is a number, we use the standard notation \( \{e\} \) for the partial function whose code is \( e \). The theorem says the following:


Theorem A. For any total computable function \( F \) there exist an \( e \) such that \( \{ e \} \simeq \{ F e \} \).

Just think about it. \( F \) can be any total computable function! Take \( F(x) = x+1 \), for instance. That's certainly computable. The theorem says that there exists two successive natural numbers \( e \) and \( e + 1 \) that code the same function, i.e. \( \{e\} \simeq \{ e+1 \} \) . Isn't that just amazing?

The crucial idea of the proof is self-application, i.e. the fact that we can apply a function to its own code as \( \{e\}(e) \). Moreover, that this self-application is itself computable, i.e. there is a computable function \( h \) such that \( \{ h(e) \} \simeq \{ \{e\}(e) \} \). 
The proof of the theorem goes as follows:

(1) Assume a total computable function \( F \) is given.
(2) For any \( x \) let \( h (x) \) be a code such that \( \{ h(x) \} \simeq \{ \{x\}(x) \} \). Such total and computable function \( h \) exists.
(3) Let \( k \) be such that \( \{k\}(x) \simeq F(h(x)) \). Such \( k \) exists because both \( F \) and \( h \) are computable.
(4) We claim that \( e = h(k) \) does the job!
(5) By (2) and (4) we have \( \{e\}(x) \simeq \{h(k)\}(x) \simeq \{\{k\}(k)\}(x) \).
(6) But by (3) we also have \( \{\{k\}(k)\}(x) \simeq \{F(h(k)\}(x) \simeq \{F(e)\}(x) \), so we are done.

An absolute gem, don't you think!?

If you also know about the lambda calculus, you can have the following variant of the theorem:

Theorem B. For any lambda term \( t \) there exist a lambda term \( s \) such that \( t s = s \).

So, this essentially says that any lambda term has a fixed point!

The proof of this variant follows the structure of the original proof:

(1) Let a lambda term \( t \) be given.
(2) Let \( h = \lambda x . x x \).
(3) Let \( k = \lambda x . t(h(x)) \). 
(4) We claim that \( s = h(k) \) does the job.
(5) By (2) and (4) we have \( s = h(k) = k k \).
(6) But by (3) we also have \( k k = t(h(k)) = t s \), so we are done.

Note that \( k = \lambda x . t ( x x ) \). Hence \( s = k k = ( \lambda x . t ( x x ) ) ( \lambda x . t ( x x ) )    \) 

Therefore, the combination of \( h \) and \( k \) namely 
\[ \phi(f) = ( \lambda x . f ( x x ) ) ( \lambda x . f ( x x ) )  \]
in fact gives a fixed point operator: a functional \( \phi \) that calculates the fixed point of any given function \( f \).