Tuesday, March 27, 2012

Complexity of Numbers and the Impact on Science

Numbers are fascinating, as the foundations of arithmetic gave rise to great puzzles in formal logic. Here I want to address some peculiarities of algorithmic information theory.

In case you didn't know: there are real numbers which are definable but not computable. The prime example is Chaitin's constant Omega, which roughly characterizes the probability that a randomly constructed program will halt. If one knows the first N bits of Omega, one can decide the halting problem for all programs of size up to N bits. Non-computable numbers are thus directly related to the halting problem.

Consider the following hierarchy leading to the real numbers:
$\{0\} \subset \mathbb{Z} \subset \mathbb{Q} \subset \text{constructible} \subset \text{algebraic} \subset \text{computable} \subset \text{definable} \subset \mathbb{R}$
and the complements of the above within $\mathbb{R}$:
$\mathbb{R}\setminus\{0\} \supset \mathbb{R}\setminus\mathbb{Z} \supset \text{irrational} \supset \text{non-constructible} \supset \text{transcendental} \supset \text{non-computable} \supset \text{non-definable}$

Note that the definable numbers are countable (since every definition must consist of a finite string of words in a specified formal language), whereas the non-definable numbers are uncountable, i.e. no bijection exists between the natural numbers and the non-definable numbers (see Cantor's diagonal argument). There are so many of them because almost all digit sequences in a decimal representation are random, without any compressible structure.

Real numbers that we can define in some way can all be represented by finite or infinite integer sequences. A representation in terms of integers has the advantage that no arbitrary base (like 10 or 2) has to be chosen, as is the case for floating point numbers. Every real number is in fact uniquely defined by its continued fraction. Consider the continued fraction expansion
$x=[a_0; a_1, a_2,  ...] = a_0+\frac{1}{a_1+\frac{1}{a_2+\ldots}}$
which is finite for rational numbers and infinite for irrational ones, such as the golden ratio:
$\frac{1+\sqrt{5}}{2}=[1; 1, 1, 1, ...]$

Of course, the sequence of rationals obtained by truncating the continued fraction after 1, 2, 3, 4, ... terms yields a Cauchy sequence which equally represents the irrational number.
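To make this concrete, here is a minimal sketch in Python (only the standard library `fractions` module; the truncation depth of 12 is an arbitrary choice) that lists the convergents of the golden ratio's continued fraction $[1;1,1,\ldots]$:

```python
from fractions import Fraction

def convergents(coeffs):
    """Yield the rational convergents of the continued fraction [a0; a1, a2, ...]."""
    for n in range(1, len(coeffs) + 1):
        # Evaluate the truncation [a0; a1, ..., a_{n-1}] from the innermost term outwards.
        value = Fraction(coeffs[n - 1])
        for a in reversed(coeffs[:n - 1]):
            value = a + 1 / value
        yield value

# Golden ratio: all coefficients equal 1.
for frac in convergents([1] * 12):
    print(frac, float(frac))
```

The convergents are ratios of consecutive Fibonacci numbers and approach $\frac{1+\sqrt{5}}{2}\approx 1.618$.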

If we are interested in the complexity of the structure of a computable number (one where all coefficients of the continued fraction can be computed), we may address it via the compressibility of its coefficients in terms of algorithms. For the golden ratio, the compression is maximal, as $\forall i\in \mathbb{N}: a_i=1$. For $\pi=[3;7,15,1,292,1,1,\ldots]$, a simple program can compute the coefficients as well. Although the coefficients of its simple continued fraction do not obey any evident rule, there are several generalized continued fractions (which are not uniquely defined) that do, e.g. $\pi=3+\frac{1^2}{6+\frac{3^2}{6+\frac{5^2}{6+\ldots}}}$.
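As a quick plausibility check, one can evaluate this generalized continued fraction numerically; the following is only a sketch in plain Python, and the truncation depth of 200 is an arbitrary choice:

```python
import math

def pi_gcf(depth):
    """Evaluate 3 + 1^2/(6 + 3^2/(6 + 5^2/(6 + ...))) truncated after `depth` terms."""
    value = 0.0
    # Build the nested fraction from the innermost level outwards.
    for k in reversed(range(1, depth + 1)):
        value = (2 * k - 1) ** 2 / (6 + value)
    return 3 + value

print(pi_gcf(200))   # compare with the reference value below
print(math.pi)
```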
I already mentioned that almost all real numbers (in particular the non-computable ones) are random, with no apparent pattern valid for the whole sequence. This is reflected in a statistical property of the continued fraction coefficients of $x$: their geometric mean converges to a constant known as Khinchin's constant $K_0$, which is independent of the value of $x$:
for almost all $x\in\mathbb{R}$, $\lim_{n\rightarrow \infty}\left(\prod_{i=1}^n a_i\right)^{\frac{1}{n}}=K_0\equiv \prod_{n=1}^{\infty}\left(1+\frac{1}{n(n+2)}\right)^{\log(n)/\log(2)}$

However, it is not clear whether the geometric means of computable numbers take values different from $K_0$; numerically, the geometric mean for $\pi$ seems to converge towards $K_0$.
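One can play with this numerically. The sketch below assumes the mpmath library is available; it extracts the first few hundred continued fraction coefficients of $\pi$ by repeated inversion at high precision and compares their geometric mean with $K_0$:

```python
from mpmath import mp, mpf, floor, pi, khinchin

mp.dps = 1000   # high working precision; each extracted coefficient consumes some digits

def cf_coefficients(x, n):
    """Return the first n simple continued fraction coefficients [a0, a1, ...] of x."""
    coeffs = []
    for _ in range(n):
        a = int(floor(x))
        coeffs.append(a)
        x = 1 / (x - a)
    return coeffs

coeffs = cf_coefficients(+pi, 300)[1:]   # drop a0, which is excluded from the geometric mean
product = mpf(1)
for a in coeffs:
    product *= a
print(product ** (mpf(1) / len(coeffs)))   # geometric mean of the first coefficients
print(+khinchin)                           # Khinchin's constant K0 ~ 2.685452
```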

The upshot of all this should be: the less compressible a real number is, the higher its complexity. For example, Chaitin's constant Omega is incompressible: since it is not computable, there is no algorithm essentially shorter than N bits that computes the first N binary digits of Omega.

A side remark: Chaitin's constant is an arithmetical number, that is, we can define it via first-order arithmetic (it lies in the truth set of first-order arithmetic). One can also define real numbers in second-order arithmetic (called analytic numbers); I assume they are also countable, but I am not sure. (If you know, please leave a comment.) There are, however, also super-computable numbers: numbers that one could compute if the halting problem could be solved in finite time (i.e. Turing machines that would run forever in the real world are assumed to have finished after a fixed duration). It is quite possible - but I don't know, as I am not an expert - that there is a hierarchy of computability, supercomputability, supersupercomputability and so on, leading to higher and higher notions of complexity.

But let us come back to the question of the complexity of computable numbers. Assume you don't know any algorithm that computes $x$, and you have to judge its complexity by mere inspection of the representation at hand: digits, continued fraction coefficients, or whatever you have. For this purpose you could use the Kolmogorov complexity, which measures the algorithmic entropy of your expression relative to a fixed universal description language (like a programming language). If you have a string $s$, and $d(s)$ denotes a shortest description of it, the Kolmogorov complexity is the length of that description: $K(s)=|d(s)|$. The remarkable thing is that whatever universal description language $L_1$ you choose, any other universal description language $L_2$ gives in principle the same result, up to a constant independent of $s$: $\forall s: |K_1(s)-K_2(s)|<c$
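In practice one can at least compute an upper bound on $K(s)$, since any off-the-shelf compressor is one particular description language. A toy sketch using only the Python standard library (the example strings are arbitrary):

```python
import os
import zlib

def k_upper_bound(s: bytes) -> int:
    """Length of a zlib encoding of s: a computable upper bound on K(s), up to constants."""
    return len(zlib.compress(s, 9))

structured = b"01" * 5000        # a highly regular 10000-byte string
randomish = os.urandom(10000)    # incompressible with overwhelming probability

print(len(structured), k_upper_bound(structured))   # compresses to a tiny fraction
print(len(randomish), k_upper_bound(randomish))     # stays close to 10000 bytes
```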
However, $K(s)$ is not a computable function, i.e. given $s$, there is no algorithm that can compute $K(s)$. The proof by contradiction is very similar to the Berry paradox, a paradox about self-reference: the description "the smallest positive integer not definable in fewer than one hundred words" is supposed to denote a specific integer, but since the description itself has fewer than one hundred words, it defines that very integer in fewer than one hundred words, which is a contradiction.


The result is: it is not possible in general to determine the shortest description of a given expression.

This is bad news, but for some interesting cases we can say something about the minimal description. It is sufficient to consider integers only: all definable real numbers can be mapped to the integers (e.g. by mapping the Turing machine that computes a computable number to its Gödel number). Hence, instead of encoding real numbers, we can think about ways to encode large integers. The conventional choice is the decimal or binary system. But for large integers a more compact description can be given in terms of their prime factors. And a large prime $p$ can be referred to as the $k$-th prime number, with $k$ much smaller than $p$. Of course, such a description has the disadvantage that a computer has to work for quite a long time to decode it. This is at the heart of cryptology. But the description is short, and that is the only thing that matters here. After all, we are not concerned with the complexity of the computation (the big-O complexity, quantifying how fast a program will run), but with descriptive complexity.
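As a toy illustration of such a short but computationally expensive description (this sketch assumes the sympy library; the index $k=10^6$ is an arbitrary choice):

```python
from sympy import prime, primepi

k = 1_000_000
p = prime(k)               # decoding the short description "the k-th prime" takes real work
print(p)                   # 15485863: the number itself needs more digits than the index k
print(primepi(p) == k)     # primepi recovers the index, closing the round trip
```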

Most algorithms apply some kind of recursion: a piece of code is repeated over and over again until the program halts (or runs forever, increasing the accuracy of the computation). Good examples are $k!$ or the $k$-th prime number, calculated via the sieve of Eratosthenes. In such cases, the complexity of the description is given by the piece of code that is iterated. If we have an integer $k$ as input, then we can describe the output $f^n(k)$ by the triple $(k,f,n)$. Hence the complexity is at most the sum of the complexities of the ingredients: $K(f^n(k)) \leq K(k)+K(f)+K(n)$. But assume we have to analyse the output $j=f^n(k)$ without knowing the algorithm $f$ that produced it. Then, in many cases, the apparent complexity of $j$ is much higher than this bound.
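A sketch of the triple $(k,f,n)$ in Python; the rule $f(x)=x(x+1)$ is a made-up example, chosen only because its iterates grow quickly:

```python
def f(x):
    """One iteration step; the rule x -> x*(x+1) is just an illustrative choice."""
    return x * (x + 1)

def iterate(k, n):
    """Apply f to k exactly n times, i.e. compute f^n(k)."""
    for _ in range(n):
        k = f(k)
    return k

j = iterate(2, 10)
print(len(str(j)))   # several hundred decimal digits, produced by a description of a few lines
```

Judging $j$ only by its hundreds of digits, one would estimate a far higher complexity than the few lines above actually require.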

Another class of computations are integer sequences of finite or infinite length, such as the Fibonacci numbers or the ordered set of all prime numbers. Integers are sequences of length 1, rationals of length 2 (numerator and denominator), an algebraic number $x$ is of length $k$, with $k$ the degree of the polynomial of which $x$ is a root, and transcendental numbers are infinite integer sequences (interpreted e.g. as fractions), i.e. Cauchy sequences. In standard analysis, such a sequence (of fractions) can have a supremum which is not itself a member of the sequence. And this supremum can even be non-computable, as illustrated by the Specker sequence.
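For concreteness, one standard construction of a Specker sequence (a sketch, relative to some fixed effective enumeration of Turing machines) is
$x_n=\sum_{i\in H_n} 2^{-i}$, where $H_n=\{\,i\le n : \text{the } i\text{-th Turing machine halts on empty input within } n \text{ steps}\,\}$.
Each $x_n$ is a computable rational and the sequence is increasing and bounded, but its limit encodes the halting problem and is therefore not a computable real.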

Note that from a constructivist point of view in the philosophy of mathematics, one can also restrict analysis to the computable numbers, as they form a real closed field. I sympathize with this view, not because I am opposed to the supremum (which has to be postulated to always exist), but because I don't believe that uncountable sets are indispensable in natural science. The Cantor diagonal argument does not prove, in a constructive sense, that there are uncountably many real numbers, only that any enumeration misses at least one real number. In the constructivist view the Cantor set is subcountable.

What does that mean? Recall that the continuum hypothesis is logically independent of ZFC (Zermelo-Fraenkel set theory with the axiom of choice). If one restricts the set-theoretic universe from the beginning (in the language of formal logic: to the constructible universe) by invoking the axiom of constructibility $V=L$, then one obtains the (generalized) continuum hypothesis and the axiom of choice (and more) for free, just because it is the smallest universe consistent with ZF and there is no room for anything else. In constructive set theory (CZF), on the other hand, it is consistent that all sets are subcountable, which means that there is a partial surjection from the natural numbers onto them. The reason that in conventional analysis one takes all these Cauchy sequences (or rather their equivalence classes) to describe the real numbers is convenience, but it inflates analysis compared to what is actually needed. The rational numbers, or if you like the computable numbers, are completed to give the real numbers by demanding that all Cauchy sequences converge. The Specker sequence is a Cauchy sequence, but according to constructive analysis its limit is not in the universe.


In physics, there are also many examples of integer sequences, such as those arising in perturbative expansions. The question is whether everything science is about can be cast in terms of integer sequences.

I assume that this is the case, that nature might as well be digital. I will address this issue (Wheeler's "it from bit") another time. If Nature is about certain classes of integer sequences, then Mathematics is about theorems relating specific classes of integer sequences.

I believe that iteration - the repeated application of a rule - is at the heart of complexity in Nature, since iterations in computer science are in close analogy to physical dynamical systems. Complex patterns such as the Mandelbrot set or other fractals, or the output of cellular automata, can often be generated by simple algorithms, illustrating that the complexity of the algorithm is much smaller than what inspection of the data set it produces would suggest.
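The Mandelbrot set is a good example: the escape-time iteration $z\mapsto z^2+c$ fits in a few lines. A minimal sketch in Python (grid size, iteration cap and the coarse ASCII rendering are arbitrary choices):

```python
def escapes(c, max_iter=50):
    """Return True if the iteration z -> z*z + c escapes within max_iter steps."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:      # once |z| > 2 the orbit is guaranteed to diverge
            return True
    return False

# Coarse ASCII rendering of the set in the rectangle [-2, 1] x [-1.2, 1.2].
for row in range(24):
    y = 1.2 - 2.4 * row / 23
    line = ""
    for col in range(64):
        x = -2.0 + 3.0 * col / 63
        line += " " if escapes(complex(x, y)) else "#"
    print(line)
```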

In order to distinguish the complexity of emerging patterns from the algorithmic complexity, I want to call it Iterative Complexity, to emphasize the role of iteration. It is not important here whether an algorithm is implemented by iteration or recursion; it is not even important to know the complexity of the algorithm. Iterative complexity shall be a measure of the growth of complexity of a pattern with each iteration.

Sometimes, as in fractals like the Sierpinski triangle, the iteration at work is immediately apparent. It is easy to recognize that the pattern is best characterized by the rule of cutting a triangular hole into each triangle. Such fractals are called regular, and they contain redundant information. For the Mandelbrot set, the iteration at work is much harder to see. In particular, the pattern is no longer self-similar in the strict sense, and no information can be identified as redundant (apart from the reflection symmetry along the real axis).
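To make the Sierpinski case concrete: the pattern admits extremely short descriptions. The following sketch uses a well-known equivalent description (Pascal's triangle modulo 2) rather than the hole-cutting rule itself; the number of rows is an arbitrary choice:

```python
# Pascal's triangle modulo 2: marking the odd binomial coefficients
# traces out the Sierpinski triangle, row by row.
rows = 32
row = [1]
for r in range(rows):
    print(" " * (rows - r) + " ".join("#" if v % 2 else " " for v in row))
    row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
```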

These conceptual questions on the nature of complexity can be addressed without acknowledging the quantum aspects of nature. My hope is that deterministic chaos can be understood via the notion of iterative complexity (which still needs to be made precise) as well. Of course, the details become extremely involved as soon as the values of initial parameters matter, as is the case e.g. for the Lorenz attractor. In practice, since we can only measure parameters with limited precision, we will be unable to calculate the consequences, even if we could perform the calculation itself with any desired precision. A prime example is the three-body problem. However, it is not clear to me whether in a deterministic setup there can be genuine randomness in nature (in contrast to the non-determinism of quantum mechanics: deterministic randomness expressed in the non-computability of patterns), or whether the randomness we find in chaotic systems is always due to our ignorance about initial conditions, and there is room for a Laplacian demon after all.
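A toy illustration of this kind of ignorance, using the logistic map as a stand-in (my example, not discussed above): two initial values that agree to ten decimal places end up on entirely different trajectories after a few dozen iterations.

```python
def logistic(x, steps, r=4.0):
    """Iterate the chaotic logistic map x -> r*x*(1-x) for the given number of steps."""
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

x0 = 0.3
for steps in (10, 30, 50):
    a = logistic(x0, steps)
    b = logistic(x0 + 1e-10, steps)   # a perturbation far below any realistic measurement error
    print(steps, abs(a - b))          # the gap grows until the two orbits are unrelated
```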
