Tuesday, March 27, 2012

Complexity of Numbers and the Impact on Science

Numbers are fascinating, as the foundations of arithmetic gave rise to great puzzles in formal logic. Here I want to address some peculiarities of algorithmic information theory.

In case you didn't know: there are real numbers which are definable, but not computable. The prime example is Chaitin's constant Omega, which roughly characterizes the probability that a randomly constructed program will halt. If you know the first N bits of Omega, you can decide the halting problem for all programs of size up to N bits. Non-computable numbers are thus directly related to the halting problem.

Consider the following hierarchy leading to the real numbers:
$\{0\} \subset \mathbb{Z} \subset \mathbb{Q} \subset \text{constructible} \subset \text{algebraic} \subset \text{computable} \subset \text{definable} \subset \mathbb{R}$
and the complement of the above:
$\mathbb{R}\setminus\{0\} \supset \mathbb{R}\setminus\mathbb{Z} \supset \text{irrational} \supset \text{non-constructible} \supset \text{transcendental} \supset \text{non-computable} \supset \text{non-definable}$

Note that the definable numbers are countable (since every definition must consist of a finite sequence of symbols in a specified formal language), whereas the non-definable numbers are uncountable, i.e. no bijection exists between the natural numbers and the non-definable numbers (see Cantor's diagonal argument). There are so many of them because almost all infinite digit sequences in the decimal representation are random, without any inherent structure that a finite definition could capture.

Real numbers that we can define in some way can all be represented by finite or infinite integer sequences. A representation in terms of integers has the advantage that no arbitrary base (like 10 or 2) has to be chosen, as is the case for floating point numbers. Every real number is uniquely defined by its continued fraction (provided one fixes a convention for the two equivalent finite expansions of a rational). Consider e.g. the continued fractions used to characterize irrational numbers:
$x=[a_0; a_1, a_2,  ...] = a_0+\frac{1}{a_1+\frac{1}{a_2+\ldots}}$
which is finite for rational numbers and infinite for irrational numbers, such as the golden ratio:
$\frac{1+\sqrt{5}}{2}=[1; 1, 1, 1, ...]$

Of course, the sequence of rationals obtained by truncating the continued fraction after 1, 2, 3, 4, ... terms yields a Cauchy sequence which equally represents the irrational number.

If we are interested in the complexity of the structure of a computable number (where all coefficients of the continued fraction can be computed), we may address it via the compressibility of its coefficients in terms of algorithms. For the golden ratio, the compression is maximal, as $\forall i\in \mathbb{N}: a_i=1$. For $\pi=[3;7,15,1,292,1,1,\ldots]$, a simple program can compute its coefficients as well. Although the coefficients of the continued fraction of $\pi$ do not obey any evident rule, there are several generalized continued fractions (these are not uniquely defined) which do, e.g. $\pi=3+\frac{1^2}{6+\frac{3^2}{6+\frac{5^2}{6+\ldots}}}$.
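As a small illustration (the function names are my own), one can extract the leading continued fraction coefficients numerically, and evaluate the generalized continued fraction for $\pi$ quoted above by working from the innermost term outward:

```python
import math

def cf_coefficients(x, n):
    """First n simple continued fraction coefficients of x (float precision)."""
    coeffs = []
    for _ in range(n):
        a = math.floor(x)
        coeffs.append(a)
        x = 1.0 / (x - a)
    return coeffs

def pi_gcf(terms):
    """Evaluate 3 + 1^2/(6 + 3^2/(6 + 5^2/(6 + ...))) truncated after `terms` levels."""
    value = 0.0
    for k in range(terms, 0, -1):
        value = (2 * k - 1) ** 2 / (6 + value)
    return 3 + value

phi = (1 + math.sqrt(5)) / 2
print(cf_coefficients(phi, 10))     # all coefficients are 1
print(cf_coefficients(math.pi, 5))  # [3, 7, 15, 1, 292]
print(pi_gcf(10000))
```

Note that float precision limits how many correct coefficients can be extracted this way; for more, one would start from a high-precision rational approximation.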
I already mentioned that almost all irrational numbers (in particular all non-computable ones) are random, with no pattern valid for the whole sequence. Remarkably, almost all $x$ nevertheless share a statistical property of their continued fraction coefficients: their geometric mean converges to a constant known as Khinchin's constant $K_0$, independent of the value of $x$:
for almost all $x\in\mathbb{R}$: $\lim_{n\rightarrow \infty}\left(\prod_{i=1}^n a_i\right)^{\frac{1}{n}}=K_0\equiv \prod_{n=1}^{\infty}\left(1+\frac{1}{n(n+2)}\right)^{\log_2 n}\approx 2.6854$

However, "almost all" excludes a measure-zero set, and many familiar computable numbers belong to it: for the golden ratio the geometric mean is obviously 1, and rationals and other quadratic irrationals (with their finite or periodic expansions) also miss $K_0$. Whether specific constants like $\pi$ attain $K_0$ is not proven, although numerically the geometric mean for $\pi$ seems to converge towards $K_0$.
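For what it's worth, the product formula for $K_0$ is straightforward to evaluate numerically; a minimal sketch (the helper name is mine), summing logarithms to avoid a long chain of multiplications:

```python
import math

def khinchin_product(terms=100_000):
    """Approximate Khinchin's constant K0 via its infinite product representation."""
    log_k0 = 0.0
    for n in range(1, terms + 1):
        # ln K0 = sum over n of log2(n) * ln(1 + 1/(n(n+2)))
        log_k0 += math.log2(n) * math.log1p(1.0 / (n * (n + 2)))
    return math.exp(log_k0)

print(khinchin_product())  # ≈ 2.685
```

The product converges slowly (the tail of the log-sum falls off like $\log n / n^2$), so the truncation at $10^5$ terms gives only a few correct digits.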

The upshot of all this: the less compressible a real number is, the higher its complexity. For example, Chaitin's constant Omega is incompressible: since it is not computable, one cannot write an algorithm essentially shorter than N bits that outputs the first N binary digits of Omega.

A side remark: Chaitin's constant is an arithmetical number, that is, we can define it in first-order arithmetic (it is in the truth set of first-order arithmetic). One can also define real numbers in second-order arithmetic (called analytic numbers); I assume they are also countable, but I am not sure. (If you know, please leave a comment.) There are, however, also super-computable numbers: numbers that you could compute if the halting problem could be solved in finite time (i.e. Turing machines that would run forever in the real world are assumed to finish after a fixed duration). There is indeed a hierarchy of computability, super-computability, super-super-computability and so on (in computability theory: the Turing degrees obtained by iterating the Turing jump), leading to higher and higher notions of complexity.

But let us come back to the question of the complexity of computable numbers. Assume you don't know any algorithm that computes $x$, and you have to judge its complexity by mere inspection of the representation at hand: digits, continued fractions, or what have you. For this purpose you could use the Kolmogorov complexity, which measures the algorithmic entropy of your expression relative to a fixed universal description language (like a programming language). If you have a string $s$ and a shortest description $d(s)$, the Kolmogorov complexity is the length of that description: $K(s)=|d(s)|$. The remarkable thing is that whichever universal description languages $L_1$ and $L_2$ you choose, they give in principle the same result, up to a constant independent of $s$: $\forall s: |K_1(s)-K_2(s)|<c$.
However, $K(s)$ is not a computable function, i.e. given $s$, there is no algorithm that can compute $K(s)$. The proof by contradiction is very similar to the Berry paradox, a paradox about self-reference: the description "the smallest positive integer not definable in fewer than 100 words" is supposed to denote a specific integer, but given that the description itself has fewer than 100 words, it cannot denote that integer.


The result: it is not possible in general to tell which is the smallest description of a given expression.
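Although $K(s)$ itself is uncomputable, any compressor gives a computable upper bound on the description length. A quick sketch of the idea, using Python's zlib purely as an illustration:

```python
import os
import zlib

# Compression yields a computable *upper bound* on descriptive complexity:
structured = b"01" * 500        # highly regular 1000-byte string
random_like = os.urandom(1000)  # incompressible with overwhelming probability

print(len(zlib.compress(structured)))   # a handful of bytes: the regularity is found
print(len(zlib.compress(random_like)))  # around 1000 bytes: nothing to exploit
```

The bound is one-sided: a short compressed form proves low complexity, but a long one proves nothing, since the true shortest description may use a regularity the compressor cannot see.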

This is bad news, but for some interesting cases we can say something about the minimal description. It is sufficient to address integers only: all definable real numbers can be mapped to the integers (e.g. by mapping the Turing machine that computes the number to its Gödel number). Hence, instead of encoding the real numbers, we can think about ways to encode large integers. The conventional thing is to use the decimal or binary system. But for large integers a more compact description can be given in terms of their prime factors. And a large prime number $p$ can be referred to as the $k$-th prime number, with $k$ much smaller than $p$. Of course, such a description has the disadvantage that to decode it, a computer has to compute for quite a long time; this asymmetry is at the heart of cryptography. But the description is short, and this is the only thing that matters here. After all, we are not concerned with the complexity of the computation (the big-O complexity, quantifying how fast a program will run), but with descriptive complexity.
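A toy version of this encoding (the helper name is mine): the short description "the 1000th prime" is decoded by actually running the sieve of Eratosthenes up to a provable bound on where that prime lives:

```python
import math

def nth_prime(k):
    """Decode the short description 'the k-th prime' by sieving (Eratosthenes)."""
    if k < 6:
        limit = 15
    else:
        # Rosser's theorem: p_k < k * (ln k + ln ln k) for k >= 6
        limit = int(k * (math.log(k) + math.log(math.log(k)))) + 1
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    count = 0
    for n in range(2, limit + 1):
        if sieve[n]:
            count += 1
            if count == k:
                return n

print(nth_prime(1000))  # 7919
```

The index 1000 is hardly shorter than 7919 itself, but the gap widens: the index grows only like $p/\ln p$, while the decoding work grows with $p$.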

Most algorithms apply some kind of recursion: a piece of code is repeated over and over again, until the program halts (or it runs forever, increasing the accuracy of the computation). Good examples are $k!$ or the $k$-th prime number, calculated via the sieve of Eratosthenes. In such cases, the complexity of the description is given by the piece of code that is iterated. If we have an integer $k$ as input, then we can describe the output $f^n(k)$ by a triple $(k,f,n)$. Hence the complexity is at most the sum of the complexities of the ingredients: $K(f^n(k)) \leq K(k)+K(f)+K(n)$. But assume we have to analyse the output $j=f^n(k)$ without knowing which algorithm $f$ produced it. Then, in many cases, the apparent complexity (say, as estimated by a compressor) is much higher than the true $K(j)$.
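A concrete instance of the gap between the $(k,f,n)$ description and the raw output, using Python's exact integer arithmetic:

```python
import math

# The triple (k, f, n) here is (1000, multiply-accumulate, 1000): a few bytes of
# description, while the decimal expansion of the output runs to thousands of digits.
digits = len(str(math.factorial(1000)))
description = "factorial(1000)"

print(digits)            # 2568 decimal digits
print(len(description))  # 15 characters
```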

Another class of computations concerns integer sequences of finite or infinite length, such as the Fibonacci numbers or the ordered set of all prime numbers. Integers are sequences of length 1, rationals of length 2 (numerator and denominator), algebraic numbers $x$ can be encoded by the coefficients of the degree-$k$ polynomial of which $x$ is a root (plus an index selecting the root), while transcendental numbers require infinite integer sequences, interpreted e.g. as Cauchy sequences of fractions. In standard analysis, such a sequence (of fractions) can have a supremum which is not within the sequence. And this supremum can even be non-computable, as illustrated by the Specker sequence.

Note that from a constructivist point of view in the philosophy of mathematics, one can also restrict analysis to the computable numbers, as they form a (real closed) field. I sympathize with this view, not because I am opposed to the supremum (which has to be postulated to always exist), but because I don't believe that uncountable sets are indispensable in natural science. The Cantor diagonal argument does not prove constructively that there are uncountably many real numbers, only that for any enumeration there is at least one real number which was not enumerated. In the constructivist view the set of reals is subcountable. What does that mean? Recall that the continuum hypothesis is logically independent of ZFC (Zermelo-Fraenkel set theory with the axiom of choice). If one restricts the universe of sets from the beginning (in the language of formal logic: to the constructible universe) by invoking the axiom of constructibility $V=L$, then one obtains the (generalized) continuum hypothesis and the axiom of choice (and more) for free, just because it is the smallest universe consistent with ZF, and there is no room for anything else. In constructive set theory (CZF), it is consistent that all sets are subcountable, which means that there is a partial surjection from the natural numbers onto them. The reason that in conventional analysis one takes all these Cauchy sequences (or rather their equivalence classes) to describe the real numbers is convenience, but it inflates analysis compared to what is actually needed. The rational numbers (or, if you like, the computable numbers) are completed to the real numbers by demanding that all Cauchy sequences converge. The Specker sequence is a Cauchy sequence, but according to constructive analysis, its limit is not in the universe.


In physics, there are also many examples of integer sequences, such as those arising in perturbative expansions. The question is whether everything science is about can be cast in terms of integer sequences.

I assume that this is the case, that nature might as well be digital. I will address this issue (Wheeler's "It from bit") another time. If Nature is about certain classes of integer sequences, Mathematics is about theorems relating specific classes of integer sequences.

I believe that iteration - the repeated application of a rule - is at the heart of complexity in Nature, since iterations in computer science are in close analogy to physical dynamical systems. Complex patterns such as the Mandelbrot set or other fractals, or the output of cellular automata, can often be generated by simple algorithms, illustrating that the complexity of the algorithm is much less than what it seems to be by inspection of the data set it produces.
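The Mandelbrot set makes the point in a few lines: the iterated rule $z \mapsto z^2+c$ is trivial to state, yet it generates the whole pattern. A minimal escape-time sketch (the function name is mine):

```python
def mandelbrot_iterations(c, max_iter=100):
    """Iterate z -> z*z + c; return steps until |z| > 2, or max_iter if bounded."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return max_iter

print(mandelbrot_iterations(0j))      # 100: c = 0 never escapes, it is in the set
print(mandelbrot_iterations(1 + 1j))  # escapes after very few iterations
```

The entire intricate boundary of the set is implicit in these few lines plus a grid of sample points, which is exactly the mismatch between algorithmic complexity and apparent complexity described above.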

In order to distinguish the complexity of emerging patterns from the algorithmic complexity, I want to call it Iterative Complexity, to emphasize the role of iteration. It is not important here whether an algorithm is implemented by iteration or recursion; it is not even important to know the complexity of the algorithm. Iterative complexity shall be a measure of the growth of complexity of a pattern with each iteration.

Sometimes, as in fractals such as the Sierpinski triangle, the iteration at work is immediately apparent: the pattern is best characterized by the rule of cutting a triangular hole into each triangle. Such fractals are called regular, and they contain redundant information. For the Mandelbrot set, the iteration at work is much more difficult to see. In particular, the pattern is no longer self-similar in the strict sense, and no information can be identified as redundant (apart from the reflection symmetry about the real axis).
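One way to watch the Sierpinski pattern emerge from a trivially simple iterated rule is Pascal's triangle modulo 2 (a sketch of my own, not the hole-cutting construction itself):

```python
def sierpinski_rows(n):
    """Rows of Pascal's triangle mod 2; the odd entries trace the Sierpinski triangle."""
    row = [1]
    rows = [row]
    for _ in range(n - 1):
        # each inner entry is the sum of its two upper neighbours, taken mod 2
        row = [1] + [(a + b) % 2 for a, b in zip(row, row[1:])] + [1]
        rows.append(row)
    return rows

for r in sierpinski_rows(8):
    print("".join("*" if bit else " " for bit in r).center(16))
```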

These conceptual questions on the nature of complexity can be addressed without acknowledging the quantum aspects of nature. My hope is that deterministic chaos can be understood via the notion of iterative complexity (still to be made precise) as well. Of course, the details become extremely involved as soon as the values of initial parameters matter, as is the case e.g. for the Lorenz attractor. In practice, since we can only measure parameters with limited precision, we will be unable to calculate the consequences, even if we could perform the calculation itself with any desired precision. A prime example is the three-body problem. However, it is not clear to me whether in a deterministic setup there can be genuine randomness in nature (in contrast to the non-determinism of quantum mechanics: deterministic randomness expressed in the non-computability of patterns), or whether the randomness we find in chaotic systems is always due to our ignorance of initial conditions, and there is room for a Laplacian demon after all.
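The sensitivity to initial conditions is easy to demonstrate with the logistic map at $r=4$, a standard toy model of deterministic chaos (the function name is mine):

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate the logistic map x -> r * x * (1 - x), returning the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.1)
b = logistic_trajectory(0.1 + 1e-10)  # perturb only the 10th decimal place
gap = max(abs(x - y) for x, y in zip(a, b))
print(gap)  # the tiny perturbation has grown by many orders of magnitude
```

The perturbation roughly doubles each step, so within a few dozen iterations the two trajectories are macroscopically different, which is exactly why limited measurement precision defeats long-term prediction.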

Friday, March 09, 2012

Thoughts on the Nature of Computation

I am concerned with two questions here:

(A) What is Computation, in the most general sense? What do the brain and Conway's Game of Life have in common?
(B) Is (mathematical) Reasoning some sort of Computation?

In the last blog entry I speculated whether the concept of computation is more restrictive than Mathematics, and whether the laws of nature can be expressed in terms of computation.

What is the difference between Computation and Mathematics? Of course, Mathematics is concerned with proofs, whereas Computation is concerned with algorithms. But is there really a difference?

[Before you proceed, an important Note: I am not concerned here with scientific computing, which is about how to use computational resources in the most intelligent way. The notion of Computation I am up to is not concerned with numerics, parallel computing and the like, but about the nature of an abstract algorithmic process, not necessarily in time. I want to think about Computation in a hardware-independent way.]

Let us start with Mathematics: The difference between Mathematics and Physics in my view is:
Mathematics is about all (logically) possible Worlds, whereas Physics is about the actual World.

Conventional (predicate) logic + some set of mathematical axioms (for sets, numbers, and usually some more exotic ones as well) leads to standard mathematics. Given that one assumes these axioms to be self-evident for any intellectual being, mathematics should be universal, that is, each civilization in the universe should find the same mathematical truths.

But what is Computation then?  Let me contrast Computation and Mathematics by a couple of examples:
(1) Mathematicians want to know whether a set (of, say, numbers) is finite or infinite.
Computation is interested in a description of how to calculate the elements of this set (e.g. the prime numbers).
(2) Mathematicians want to prove that there exists a solution to some equation, or that the solution is unique. If it is unique, they give the expression explicitly. Typical examples are
Diophantine equations, general expressions for the solutions of an algebraic equation (e.g. the p-q-formula), or differential equations.
Computation solves these equations for specific parameters or initial conditions (e.g. by iterative methods or matrix inversion).
(3) Mathematicians want to enumerate all possible cases of a classification (of graphs, Riemann surfaces, Lie algebras, etc.).
Computation is concerned with calculating the numbers which characterize a particular structure in this classification (e.g. the Euler characteristic).
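As an illustration of point (2): where the mathematician proves that $x^2=2$ has a unique positive solution, a computation produces its digits for concrete inputs, e.g. by Newton's method (a sketch; the function name is mine):

```python
def newton_sqrt(a, x=1.0, tol=1e-12):
    """Solve x**2 = a by Newton's method: x -> (x + a/x) / 2."""
    while abs(x * x - a) > tol:
        x = (x + a / x) / 2
    return x

print(newton_sqrt(2.0))  # 1.41421356...
```

Note the division of labour: the proof of existence and uniqueness is mathematics; producing the value for $a=2$ is computation within that accepted framework.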

These examples shall illustrate: Computation, in terms of algorithms (and maybe something else), is of course more than just the calculation of numbers. But it never questions the framework, the set of theorems; it uses them.

Computer algebra programs such as Mathematica are just such well-defined frameworks for Computing, but they are not suitable for proving mathematical theorems. They can only give hints on how to prove something, as a heuristic.


Here are various candidate descriptions of what Computation is, in contrast to Mathematics:
  1. Computation is a subfield of Mathematics, in particular it is applied mathematics
  2. Mathematics is about theorems, Computation is about their consequences.
  3. Computation is what Computability Theory and Algorithmic Information Theory is about.
  4. Computation is (sophisticated) Counting, like manipulating an abacus.
To each of these descriptions there are counter-arguments:
ad 1. & 2. Computation can be an essential part of a mathematical proof. For example, the four-color theorem was proven with the help of a computer program. Computer-assisted proofs are not accepted by all mathematicians, and it is important to understand why.
Computation can also address mathematical theorems themselves: automated theorem provers are programs that do some sort of Computation (here, the framework is not algebra but predicate logic) in order to prove theorems, not to apply them.
ad 3. Computability theory is concerned with questions like the halting problem. It can tell what kind of algorithms produce a result within a finite number of elementary steps. Algorithmic information theory is about the complexity of algorithms. Both are important questions, but they are rather Meta-theories of Computation, categorizing specific computations into various groups. They do not address the content of computation.
ad 4. Counting is a mental activity. An abacus is just a tool to do so.
A machine that is able to perform computations, say a Turing machine, can be described by states which can be mapped to the natural numbers. Each of the possible instructions acting on a set of data can be regarded as an elementary arithmetic step, like an increment. However, the program which tells the machine what to do is given as an input as well, and the complexity of what the machine does is encoded in the program. Hence Computing should focus on the algorithm itself, not on the realization of the algorithm in a specific setup.
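To make this concrete, here is a minimal Turing-machine-style sketch (the encoding is my own): a machine whose program, given as a transition table, increments a binary number. The complexity clearly sits in the table, not in the trivial stepping loop:

```python
# Transition table: (state, symbol) -> (write, move, next_state)
RULES = {
    ("inc", "1"): ("0", "L", "inc"),   # carry: flip trailing 1s to 0
    ("inc", "0"): ("1", "L", "halt"),  # absorb the carry
    ("inc", "_"): ("1", "L", "halt"),  # overflow onto a blank cell
}

def increment_binary(tape_str):
    """Run the machine above: add 1 to a binary number written on the tape."""
    tape = ["_"] + list(tape_str)  # one blank cell on the left for a possible carry
    head, state = len(tape) - 1, "inc"
    while state != "halt":
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += -1 if move == "L" else 1
    return "".join(tape).strip("_")

print(increment_binary("1011"))  # 1100  (11 + 1 = 12)
print(increment_binary("111"))   # 1000  (7 + 1 = 8)
```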

I would like to argue in favor of the view that there are systems which can be described purely in mathematical terms, but which are too complex to study directly. Mathematics is somehow concerned with the simple truths, but there are also complicated truths out there, which cannot be found without some sort of computation, because the mental capacities of humans for reasoning have limitations.
This is of course along the line of thought of Stephen Wolfram's "A New Kind of Science"; however, Computation, in my view, is not a new science at all, but it should be a paradigm to understand the common features of various phenomena such as evolution, the connection of brain and mind, and emergent properties of complex systems in general.

I conclude that the concept of Computation is context-sensitive: it requires a system of formal rules; Mathematics, in contrast, is universal and hence context-free. One might object that logic merely sums up the laws of thinking, but I am convinced that it is not only the law of thinking of man, but of any intelligent being.

This brings me to my last topic, the question whether reasoning is some sort of computation... Reasoning is what minds do - not brains, if there is a difference - but we have seen that there are programs for automated reasoning.
This question is not about whether the brain is something like a computer. I take it for granted that the brain processes inputs and generates outputs via a neural network. If this neural network were static, it could simply be characterized by a high-dimensional function relating inputs to outputs.
The question is whether the real-time dynamics of that neural network has anything to do with the content of the mind, and if so, whether this process can be characterized as a specific computation.

If we do not know what Computation really is, we cannot reject the hypothesis that our mental states are a result of Computing.

I apologize to the reader that these thoughts were not discussed systematically here; touching so many different subjects might have caused some confusion. I will address some of the raised issues in greater detail soon.

-----

PS: One last comment: not every physical phenomenon that can be studied via computer simulations (the weather/climate, lattice QCD) is a case of the (exact) Computing I have in mind here. These simulations use approximations (involving rounding errors) and/or pseudo-random numbers (as in Monte Carlo simulations). In actual simulations, one often has to trade accuracy for efficiency.
I rather think of computations like exact enumerations, which produce unambiguous answers without statistical errors.



Tuesday, March 06, 2012

Hegel's paradox revisited

Six years ago I summarized in this blog some thoughts on what I called Hegel's paradox.
I was puzzled by an argument (actually an anti-thesis) Hegel calls "verkehrte Welt" in his Phenomenology of Spirit (1807). 
This anti-thesis occurs after Hegel discussed the "Realm of Laws [of nature]" (the thesis to be attacked, see my old post) and it captures a beautiful ambiguity of two meanings, "wrong world" and "inverted world":

  1. wrong in the sense that the view outlined in the section on "Realm of Laws" is unsustainable, and
  2. inverted in the sense that the view is upside down and needs to be put on its feet again.

This is dialectics at its finest.

In short, what still bothers me is the distinction between laws of nature on the one side, and particular entities (such as particles, or sensations) on the other side. Hegel attempts to overcome this dualism, but his movement of thought leads almost directly to consciousness as its synthesis and shortly after to self-consciousness, which leaves me with many questions. I believe that there is a paradox which needs to be resolved.
I agree with Hegel (if I understand him correctly) that the distinction between laws and particulars is somehow wrong, but I am not sure what conclusion to draw from this.

My first idea, about six years ago, was that both aspects can be captured in the notion of information.
But this is a vague notion and I made no progress in understanding what information is "behind the scenes."
Now I became interested in what can be called the Philosophy of Computation.  There is not much literature on this subject, but it might be a promising enterprise, and it might shed light on the paradox as well.

We are familiar with the analogy of the mind as a computer, and that of the universe as a computer (see e.g. Konrad Zuse, Rechnender Raum). I like these analogies, but we should not necessarily think of digital (serial) computers similar to the one I am typing this blog entry on. The brain, too, does computations in some sense, although a neural network is very different from a processor.

At some fundamental level, consciousness could be an emergent phenomenon of computation (rather than of specific natural laws). Computation produces information. Also, computation does not presuppose a person who designed a program or who is interested in the result. Some kind of computation might simply emerge in complex systems without any pre-defined aim. Think of the occurrence of the Fibonacci series and the golden ratio in biological systems as an example.
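For example, the ratio of consecutive Fibonacci numbers converges to the golden ratio, a pattern that emerges from nothing but an iterated rule (a minimal sketch):

```python
def fibonacci(n):
    """Iteratively compute the n-th Fibonacci number (F(1) = F(2) = 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + 5 ** 0.5) / 2
print(fibonacci(30) / fibonacci(29))  # ≈ 1.6180339887, the golden ratio
```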

The precise equations of the fundamental laws might even become irrelevant, as long as they admit a framework for computation. But then the question what is more fundamental - the laws or the computing process - is pointless.

Computation, I believe, is a more general concept than natural laws, but is still more restrictive than mathematics as a whole. This is promising.