Models and Prediction

In which I talk about Zones of Thought and offer a much clearer =D alternative framework to Popper on how scientific theories are established

27 May 2013

There are two ways for a model to be bad: it can be full of wrong assumptions, or it can overfit. The classic example of overfitting is predicting the orbits of the planets. The original theory of epicycles - with the Earth at the center of the universe - worked well enough until new data forced revisions that kept increasing its complexity. The theory exhibited too much variance. And yet it was still predictive, and so cannot truly be labeled as wrong.

Indeed no model is ever fully wrong - or right - but a good modeler tries to be closer to right than wrong most of the time (i.e. good predictive accuracy). One technique for achieving this (not the only one) is a Bayesian model over possible theories. One uses priors as a regularizer - increasing bias to reduce variance and so curb overfitting - while also incorporating data to discount bad theories.
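To make the "priors as a regularizer" point concrete, here is a minimal sketch of my own (not from the post), under the standard assumptions of Gaussian noise and a zero-mean Gaussian prior on the coefficients: the MAP estimate is exactly ridge regression, and a tighter prior trades variance for bias.

```python
import numpy as np

# Toy sketch (my own, not the post's): with y = X w + noise, noise ~ N(0, sigma^2)
# and a prior w ~ N(0, tau^2 I), the MAP estimate is ridge regression with
# lambda = sigma^2 / tau^2. The prior shrinks coefficients and curbs overfitting.

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 8)
y = 2.0 * x + 0.5 + rng.normal(0, 0.1, size=x.shape)   # data that is truly linear

degree = 7                                   # a deliberately over-flexible "theory"
X = np.vander(x, degree + 1, increasing=True)

sigma2, tau2 = 0.1**2, 1.0**2                # noise variance, prior variance
lam = sigma2 / tau2                          # tighter prior -> more bias, less variance

w_ml  = np.linalg.lstsq(X, y, rcond=None)[0]                           # max likelihood (overfits)
w_map = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)   # MAP / ridge

print("ML coefficients :", np.round(w_ml, 2))
print("MAP coefficients:", np.round(w_map, 2))   # shrunk toward zero
```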

But if a model is predictive, why should one care if it overfits? Occam's Razor is often mistakenly used to defend simpler theories as if they were inherently better. But this is not true. Occam's Razor really only says to prefer the model that 1.) saves energy and space and 2.) is simpler. Preferring (1) is a matter of economy: why waste resources when there is no need? (2) is about avoiding overfitting. If I have 5 points I can hit them all exactly with a fourth-degree polynomial - seemingly better than a simple linear fit. But this commits you to one particular model in the space of polynomials, more likely to go wrong than not. A simple linear fit makes far fewer commitments about the shape of the data and will do well enough in the near term.
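Here is a sketch of that 5-point example, with made-up data of my own: the exact polynomial fit nails every observed point but commits to a wild shape, while the linear fit extrapolates more sensibly just beyond the data.

```python
import numpy as np

# Toy data (my own invention): roughly linear points with a little noise.
rng = np.random.default_rng(1)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.5 * x + rng.normal(0, 0.3, size=x.shape)

exact  = np.polynomial.Polynomial.fit(x, y, deg=4)   # passes through all 5 points
linear = np.polynomial.Polynomial.fit(x, y, deg=1)   # makes far fewer commitments

x_new = 5.0                                          # a point just "in the near" future
print("exact-fit prediction:", round(exact(x_new), 2))
print("linear fit prediction:", round(linear(x_new), 2))
print("true value (approx.) :", round(1.5 * x_new, 2))
```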

Focusing more on (1), what does it mean for a theory to save energy and space? We can appeal to two types of complexity. Kolmogorov complexity is a kind of descriptive complexity: how much space would it take to write down everything there is to know about this thing? Simpler, more compressible things take less space. One can also take less space by ignoring unimportant details - a kind of lossy compression - e.g. Newtonian gravity versus general relativity.
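As a crude illustration of descriptive complexity (my own aside, and only a proxy, since Kolmogorov complexity itself is uncomputable), compressed length gives an upper bound: law-like, patterned data compresses much better than patternless data.

```python
import os
import zlib

# Crude proxy for descriptive complexity: the length of a compressed
# description. A highly patterned sequence compresses well; random bytes
# are effectively incompressible.

regular = bytes(i % 7 for i in range(10_000))   # highly patterned sequence
random_ = os.urandom(10_000)                    # patternless noise

print("patterned:", len(zlib.compress(regular, 9)), "bytes compressed")
print("random   :", len(zlib.compress(random_, 9)), "bytes compressed")
```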

The energy aspect is computational complexity, which basically asks: how much energy would it take to compute an answer to this question? General relativity is not much more complex than Newton's theory in terms of descriptive complexity - it's just a couple of lines of partial differential equations. But it is much more computationally/energy intensive to calculate, and most of the time one does not need that level of nuance. Quantum Mechanics is a simpler theory in the sense of requiring both less computation[^1] and not much descriptive complexity.

Interestingly, this notion that better theories are not more correct, but rather strike a better trade-off between being economical and being predictive, cleanly addresses the question of whether science lets us paint an ever more accurate picture of the universe. The answer is no - although the question itself is ill defined. What is true is that with science we are continually improving our ability to model and predict things about the world. That is essentially what one understanding of Quantum Mechanics, called QBism (Quantum Bayesianism), says.

It states that Quantum Mechanics is the correct probability theory - the most accurate way found so far to incorporate information about the universe and make predictions. That is, the wave function is neither some many-branched World-Tree of Fates nor a distribution over some underlying reality (in fact a recent result shows the latter is not possible, so anyone who believes it must believe Quantum Mechanics is fundamentally mistaken - i.e. that the Bell inequality is somehow inapplicable). Instead, it is better to think of the world as if it were made up of tiny little things whose behavior is commensurate with sampling from a complex-number probability space. Thinking this way lets you best predict what things in the universe will do. It does not say that the actual nature of the universe is probabilistic, only that you can hardly do better than that, for whatever reason.

It is very important to note that Quantum Mechanics does not demolish determinism. Indeed the World Tree interpretation (aka many-worlds) is deterministic - and it can be a useful tool for thinking about some types of problems, even if the universe-tree is not constantly growing new branches (sloppy phrasing, I know; talking about the universe as if it were a thing is arguably meaningless).
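To make the "complex-number probability space" phrase above slightly more concrete, here is a toy two-path setup of my own (not from the post): amplitudes add and probabilities are squared magnitudes, producing interference that no classical mixture of the two paths can reproduce.

```python
import numpy as np

# Two paths to the same outcome, each taken with "weight" 1/sqrt(2) but with
# different phases. Quantum rule: add amplitudes, then square the magnitude.
# Classical rule: add probabilities. The cross term is the interference.

a1 = (1 / np.sqrt(2)) * np.exp(1j * 0.0)        # amplitude for path 1
a2 = (1 / np.sqrt(2)) * np.exp(1j * np.pi)      # amplitude for path 2, phase shifted

p_quantum   = abs(a1 + a2) ** 2                 # ~0 here: destructive interference
p_classical = abs(a1) ** 2 + abs(a2) ** 2       # 1.0: probabilities just add

print("quantum  :", round(p_quantum, 3))
print("classical:", round(p_classical, 3))
```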

What about wrongness of assumptions? The theories easiest to spot as wrong-headed are often simple while failing to generalize the phenomenon - the domain of so-called cranks. Wrongly complex theories are much harder to produce. In fact I believe that, with the exception of Mathematics (is the Riemann Hypothesis proven? The answer is not known), humans are not intelligent enough to do that kind of thing. This makes a kind of sense: a wrongly complex theory has to be obviously wrong to some being, and that you needed the complexity suggests you are simply not smart enough to see how trivial and woefully inadequate the assumptions are. What about consistency?

Well, it turns out consistency is not really a useful metric. All physics theories harbor a number of contradictions or resist axiomatization. That is fine - no big deal; we are adept at maneuvering around the issues. Whereas one can have a consistent but cumbersome theory of epicycles that is clearly not the best tool. The key issue is contradictory assumptions/postulates, more than mere inconsistencies within the theory.

So in summary, you want economical yet predictive theories. No one actually tries to falsify theories; instead people keep adjusting and fiddling until a clearly superior alternative (low Kolmogorov and computational complexity, high predictive accuracy) arrives, e.g. the aether vs Special Relativity.

[^1]: On what grounds do I justify that General Relativity is more computationally intensive than Quantum Mechanics? Well, to a mere polynomial-time machine everything here can be simulated, just with exponential slowdown, which muddies intuition. Another approach is to invert the question and ask instead: how much computation could a universe do, in which particular aspects of the theory are physical and harnessable in principle? If you have read Vernor Vinge's A Fire Upon the Deep then you will have read about Zones of Thought. Wikipedia describes it as:

The novel posits that space around the Milky Way is divided into concentric layers called Zones, each being constrained by different laws of physics and each allowing for different degrees of biological and technological advancement. The innermost, the "Unthinking Depths", surrounds the galactic core and is incapable of supporting advanced life forms at all. The next layer, the "Slow Zone", is roughly equivalent to the real world in behavior and potential. Further out, the zone named the "Beyond" can support futuristic technologies such as AI and FTL travel. The outermost zone, the "Transcend", contains most of the galactic halo and is populated by incomprehensibly vast and powerful posthuman entities.

We are on the outside edge of the Slow Zone, but how differently could we imagine the universe under each theory? General Relativity is an exotic theory. It allows for time travel, FTL travel, and closed timelike curves (CTCs). A universe with the full power of General Relativity would be an incredible one, with god-like entities. In terms of computation, depending on how we look at it, CTCs allow anything from solving all of PSPACE (already amazing) to all computable problems (excepting undecidable ones - it would not be a mystical hypercomputer; it can't in principle compute more than a Turing machine). A universe of General Relativity would be like the "Transcend". Not only would AI be possible, they would be god-like intellects able to easily solve NP problems that we believe would take exponential time today.

On the other hand, Quantum Computing would take us to the low-to-middle Beyond - except with no FTL travel (and no FTL communication either, contrary to common belief). We would learn a lot about molecular dynamics, biology, materials, etc. We would end up able to simulate and accelerate a great deal of science. But huge swathes of algorithms, including in AI, would be left virtually untouched.
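As a rough aside of my own (the post does not mention Grover's algorithm): the best known quantum speedup for unstructured search is only quadratic, roughly (pi/4)·sqrt(N) queries versus about N/2 classically, which is one way to see why exponentially hard problems, and much of AI with them, remain largely out of reach.

```python
import math

# Back-of-the-envelope comparison: Grover-style search is ~ (pi/4) * sqrt(N)
# queries vs ~ N/2 classically. A quadratic speedup barely dents an
# exponentially large search space.

for n_bits in (20, 40, 60):
    N = 2 ** n_bits
    classical = N / 2
    grover = (math.pi / 4) * math.sqrt(N)
    print(f"{n_bits}-bit space: classical ~ {classical:.2e} queries, "
          f"quantum ~ {grover:.2e} queries")
```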

This is another thing we need a theory of quantum gravity for. Would it give us computers more powerful than General Relativity does, or something between QC and GR (my bet)? What are the most powerful computers we could possibly build? What Zone of Thought do we inhabit?