Levels of Understanding

Image credit: Freepik, CC BY 3.0


I believe levels of understanding can be usefully captured in three levels: memorization, topic extraction and compression. The first level is the domain of computers: searching, indexing, clipping and so on. Among humans, I believe savants dominate memorization. It is also, temporarily and far less reliably, the level most students achieve on tested subjects. Recently, it has become how most people operate online: 'cyborgs' stitching information snippets together without any deep comprehension. There are a lot of them on the internet.

Distributed Recall

Level 2 is topic extraction. A lot of human knowledge takes this form, where you don't quite remember what you read, but: 'you know, it had to do with this or maybe it was that and...are you sure you didn't read it too? Oh well, anyways it was interesting.'

Thanks to search tools like Google, you can often use these clues to quickly reconstruct the original information—with varying success rates (but much greater than before). In this way, we combine computers—which are good at memorization—with humans, who can extract topics, to augment recall. The combined system allows better and deeper recall than either system alone. At this point I'd like to emphasize that this storing of information outside the brain is nothing new. It is actually a core aspect of how the brain works.

For example, it is very common to forget something and then trivially remember it on returning to the location where you thought of it. Or consider leaving something by the door lest you forget it. These triggers and this associative recall are how the brain efficiently manages memory, shuffling priorities into and away from conscious action by outsourcing them to the environment. Search is but a continuation of a long tradition. It is not by any means making us 'dumber' (what is making us dumber is the increase in noise: from scientific research to blog posts, the majority is confusion). It is in fact helping the brain do what it has always done, better than ever before.
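
As a rough sketch of that combined system, the snippet below pairs a machine-side inverted index (exact memorization) with a human-side bag of half-remembered topic words. The three documents, the function names `build_index` and `recall`, and the vote-counting rule are all illustrative assumptions, not a description of how any real search engine ranks results.

```python
from collections import Counter, defaultdict

# Hypothetical "library": the machine remembers every word exactly.
documents = {
    "doc1": "pigeons learn to discriminate words from non-words",
    "doc2": "sparse coding of natural images in visual cortex",
    "doc3": "jpeg compression discards high frequency detail",
}

def build_index(docs):
    """Map each word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def recall(index, clues):
    """Rank documents by how many of the vague clue words they contain."""
    votes = Counter()
    for word in clues.lower().split():
        for doc_id in index.get(word, ()):
            votes[doc_id] += 1
    return [doc_id for doc_id, _ in votes.most_common()]

index = build_index(documents)
# The human side: a fuzzy memory of the topic, not of the text itself.
best = recall(index, "something about birds or pigeons reading words")
print(best)
```

Neither half suffices alone: the index cannot guess what you meant, and the clues cannot reproduce the text.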

There is still a lot of friction, however: search, clipping and the like are still expensive, active processes, and extracting the relevant information from results is a non-trivial expense of its own. The cost is high enough that search is not always undertaken. How else can you explain people saying 'IIRC, this is blah blah' when that information is readily available online? [footnote: to those of you who remember libraries, this is not about being spoilt; the point is that the shorter the time between thought and feedback, the greater the boon to cognition, and that relationship is highly non-linear].


The most sophisticated level of understanding is compression. I conjecture that understanding is first a direct search for the combination of existing bases that best represents some concept, and those bases are then used to represent the new knowledge. In other words, metaphor and analogy are key to reasoning; a corollary is that we never start from scratch, since some structures are built in. Mastery is the ability to form arbitrary linear combinations of the basis concepts in this new space. Importantly, the dictionary or basis set is not necessarily fixed: additional and often superfluous (not linearly independent) vectors can be added as necessary. Hypergraphs, or relations and projections, I conjecture, best represent the links between different spaces. The spaces are likely fluid, contracting and expanding as needed, mapping between themselves and building structures (combinations of new concepts) based on functions that worked in other spaces.
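
To make the metaphor concrete, here is a toy sketch in which "concepts" are plain vectors and representing a new concept means finding the coefficients that express it in terms of known ones. The three `known_concepts`, the `new_concept`, and the helper names `solve` and `combine` are made up for illustration; the solver is ordinary Gauss-Jordan elimination and assumes the known concepts are linearly independent.

```python
from fractions import Fraction

def solve(basis, target):
    """Return coefficients c with sum(c[i] * basis[i]) == target.

    Assumes `basis` holds n linearly independent vectors of length n.
    """
    n = len(basis)
    # Augmented matrix: columns are the basis vectors, last column is the target.
    rows = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Normalise the pivot row, then clear this column from every other row.
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def combine(basis, coeffs):
    """Form the linear combination sum(coeffs[i] * basis[i])."""
    size = len(basis[0])
    return [sum(c * v[i] for c, v in zip(coeffs, basis)) for i in range(size)]

# Hypothetical, deliberately non-orthogonal "known concepts".
known_concepts = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
new_concept = [2, 3, 5]

coeffs = solve(known_concepts, new_concept)
print(coeffs)                           # the new concept "in terms of" old ones
print(combine(known_concepts, coeffs))  # reconstructs new_concept
```

In this picture, "mastery" corresponds to calling `combine` with any coefficients you like, not just the ones recovered for a single example.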

A corollary is that, while learning new things expands the mind, understanding is a contraction of bases: it reduces dimensionality. Understanding can identify that two concepts previously thought independent are actually codependent, and can thus both be expressed in terms of some other, more fundamental basis concepts. On the other hand, learning can require additional bases (found by optimization or search) to represent truly foreign concepts. This suggests that the bases are not truly orthogonal, and that understanding is the search for ones that are.
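
A small sketch of that contraction, again treating concepts as vectors: the function below walks through a list of concepts and keeps only those that are not already expressible through the ones kept so far. The four example vectors and the name `independent_subset` are assumptions chosen for illustration; the method is plain row reduction with exact fractions.

```python
from fractions import Fraction

def independent_subset(vectors):
    """Greedily keep vectors that are not linear combinations of those kept."""
    kept = []      # original vectors judged independent
    reduced = []   # (pivot_index, reduced_row) pairs used for elimination
    for v in vectors:
        row = [Fraction(x) for x in v]
        # Subtract out everything explainable by the concepts kept so far.
        for p, r in reduced:
            if row[p] != 0:
                factor = row[p] / r[p]
                row = [a - factor * b for a, b in zip(row, r)]
        pivot = next((i for i, x in enumerate(row) if x != 0), None)
        if pivot is not None:
            # Something is left over, so this concept adds a new dimension.
            reduced.append((pivot, row))
            kept.append(v)
    return kept

# Hypothetical concepts; the third is secretly the sum of the first two.
concepts = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 2]]
core = independent_subset(concepts)
print(len(concepts), "concepts reduce to", len(core), "bases")
```

Learning, in the sense used above, is the case where `pivot` is found and the basis grows; understanding is the case where the row reduces to zero and nothing new is needed.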

This is more than a useful metaphor; there is real work in this direction. When you create a JPEG, it is in a sense a very local 'understanding' of the image: it throws away aspects that matter little while still allowing an approximate reconstruction. More complex neural-net-based image generators or compressors learn something not just of colors but also of correlations that map to what we might call textures or sections of objects. Given an appropriate loss, the algorithms could be turned to learning what aspects of color are relevant for human vision (note: correlations, not abstractions; let's leave abstractions as correlations converted to symbols used for unrelated reasoning by an intelligence).
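
The JPEG idea can be sketched in one dimension. The code below applies an orthonormal discrete cosine transform to a single made-up row of pixel intensities, keeps only the strongest few coefficients, and reconstructs. It is a heavy simplification and an assumption on my part about what is worth showing: real JPEG works on 8x8 blocks with quantization tables and entropy coding, none of which appears here.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out

def idct(coeffs):
    """Inverse of `dct` (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]

def compress(signal, keep):
    """Zero out all but the `keep` largest-magnitude DCT coefficients."""
    coeffs = dct(signal)
    strongest = sorted(range(len(coeffs)),
                       key=lambda k: abs(coeffs[k]), reverse=True)[:keep]
    return [c if k in strongest else 0.0 for k, c in enumerate(coeffs)]

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

row = [52, 55, 61, 66, 70, 61, 64, 73]  # one made-up row of pixel intensities
errors = {keep: rms_error(row, idct(compress(row, keep))) for keep in (1, 3, 8)}
for keep, err in errors.items():
    print(f"keeping {keep} of {len(row)} coefficients: RMS error {err:.3f}")
```

Keeping every coefficient reproduces the row up to rounding; keeping fewer trades fidelity for a shorter description, which is the sense of 'understanding' meant here.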

Vision in the brain also works similarly, except that the bases are overcomplete (using far larger dimensionalities than that of the signal), and hence the vectors are sparse. There the learned bases are akin to dictionaries, and visual signals are sparsely coded with respect to such a dictionary. While the mind is certainly more than a vector space, the fact that vision is so well modeled this way, together with how evolution tends to conserve and reuse, leads me to think of this model of understanding as key.
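
Here is a minimal sketch of sparse coding against an overcomplete dictionary, using greedy matching pursuit. The dictionary (four unit-length atoms for a two-dimensional signal) is hand-picked for illustration rather than learned, and matching pursuit is just one simple way to find a sparse code; I am not claiming it is what the visual system does.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, atoms, max_atoms=2, tol=1e-9):
    """Greedy sparse code of `signal`. Assumes every atom has unit length."""
    residual = list(signal)
    code = [0.0] * len(atoms)
    for _ in range(max_atoms):
        if math.sqrt(dot(residual, residual)) < tol:
            break  # nothing left to explain
        # Pick the atom most correlated with what is still unexplained.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        code[best] += scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return code, residual

h = 1 / math.sqrt(2)
# Overcomplete: four atoms for a two-dimensional signal.
atoms = [[1, 0], [0, 1], [h, h], [h, -h]]
signal = [3 * h, 3 * h]  # a diagonal "edge"

code, residual = matching_pursuit(signal, atoms)
print(code)
```

With only the first two atoms the signal would need two non-zero coefficients; the redundant diagonal atom lets it be written with one, which is the trade that overcompleteness buys.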

A Fourth Level and the Lovelace Objection

A common objection to AI is that it has not displayed true understanding; it is just performing such and such a mathematical function. In my view, the correct response to this is: "why would it be any other way, for anything else?" The idea that happiness and anger might in fact have a mathematical description seems unacceptable to many. On the other hand, it does mean that emotions are algorithms that satisficed some optimality criterion and are not merely dismissible as irrationality or brokenness.

There is a sense, however, in which the expressed disagreements about whether AI is yet creative can be seen as not wrong. Imagine an object trying to regulate itself. A dynamical system trying to maintain homeostasis against some non-static background must have a model, even a simple one, of that background.
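
A toy simulation illustrates the point. Both regulators below try to hold a quantity at zero while a slowly varying background pushes on it; one merely reacts, while the other carries the crudest possible model ("the next push will look like the last one"). The sinusoidal `background`, the update rule and the function name `run` are all assumptions made up for this sketch.

```python
import math

def background(t):
    # Hypothetical slowly varying disturbance pushing on the system.
    return math.sin(0.3 * t)

def run(use_model, steps=200):
    """Return the mean absolute deviation from the set point (zero)."""
    x = 0.0          # the regulated quantity
    predicted = 0.0  # the regulator's guess at the next disturbance
    total_error = 0.0
    for t in range(steps):
        # Cancel the current deviation, and the expected push if modelled.
        action = -x - (predicted if use_model else 0.0)
        before = x + action
        x = before + background(t)
        # Simplest possible model: the next push equals the one just observed.
        predicted = x - before
        total_error += abs(x)
    return total_error / steps

reactive = run(use_model=False)
modelled = run(use_model=True)
print(f"reactive: {reactive:.3f}  with model: {modelled:.3f}")
```

Even this one-step memory of the background noticeably improves regulation, because the background changes slowly relative to the regulator's time step.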

In order for a prediction machine to operate effectively, it needs to capture statistical properties of the world around it. Vision correlates with the outside world but should not be mistaken for anything other than an uncertain representation thereof. It is in this sense that the statistical representation can be said to correlate with the true signal. This property can be very general, as seen in pigeons, which are able to learn to discriminate between words and non-words (see: Orthographic processing in pigeons (Columba livia)).

Ghosts and Projections of Thinking Minds are Animate

Consider a robot arm connected to a computer. The robot was trained on drawings and can now generate drawings it has never before seen. The robot arm was not explicitly programmed; it generates things that it was never trained on, and its output is not random. The robot is like nothing we have ever seen before, and while many are happy to call it creative, this does not sit well with most. I too was once happy to call such programs creative but recently stopped, upon arriving at a distinction. Ada Lovelace once eloquently stated:

The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths.

Our robot arm (buildable today) can do things we did not order it to perform, and it can be said to originate. But it still has no power of anticipating analytical relations or truths, because the arm cannot exit the manifold its learning algorithms sought out. Any movement too far from that manifold leads to collapse and the generation of noise because, as the Lady Ada presaged, analytical relations cannot yet be anticipated.

A picture can be taken as a snapshot or projection of reality. The robot arm could similarly be considered a projection, echo or ghost of the drawing ability of the countless poorly paid souls which provided its training data. That robot is animate but not in the manner that a human is. Yet, as a projection of the thoughts of a thinking thing, it is something more than a mere picture or a recording.

Human creativity is different

When learning, humans seem somehow able to develop higher-order conceptions of the nature of a search space. We learn not just an embedding or lower-dimensional submanifold, as today's algorithms do, but can also detect non-trivial symmetries, invariances and even algebras. We likely operate on statistical manifolds, moving from model to model and somehow constraining movement in the space of models such that a surprising percentage of moves are fruitful. This might be why we are good at poker, chess and Go while using far fewer resources for learning and for play. Flexibility in adapting to underlying structure displays a level of understanding far beyond what any algorithm to date has achieved.

For example, we can notice that the weather and brains are both describable in terms of dynamical systems. Learning about the physics of weather, and about which of its properties the brain does not share, can serve to constrain hypotheses about the brain. Or consider constructing a general object and deducing many things about it. Later, you find some new thing, prove it has the property that makes it an instance of that general object, and instantly gain knowledge about your new thing. Whether this is deduction or some sophisticated way of conditioning on a space, these kinds of long-range connections are things we have only the barest inklings of how to achieve.
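
The "general object" move has a familiar small-scale analogue in programming. In the sketch below the general object is a monoid (an identity element plus an associative combine), and the one thing deduced about it is that repeated combination can be done in logarithmically many steps. The `Monoid` class, `power` function and the three instances are illustrative assumptions; associativity is asserted by the programmer, not checked by the code.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")

@dataclass(frozen=True)
class Monoid(Generic[T]):
    identity: T
    combine: Callable[[T, T], T]  # assumed associative; not verified here

def power(m, x, n):
    """Combine x with itself n times using O(log n) combines.

    Proved once for every monoid: it relies only on associativity.
    """
    result = m.identity
    while n > 0:
        if n & 1:
            result = m.combine(result, x)
        x = m.combine(x, x)
        n >>= 1
    return result

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0],
             a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0],
             a[1][0] * b[0][1] + a[1][1] * b[1][1]))

# "New things" shown to be monoids inherit `power` for free.
text = Monoid("", lambda a, b: a + b)
product = Monoid(1, lambda a, b: a * b)
matrices = Monoid(((1, 0), (0, 1)), matmul)

print(power(text, "ab", 3))
print(power(product, 2, 10))
# Powers of this matrix hold Fibonacci numbers off the diagonal.
print(power(matrices, ((1, 1), (1, 0)), 10)[0][1])
```

Nothing about Fibonacci numbers was put into `power`; the knowledge arrives the moment 2x2 matrices are recognised as an instance of the general object. What we lack is a machine that finds such recognitions on its own.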

Some promising directions today are Gaussian processes and generative models that properly account for complexity across a model class.

Another difference can be found when things mean something to us. Something that might feed into that is our ability to work not just from correlations and conditional distributions, but also to bind signals to percepts and then reason with them. We evolved in a world with actors, where many things have causes and there is a need to model the internal states of other actors. This object binding, together with a ratcheting of complexity in modelling and recursivity, likely feeds into our notion of the meaningful. Beyond compression, we are better able to appreciate the general structure of a search space, and to bind signals to more abstract, aggregated, usefully coarse-grained and recursively modeled symbols.

It's worth noting that humans are not perfect and suffer large limitations. Humans function best in rich spaces with lots of structure, such as when signals have a hierarchical structure that is well decomposable. When the space is rich enough, this yields great results, but human reasoning can really fall apart in more general domains. Nonetheless, it is in the two very important senses mentioned above (1. we can deduce things far beyond what we experience; 2. when we create, we can anticipate other agents in a rich way in terms of symbols and use that to guide our choices) that robots neither understand nor create.

But Does it matter?

The architecture of computers means they are better suited to certain kinds of computations and humans to others. We assume that because we are intelligent and conscious, that is the ideal form. But such is not necessarily true. Spiders and ants are very successful, without also being in danger of self-termination. A thing that was hyperspecialized at, say, nuclear physics or genetics without being causally oriented or conscious would probably do more good than a conscious thing whose eventual greed we might have cause to be wary of.

The shared ancestor of humans and chimps had something unique about its genetics. Chimps diverged and lost it, but some lineages developed it further down the hominid line; they competed, merged and resulted in a type of consciousness known as human. There is no indication that, throughout the long history of life on the planet, anything else like it ever developed. It might be that our type of self-modelling consciousness is a rare and difficult-to-arrive-at modality.

Some might imagine there is some kind of hierarchy, and that other AIs will be human-like, capable of goals and self-direction. But it might just as well be that the space of intelligences is mostly one of highly specialized decidable compression algorithms that are really good at solving one particular class of problem.