The Language-World Gap
How language creates a barrier between the world and ourselves
I will briefly argue that the propositions contained within language expressions cannot be evaluated as true or false in a traditional way, and that consequently there exists a gap between what can be expressed and reality. I started thinking about this problem soon after learning about compression algorithms related to digital images, and it will soon become apparent why these are relevant. My argument relies on several assumptions about the nature of language, and how information is encoded and decoded. I shall start off with a quick refresher on compression algorithms.
Compression in computer science is a process by which data is reduced in size with minimal loss of information. There are a variety of compression algorithms used in computer science, depending on the type of information they are designed to compress. However, a lot of them rely on the concept of redundancy reduction. If we wish to compress a large image file for instance, we can write a function that takes the image as input, and outputs a list of instructions with which the image can be reconstructed.
If, for example, an image is a pure blue square, 800 by 800 pixels in size, then the amount of redundant information inside the image is maximal. The instruction to reconstruct the image is just that: place 800 blue pixels in a row, 800 times; a great deal of compression is achieved, since we must only save a single pixel, and an instruction to replicate it.
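For the curious, this kind of run-length compression can be sketched in a few lines of Python (the function names are my own, not from any particular library):

```python
def rle_encode(pixels):
    """Compress a flat list of pixels into (value, run length) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([p, 1])   # start a new run
    return runs

def rle_decode(runs):
    """Reconstruct the original pixel list exactly; nothing is lost."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out

# An 800 by 800 pure blue square collapses to a single instruction:
blue_square = ["blue"] * (800 * 800)
encoded = rle_encode(blue_square)
assert encoded == [["blue", 640000]]          # one pixel, one repeat count
assert rle_decode(encoded) == blue_square     # lossless round trip
```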
In a picture of a cat, the instructions to recreate the image will be somewhat more complex. Even with a photograph, however, not every pixel is different. There will be plenty of blue pixels clustered together depicting the sky behind the cat, and many of the furry bits can be pretty similar in appearance as well. Thus, the compression function could say something like: 75 blue pixels, then 3 brown pixels, then 18 pixels of a slightly darker brown, etc.
The compression in the case of the cat image will be less efficient than the compression of the blue square, simply because there are fewer redundant pixels that can be eliminated and replaced by copy instructions. It is important to note that the instructions themselves also take up space, and thus it is possible for a compressed image to actually be larger in size than the original. This happens when the amount of redundant information is less than the instructions needed to describe the image. One of the most extreme examples would be a picture of white noise, in which black and white pixels (or coloured pixels, for that matter) follow each other in more or less random order. The set of instructions necessary to recreate this white noise image is larger than the image itself.
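The point about instructions taking up space can be made concrete. If each run is stored as a (value, count) pair, a reasonable accounting is two units of storage per run; a minimal sketch, with my own accounting, for illustration:

```python
def rle_cost(pixels):
    """Storage cost of a run-length encoding, counting 2 units per (value, count) pair."""
    runs = 1
    for prev, cur in zip(pixels, pixels[1:]):
        if cur != prev:
            runs += 1
    return 2 * runs

solid = [0] * 1000      # maximal redundancy: one long run
noise = [0, 1] * 500    # worst case: every pixel starts a new run

assert rle_cost(solid) == 2       # two numbers describe a thousand pixels
assert rle_cost(noise) == 2000    # the "compressed" image is twice the original size
```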
There exists the further possibility to use a different type of compression algorithm. Instead of trying to reproduce the image from instructions pixel by pixel, we can reproduce the “essence” of the image, by recognizing that it is white noise. Thus, we simply instruct the decoding function to produce an image of white noise. From the observer standpoint, nothing much has changed, but in reality, we’ve done two very different things.
In the former compression, we’ve cut out redundancy, but preserved all information. In the latter account, we’ve destroyed information, and reproduced an approximation of the original. This is a one-way street, in that the original image cannot be reliably recreated exactly as it was.
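A toy version of this lossy, essence-only scheme might look like the following (a sketch under my own naming; real learned encoders are vastly more sophisticated):

```python
import random

def lossy_encode(pixels):
    """Reduce the image to its 'essence': a bare label. Pixel-level information is destroyed."""
    return "white noise"

def lossy_decode(label, size, seed=0):
    """Produce *an* image fitting the label, not *the* original image."""
    rng = random.Random(seed)
    return [rng.choice([0, 1]) for _ in range(size)]

rng = random.Random(1)
original = [rng.choice([0, 1]) for _ in range(1000)]   # the source "image"
label = lossy_encode(original)
reconstruction = lossy_decode(label, len(original), seed=2)

# Same essence, different image: the original cannot be recovered from the label.
assert label == "white noise"
assert len(reconstruction) == len(original)
assert reconstruction != original
```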
Artificial neural network encoders and decoders use this approach, rather than lossless compression, and the accuracy with which they can capture the ‘essence’ of an image is remarkable. Thus, they can reliably recognize pictures of cats, and reproduce them in a very similar manner. But it is important to recognize that in this approach, the input image X bears no relation to the output image X* other than in its accurate reproduction of key features of X. Let’s call this key-feature method of compression “abstraction” here.
More compression algorithms exist. In 1822, for instance, the French mathematician Joseph Fourier recognized that many functions could be decomposed and re-interpreted as an infinite sum of simpler sinusoidal functions, or harmonics. This can be applied to images, such that any image can be described as an infinite sum of sinusoidal waves, each with its own coefficient. Instead of storing the image, all we need to do is store a set of coefficients, and we can more or less accurately recreate the image. Here too, information is inevitably lost, since we cannot store an infinite set of coefficients. High- and low-pass filters in image editing software are based on cutting off the list of coefficients early, or starting it late, such that only some features of the image remain.
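The coefficient-truncation idea can be demonstrated in one dimension with a hand-rolled discrete Fourier transform (pure Python for illustration; real software would use the FFT):

```python
import cmath

def dft(signal):
    """Forward discrete Fourier transform: signal -> coefficients."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def reconstruct(coeffs, keep):
    """Rebuild the signal using only the coefficient indices in `keep`."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in keep).real
            for t in range(n)]

signal = [1.0] * 8 + [0.0] * 8              # a sharp edge, like an object boundary
coeffs = dft(signal)

full = reconstruct(coeffs, range(16))                # every coefficient: exact
lowpass = reconstruct(coeffs, [0, 1, 2, 14, 15])     # few coefficients: a blurred edge

err_full = max(abs(a - b) for a, b in zip(full, signal))
err_low = max(abs(a - b) for a, b in zip(lowpass, signal))
assert err_full < 1e-9       # lossless when the whole list is kept
assert err_low > err_full    # truncating the list loses information
```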
Compression in language
We are confronted with the raw reality of the world we inhabit. The amount of information that surrounds us is vast and mostly inaccessible to us, as we are limited by our sensory perceptions. How exactly is information gathered and processed by our brains such that it can be made useful? And further, how can we communicate this information to others?
I would argue that natural language functions much like the abstraction type of compression, by simple necessity. It is impractical, and probably impossible, to describe things without loss of information, and it is not clear what that would even look like. This is due both to the limited acuity and breadth of our sensory perception, and to the amount of time it would take to encode and decode all non-redundant information. Similar to the white noise image example, a full description of the universe using natural language would probably not fit within the universe, since language is not fine-grained enough to be sufficiently accurate.
Besides, what is the real world equivalent of a pixel? That is, what is the smallest piece of information accessible to us, which could be used in a lossless information transmission function? I don’t know.
Thus, I think the way our minds, and ultimately our language, encodes and decodes information is probably more similar to how neural networks do it. That is, we extract the necessary key features of what we wish to describe, and then let the other person unpack these concepts based on their own neural decoding structure.
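One way to picture this: the "message" carries only a key feature, and each listener regenerates the scene with their own decoder. A toy sketch (the feature and the decoders are my own invention, purely illustrative):

```python
import random

def encode_features(pixels):
    """Keep only a single 'key feature': the fraction of dark pixels."""
    return sum(pixels) / len(pixels)

def decode(feature, size, seed):
    """Each listener unpacks the feature with their own decoding structure (here, a seed)."""
    rng = random.Random(seed)
    return [1 if rng.random() < feature else 0 for _ in range(size)]

scene = [1] * 300 + [0] * 700           # the speaker's sensory input
feature = encode_features(scene)        # all that is actually transmitted: 0.3

listener_a = decode(feature, 1000, seed=1)
listener_b = decode(feature, 1000, seed=2)

assert listener_a != listener_b                        # two different reconstructions,
assert abs(sum(listener_a) / 1000 - feature) < 0.1     # yet both roughly match the
assert abs(sum(listener_b) / 1000 - feature) < 0.1     # transmitted key feature
```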
If we agree that language is like the abstraction type of compression then we run into a problem of reference, or a lack thereof. Let’s recall the simple case of a white noise image. The key features of white noise are just that: white noise. That is, the easiest compression of the image is achieved by describing its essence, rather than its exact contents.
Image A → Encoding (“white noise”) → Transfer (language) → decoding (“white noise”) → Image B
Importantly, image A and image B are not the same image. They bear no relationship to one another, other than being sufficiently similar in their essence, or key features. Sufficiently similar to be useful to us, that is.
However, how can we then make proper reference to the original image, given only image B? Image B is not identical with image A, and Image B does not refer to image A. So in any proposition in which image B occurs, how can the proposition be evaluated as being either true or false, if image B refers to nothing particular in the world? Where is the truth condition for the proposition?
Here is a more structured way of putting it:
1) Reality is real and independent of us (basic assumption)
2) We have access to reality via our sense perceptions (more or less, see below)
3) We cannot transmit information about reality in a lossless way
4) Therefore, we encode information in language via abstraction (as described above)
5) The objects created by this abstraction have no real world equivalent, since they are merely reproductions of key features of reality
6) Therefore, the objects created by language do not directly reference anything particular in reality
7) Therefore, any evaluation of the truth value of propositions containing these objects has no particular truth-making conditions in reality
8) Therefore, propositions containing these objects cannot be evaluated as being true or false
The second premise is a simplification in itself, since our sense perceptions are likely afflicted by a problem similar to our ability to transmit information: they are not able to perceive reality in a lossless way. Thus, the inputs to our compression function are themselves already an abridged version of the underlying facts. From this limited and fallible pool of sensory information, we extract key features and construct language objects, which we transmit to other people. These language objects are not equivalent to our sensory input, in the same way that the reproduced image was not equivalent to the original. They are merely approximations based on interpretation and the encoding/decoding of key features.
Since these language objects are reproductions, they are not referencing anything in reality. We communicate by passing around these objects, but as they do not reference anything in reality, it is difficult to say what their truth conditions would be. What fact-of-the matter would make a proposition true or false that contains abstract, reconstructed language objects? Let us call this the language-world gap.
Even if true, the question is whether this argument holds for all language-based communication. We may grant that certain types of expressions are more prone to compression than others. Indexicals like “this” or “that” could be seen as exempt from any compression, because they directly point to some real world object.
Proper names could be seen as equally exempt, since “Henry” is not a description of reality, it is a pointer, much like a memory address in computers. A name points to something in the world without describing it, and thus there is no loss of information, nor the need to somehow abstract key features of the intended reference target.
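The pointer analogy can be made literal in code. In a sketch like the following, a name is nothing but a binding in a shared namespace; it describes nothing about its referent:

```python
# A proper name works like a dictionary key: it binds a label to a referent
# without describing it in any way.
world = {"Henry": object()}       # the referent itself has no descriptive content

henry_1 = world["Henry"]
henry_2 = world["Henry"]
assert henry_1 is henry_2         # the name reliably picks out one particular thing

# But the bare label, outside a shared namespace, resolves to nothing:
assert "Henry" not in {}          # a listener without the binding cannot dereference it
```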
Argument from Intent
Intention is another possible counter to the above line of reasoning. Often it is said that it is the speaker’s intention that creates a proper reference to the real world. Without intention, there can be no reference, or aboutness, in the expressions or objects discussed. An example of the importance of this type of intention can be found in everyday life: a child’s crude drawing of a stick figure is in fact a picture of the child’s father, simply by virtue of the intention of the young artist. Obviously the stick figure bears almost no resemblance to the actual person it is meant to represent, but because of the child’s intent, it is now a fact-of-the-matter.
Thus, in the above argument, premise 6) and 7) are rendered inconsequential, because the reference is made not by virtue of some information within the encoding or decoding process, but simply by an act of will on behalf of the speaker. The speaker intends to refer to a real world object, and thus his proposition is in fact about that object, and not an abstracted language object.
Another way to attack the language-world gap is by considering Bertrand Russell’s arguments about the underlying structure of propositions. There is an analogy here between the language-world gap, and the puzzle of referring to non-existent people discussed by Russell.
If we say of the present king of France that he is bald, we are making a claim about a non-existent person, which means that the expression lacks a proper reference. The language-world gap argument essentially claims that all expressions lack proper reference, in a similar (but not perfectly identical) way.
Russell proposed a model of language by which the underlying structure of propositions eliminates all direct references to entities, replacing them with quantified claims. The truth or falsehood of a proposition can then be evaluated regardless of whether the apparent referent of the expression exists.
“The present king of France is bald” actually expresses the underlying proposition: “there is exactly one thing that is presently king of France, and that thing is bald.” If it happens that no such entity exists, then this existential claim is simply false. Problem solved.
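Russell's analysis can be written out in standard first-order notation (the symbols are mine, not the essay's):

```latex
% "The present king of France is bald", on Russell's theory of descriptions:
\exists x \,\bigl( K(x) \wedge \forall y\,(K(y) \rightarrow y = x) \wedge B(x) \bigr)
% K(x): x is presently king of France;   B(x): x is bald.
% If nothing satisfies K, the existential claim is false, not truth-valueless.
```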
The Problem with Intention
The argument from intention is the most serious counter to the language-world gap, and the other points mentioned are part of the wider intention family of arguments.
In response to the argument from intentionality, it may be useful to rephrase the entire discussion as follows: The language-world gap argument is very much an empirical argument, as it builds on the methods of encoding and decoding communication. The argument from intention is not empirical. It makes a move that metaphysicians have been making for a long time: forget about how we can know something, just assume that we do. Metaphysics is not the study of knowledge or transfer of information and its limitations. It is the study of how things actually are. The problem is that we have no idea how things actually are. Metaphysics makes an illegal move, when it attempts to solve the problem by ignoring it.
I would argue that there is no such thing as “God’s perspective,” or at least, there is no way we can have access to it, even if it exists. Any argument relying on “God’s perspective” is invalid from an empirical point of view, because it contains the hidden premise that we can claim anything whatsoever in the name of this perspective without proper recourse.
The argument from intentionality is of this type, because it proclaims that the intention of the speaker somehow makes it a fact-of-the-matter what propositions are ultimately about. However, this strikes me as a magical move. By what mechanism does the world now contain a fact, independent of the individual, that links a perceived object with an utterance of natural language?
The kid drawing a stick figure with the intention that this masterpiece represent his father is a fun example. It illustrates nicely how a theory of intention seemingly solves a lot of problems. It is elegant, simple, easy to understand. What is this drawing about? The child’s father, obviously. Why? Because the child intended it to be so. At the very least, proponents of the intention argument will claim that intention is a necessary, if not sufficient, condition of proper reference.
If we imagine an ancient civilization creating stone carvings that look like aliens, yet none of their language survives, and we discover these works of art, how can we make any judgement on what these carvings are about? Similarly, if by some chance a natural phenomenon creates stone carvings that look like aliens and we discover these, how can we determine what they are about? Let us assume that both types of carvings, by the most amazing coincidence, look exactly the same. Where is the fact-of-the-matter that the former is about aliens, and the latter is about nothing whatsoever (since there is no intent behind natural phenomena, presumably)?
(In fairness, it may be possible that I am conflating the mechanisms of reference within objects like paintings or carvings, and the mechanisms of reference regarding language expressions. It may well be that my analysis here is too shallow, and that these two types of reference actually represent two distinct phenomena.)
The problem with reference seems to apply to language expressions in a similar way. Within the encoding and decoding of information through language, there is simply no fact-of-the-matter contained within the exchange that transmits proper reference. It is merely us and our brains that do a lot of interpretation for us, and thus there appears to be no problem. We assume that proper reference is made, even though as I’ve argued above, this is not the case.
Names and indexicals illustrate this problem, because they contain almost no information in themselves, and are thus extraordinarily impoverished language objects. It is no wonder, then, that names and indexicals are often precisely the types of expressions that lead to the greatest confusion and miscommunication (“Who’s on first?”).
If Sarah tells me that “Joe went to school”, then our common understanding tells us that this can be properly evaluated as a true or false proposition. Unfortunately, no.
First, whether Joe exists, and whether Sarah is making a proper reference to him, is epistemically inaccessible from my perspective. That is, even if Sarah can make a proper reference to Joe, the transmission of this information to me is an abstraction in the same way as in all the other cases above, since Joe’s existence and proper reference are unavailable to me. If I am in the exact same epistemic position regardless of whether the utterance expresses a true or false proposition, then it is clear that the expression, and the underlying proposition, lacks the necessary information to make any kind of reference to the real world.
What I’m trying to say is that “Joe” in “Joe went to school” is not actually connected to Joe in the world, because this connection cannot be transmitted via language, and is merely inferred. The inference is made by me, as part of my interpretation, but it does not exist within the information transmitted. And that leads us back to the language-world gap: translating sensory input to language expressions is a one-way function. Joe, the person, cannot be reconstructed from “Joe” the expression, regardless of the intent of the speaker.
Indexicals are worse, since the amount of information transmitted is even less. What exactly do we mean when we point at something and say “that”? It may be clear to us from context, but that is because our brains are already doing a lot of interpretive work.
When pointing at a chair, how come I’m pointing at a chair and not some feature of it, or something larger which in turn happens to contain the chair as a feature? I could be pointing to the entire universe in that particular direction, and the chair is merely a part of the universe in that direction.
Clearly, the intention of the speaker gives the indexical its proper assignment in the traditional sense. But how can this intention be transmitted? We can’t read each other’s minds. We can merely recognize intent in the words of the speaker, by its resemblance to key features of intent that we recognize from past experience and our own usage of similar intent. But then we are back at the beginning: The fact that the speaker meant the chair and not some feature of the chair when he said “that”, is not transmitted by the speaker at all. It is itself an abstract object created by the action of the speaker, and the interpretation of the listener, or reader.
Second, the longer a description of a thing is, the more chances it has to transmit information with minimal loss. “A chair” is pretty brief as descriptions go, though implicitly there is a lot of description behind the word that we are all more or less familiar with. Under this interpretation, “a chair” is merely an abbreviated description of a much larger concept, or a name pointing to a larger concept (in which case the language-world gap says hello). “That”, on the other hand, is not a description at all. It transmits no information except through context. The actual meaning of “this” and “that” cannot be regained once transmitted, and thus these indexicals merely create dummy objects, which are abstractions and do not reference reality in a proper way.
The Problem with Russell
Although Russell’s theory, described in his paper “On Denoting”, could offer an interesting solution to the language-world gap, it does seem to contain some problems. The given example about the present king of France is relatively straightforward, but there is an issue with the transposition of expressions into predicate logic. Russell’s analysis attempts to dissolve away any entity which could be the referent of an expression, but this process seems to be merely begging the question.
The expression “my dog is a happy dog”, according to Russell, contains the proposition: “there is exactly one thing that is a dog and mine, and that thing is happy”. However, this is itself an expression containing several entities. What underlying proposition does it in turn contain? Things, dogs, ownership, and happiness are all abstract language objects, which themselves need proper referents. It seems strange to suggest that a simple rephrasing of an expression into quantifiers and predicates can save us.
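Writing the analysis out makes the regress easier to see; the quantified paraphrase still contains unanalyzed predicates (notation mine):

```latex
% "My dog is a happy dog", analysed in Russell's style:
\exists x \,\bigl( D(x) \wedge M(x) \wedge \forall y\,((D(y) \wedge M(y)) \rightarrow y = x) \wedge H(x) \bigr)
% D: is a dog;   M: is mine;   H: is happy.
% The predicates D, M and H are themselves abstract language objects in need of
% referents: the rephrasing relocates the problem rather than dissolving it.
```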
We could continue the atomization proposed by Russell and repeat the process again and again, but then we seem to end up at the question posed above: What is the real world equivalent of a pixel? What is the smallest possible piece of reality that we can access?
To summarize the entire idea very simply: we are not in fact talking about the things we think we are talking about. Instead, we are passing around abstract language objects that are not properly connected to reality: because of the particular form of compression we employ (the extraction and expansion of key features), things lack proper references. Because these objects are not properly connected to reality, it is not possible to evaluate the truth or falsehood of propositions containing them in a traditional way.
The empirical point of view here is somewhat analogous to discrete mathematics, while the entire intentionality argument strikes me as a continuous-mathematics type of move. What I mean by this is that the latter may be more elegant and satisfying, while the former is messy, imperfect, and usually incomplete in its description. Instead of trying to save some previous model of reference, language, and truth conditions, it may be easier to simply dump the whole idea of proper reference to begin with.
I suspect there is no way in which language expressions can be true or false, by virtue of the mismatch between coarse-grained language and fine-grained reality; that we can at best hope to be more or less accurate, because of the abstraction inherent in our senses, our cognition, and, ultimately, the encoding and decoding mechanisms of language. I will leave this discussion for later.
I am not a professional philosopher, just an interested bystander asking some questions. This is not an academic article, as should be clear from the lack of references. All non-original ideas here should obviously be credited to their proper sources.