In the previous post, I provided a few examples of how our cognitive processes are tied to what IT specialists call ‘information integration,’ or what neurologists and cognitive scientists refer to as ‘feature binding.’ This highlighted the intimate connection between cognition and meaning-making, and argued that cognition cannot be reduced to mere information processing or Darwinian adaptive processes.
The question is: how can modern AI systems be relatively efficient in at least mimicking the ability to extract semantic content from physical information, and how does this differ from human cognition? A few selected examples may clarify the issue.
For example, a neural network for pattern recognition works by mimicking the way the human brain processes data. It consists of layers of interconnected nodes, or "neurons," organized into an input layer, one or more hidden layers, and an output layer. Each neuron receives input data, processes it using a weighted sum, applies an activation function and passes the result to the next layer. Over time, the network learns to extract and recognize complex patterns from raw data — whether it's recognizing faces in images, detecting speech, or identifying anomalies in data. The figure below illustrates how a multilayered neural network processes the pixels of an image of a handwritten “2” and integrates them to activate a cell labeled as “2.” In a sense, this is information integration, a binding that collapses all data into a symbol.1
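To make the forward pass just described more concrete, here is a minimal sketch in Python/NumPy. The layer sizes (784 input pixels for a 28×28 image, one hidden layer of 16 neurons, ten output cells for the digits 0–9), the sigmoid activation, and the random weights are illustrative assumptions of mine, not the network shown in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    """Squash each weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative layer sizes: 784 input pixels, 16 hidden neurons,
# 10 output cells (one per digit 0-9).
W1 = rng.normal(scale=0.1, size=(16, 784))   # weights: input -> hidden
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(10, 16))    # weights: hidden -> output
b2 = np.zeros(10)

def forward(pixels):
    """Propagate an image through the network: weighted sums plus activations, layer by layer."""
    hidden = sigmoid(W1 @ pixels + b1)   # each hidden neuron integrates all 784 pixels
    output = sigmoid(W2 @ hidden + b2)   # each output cell integrates the whole hidden layer
    return output

# A stand-in for a handwritten "2": in practice the pixels would come from a real
# image and the weights from training; here both are random, so the prediction
# is meaningless and serves only to show the flow of integration.
image = rng.random(784)
scores = forward(image)
print("most activated output cell:", int(np.argmax(scores)))
```

With trained weights, the cell labeled “2” would be the one that lights up for the handwritten “2” in the figure; the point of the sketch is only to show how each layer integrates, or binds, the activity of the layer below it into a single output symbol.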
Can this be called ‘pattern recognition’? In a sense, yes. ‘Cognizing’? Well, only a machine-like cognition. Is this real semantic ‘understanding’? I doubt it.
This information processing can be expanded to a multidimensional integration. For a further illustration of 'binding,' consider how a neural network categorizes both an object and its features within a multidimensional vector space.
For example, these are the elements of one vector distinguishing between the entries ‘apple’ and ‘pear’:
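For instance, assuming a simple one-hot encoding (my reconstruction, not the figure from the original post), the two entries could be written as:

$$\text{apple} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \text{pear} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$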
The elements of a second vector designate the fruit’s color as ‘red,’ ‘green,’ or ‘yellow’:
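Again with a hypothetical one-hot encoding, one per color:

$$\text{red} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad \text{green} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad \text{yellow} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$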
How can we integrate the information that an apple is green, or that a pear is yellow? We can represent these with 3×2 matrices, where the columns distinguish between apple and pear, while the rows stand for the colors, as follows:
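Under the encoding assumed above (rows ordered red, green, yellow; columns ordered apple, pear), a green apple and a yellow pear would look like this; presumably these correspond to cases (a) and (b) in the original figure:

$$\text{green apple} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \text{yellow pear} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}$$

Each matrix is simply the outer product of a color vector with a fruit vector, which is one elementary way a network can ‘bind’ a feature to an object.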
The statement “a red apple and a yellow pear” (case c) could then be synthesized as:
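Keeping the same assumed encoding, a plausible reconstruction is simply the sum of the two bindings, a red-apple matrix plus a yellow-pear matrix:

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}$$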
Obviously, this is an extremely simplified account of what occurs in a computer. An LLM can contain millions of neurons connected by synaptic weights totaling hundreds of billions of parameters, whose outputs are represented by multidimensional vectors, matrices, tensors, and so on. However, this overview provides a foundational understanding of what it means to ‘bind’ or ‘integrate information’ in a machine.
Does the machine ‘cognize’?
It certainly doesn't align with the human understanding of 'cognition.' When we look at a green apple or a yellow pear, we don't perceive vectors or matrices.
However, one might argue that the machine does possess a form of cognition. We shouldn't assume that human-like cognition is the only possible one. In fact, representing reality as a multidimensional vector space enables AI to engage in meaningful conversations with us.
Fair enough. Yet, without succumbing to anthropocentrism and placing our cognition at the center of the universe, we need to explore further the differences between human and machine cognition.
The challenge lies in understanding how the multiplication of vectors and matrices, represented as digits in a memory chip, translates into the rich experience of perceiving the 'greenness' of an apple, feeling its shape and size, and even tasting it. A memory cell labeled "apple" or "pear" is merely a symbol encoded in the physical state of a circuit. There exists a subjective and experiential dimension that cannot be encapsulated by a symbol alone, regardless of how complex the computations leading to its identification may be.
This issue is not exclusive to machines; it also applies to humans. If you have never tasted an apple, no amount of description—be it through language, science, mathematics, computation, or neurological explanation—can truly convey what that experience is like.
I asked ChatGPT to describe the taste of an apple. Its answer:
An apple's taste depends on its variety, but generally, it offers a balance of sweetness and tartness. A crisp, fresh apple bursts with juicy flavor—some are more honeyed and mellow, like a Fuji or Gala, while others, like a Granny Smith, have a sharp, tangy bite. The texture adds to the experience: firm and crunchy at first, then turning into a smooth, refreshing juiciness. There's also a subtle floral or citrus-like undertone in some apples, making each bite both familiar and exciting.
That’s an amazingly vivid description of the subjective experience of what it feels like to taste apples. But does ChatGPT truly ‘know’ what it is like to eat an apple? Obviously, this can’t be the case. It could come up with this description only because it has crawled the entire internet, indexing countless websites that describe the flavor of apples.2 It can mimic the human experience of tasting an apple only because it relies on texts written by people sharing their own experiences. Without this information, neither ChatGPT nor humans who lack firsthand experience can truly ‘understand’ what it is like to taste an apple, let alone ‘cognize’ that an apple “offers a balance of sweetness and tartness.”
This highlights what philosophers of mind refer to as the ‘symbol grounding problem’: the challenge of understanding how symbols in computational systems (whether words, numbers, vectors, matrices, streams of bits, signals, or more complex representations) can convey meaning without being grounded in sensory experiences or real-world interactions. There exists a fundamental gap between a simple textual description of the world and the meaningful mental states that arise from it. Words and sentences do not truly represent ideas; instead, they evoke them. In fact, modern neuroscience and linguistics are finally realizing that language is a low-dimensional reduction of our rich cognitive and phenomenal experience into labels and, while it has provided humans with significant advantages over animal intelligence, it has also diminished certain cognitive abilities. You can’t capture the essence of a beautiful sunset in a single word. While poetry can convey the deeper feelings it evokes, only those who have directly experienced it can fully understand its meaning across all psychological dimensions.
This also tells us something about language itself. The nature and origin of language are to be found in what goes beyond symbols, tokens, or bits. But you don’t need to be a mystic; simply practicing meditation shows that, without language, there is greater mental clarity, not less. Language is the final rung on the ladder of conscious expression, not the starting point.3
“Ultimately, language is both an enhancement and a constraint, a double-edged sword that structures thought while simultaneously obscuring other modes of being.” - Jeremy I. Skipper
If meaning is not found in the arrangement of symbols, then who or what generates that meaning?
The same can be said for how the brain represents information. Neuroscience suggests that everything we perceive through our senses is mapped and coded as neuronal representations. All the stimuli we encounter and the thoughts we generate are stored in a configuration space as synaptic weights within a neural network.
This is certainly true, but it can’t be the whole story.
For example, the figure below on the left illustrates another ‘Gestalt switch,’ depicting the well-known ambiguous image of an old woman and a young woman. A neural network may recognize one figure or the other, with the choice depending on its synaptic weights; the choice, however, cannot be determined solely by ‘the world out there,’ because the physical information is exactly the same in both cases. The decision may be influenced by the context, our current mental or emotional states, or other factors. However, we also have the ability to make a conscious choice to switch our perspective. There is a volitional ‘will-factor’ that enables us to see things in one way or another.
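As a toy illustration of the first half of this point (my own sketch, not drawn from the post or from any real vision model), the snippet below feeds the identical input vector to two tiny classifiers that differ only in their hand-picked weights; one activates the cell labeled ‘old woman,’ the other the cell labeled ‘young woman,’ even though the physical information they receive is exactly the same.

```python
import numpy as np

# The same 'ambiguous image,' reduced to a toy feature vector.
stimulus = np.array([0.5, 0.5, 0.5, 0.5])

labels = ["old woman", "young woman"]

# Two networks with identical architecture but different synaptic weights
# (hypothetical numbers chosen by hand for the illustration).
weights_a = np.array([[1.0, 0.8, -0.2, 0.1],
                      [0.2, 0.1,  0.3, 0.2]])
weights_b = np.array([[0.2, 0.1,  0.3, 0.2],
                      [1.0, 0.8, -0.2, 0.1]])

for name, W in [("network A", weights_a), ("network B", weights_b)]:
    activation = W @ stimulus            # same physical input, different integration
    print(name, "sees:", labels[int(np.argmax(activation))])
```

Which figure is ‘seen’ is fixed entirely by the weights, not by the stimulus; what the weights do not account for is the felt, voluntary switch of perspective described above.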
These Gestalt figures highlight a broader truth: our perception of the world is not a direct reflection of reality, nor is it merely an interpretation. Instead, it serves as a symbolic transcript—essentially a sign or token that represents something else but holds no intrinsic meaning until the mind ascribes it one. We don’t 'cognize' reality as it is; we 're-cognize' it by re-constructing arbitrary (and often false) appearances to meet our needs and evolutionary demands. Ultimately, nothing is recognized beyond a narrative of illusory appearances, even though they are quite useful.
And, again, there is something inherently different between machine and human cognition. What the neural network ‘sees’ is neither a young nor an old woman. After all, its ‘cognition’ reduces to the on-off activation state of two cells.
In other words, while the physical state of a neural network is a much more complex configuration than the symbols in a typical text or the patches of a figure, it does not contain meaning on its own any more than the text or the figure do.
My objection to Mitchell’s argument is that he asserts that “patterns of neural activity have causal power solely by virtue of their meaning” and that “the meaning is inherent in the configuration of the circuitry.” The claim, in other words, is that neural patterns possess causal power only because they represent or reflect something about the world or our internal states.
But, as I have shown, a neural configuration inherently contains no meaning. The notion of a neural representation is, in and of itself, insufficient to explain how meaning is instantiated. It tells us no more and no less than the vectors and matrices do about how our mind instantiates the semantic awareness of a subject experiencing the sight of a green apple and a yellow pear, or about how an association of symbols, such as written words on a piece of paper, elicits concepts with meaning.
Naturalists like Mitchell are motivated to provide these explanations as a response to reductive views, which argue that mental content and desires lack causal efficacy and do not play a significant role in behavior.
The premise is correct, but the conclusion is flawed. Since naturalists cannot resort to panpsychism, dualism, idealism, or any other spiritualistic metaphysical framework to defend free will, they must construct complex arguments that may initially seem plausible but fall apart under scrutiny. Naturalism has trapped itself in a corner: it refuses to abandon its mechanistic worldview while also resisting the logical conclusion that we are, at best, mere ‘meat robots,’ and it tries hard to resurrect some form of free will by magic. It wants to have its cake and eat it too.
Therefore, something else must intervene to collapse a neural pattern into one semantic whole attended by a conscious experience of a ‘perception of meaning.’ There is something special in that instant of binding and integration of information that suddenly materializes the meaning of things, lived as a first-person subjective experience. There is that fleeting moment when an initially meaningless but very concrete sensory experience materializes all at once into a compound conceptual object of cognition.
If the meaning of things is not found in the external world and, thereby, also not in the arrangement of neurons in our brains, then who or what generates meaning? There exists a gap between the physical information that computers process and the semantic information that we, as sentient beings, experience.4
Readers familiar with the philosophical issues related to the mind-body problem will recognize that this ultimately boils down to the so-called ‘hard problem of consciousness.’ We still do not understand how the unconscious, insentient, experience-less, and subject-less functions of neurons in the brain give rise to a conscious, sentient being that has experiences. There is an explanatory gap between information-processing mechanisms and any account of conscious experience.
Here, we encounter a similar ‘hard problem of semantics.’ We do not understand how the unconscious, insentient symbols or neuronal patterns in the brain give rise to a conscious, sentient subject that has a ‘semantic experience.’ This is because meaning is inherently connected to consciousness. Integrating symbols or any complex mechanistic processes into a coherent semantic whole inevitably requires conscious experience.
Cognition, particularly the kind exhibited by living sentient beings, cannot exist without conscious experience. To cognize is to grasp and integrate sensory data into a coherent mental object, allowing us to perceive and comprehend meaning. Understanding is rooted in consciousness; without consciousness, true understanding is impossible. Just as there can be no seeing without the experience of light, and no hearing without the experience of sound, meaning is derived from the experience of a mental phenomenon. Meaning-making is an expression of consciousness, rather than a mechanical process that generates conscious experience as a fortuitous byproduct. This explains why we often find ourselves in a cognitive state where we understand something deeply yet struggle to articulate it. Words, tokens, bits, and bytes fail to capture the perceptual experience because they are merely symbols that refer to an experience, but do not embody and ground it.
In part III, we will further explore this line of reasoning to uncover its implications.
Bibliography
Greff, K., van Steenkiste, S., and Schmidhuber, J. (2020). “On the Binding Problem in Artificial Neural Networks.” arXiv:2012.05208v1
1. Notice how this is a powerful method to compress information. Presumably a similar mechanism works with the genome in the DNA molecule.
2. Google ‘sweetness and tartness,’ or ‘apples subtle floral or citrus-like undertone,’ or ‘honeyed and mellow, like a Fuji or Gala,’ and you will find tons of websites describing the taste of apples.
3. It is beyond my understanding that even great minds like Jean-Paul Sartre believed that language creates consciousness, rather than the other way around.
4. Physical information is a concept that can be precisely defined and quantified in physics. In contrast, semantic information remains elusive; no one has yet been able to define it clearly, let alone quantify it using a formula.