I was inspired to write this post after reading geneticist and neuroscientist Kevin Mitchell’s blog post, which offers some interesting insights.
The title is straightforward, but why is it important?
Let me explain…
Cognition is a fundamental concept in the cognitive sciences, referring to the processes by which humans and other living organisms acquire, process, store, and utilize information.
The point is that a question that once occupied only a handful of cognitive scientists and philosophers has gained renewed attention in contemporary science, thanks to recent developments in AI and new findings in biology, with potential practical implications for all three fields.
The advanced intellectual capabilities of modern AI systems, such as large language models (LLMs like ChatGPT), raise the question of whether these systems truly 'cognize' as humans do or whether they simply mimic human behavior through statistical prediction. Researchers studying cognition seek to understand how the human mind operates, how intelligence develops, and how mental functions can be replicated in artificial systems. Their work offers valuable insights into both natural and artificial intelligence.
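To make the 'statistical prediction' idea concrete, here is a minimal, purely illustrative sketch (my own toy example; real LLMs use transformer networks with billions of parameters, not word counts): a bigram model that continues a prompt by always choosing the word that most often followed the current word in its 'training' text.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for training data (real LLMs see trillions of tokens).
corpus = ("the cat sat on the mat . the cat ate the fish . "
          "the dog sat on the rug .").split()

# For each word, count which words follow it and how often.
next_counts = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    next_counts[word][nxt] += 1

def continue_text(prompt, n_words=5):
    """Extend the prompt by repeatedly picking the most frequent next word."""
    words = prompt.split()
    for _ in range(n_words):
        followers = next_counts.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

print(continue_text("the cat"))  # -> "the cat sat on the cat sat"
```

The output looks grammatical, yet the model drifts into a loop, because nothing in it knows what a cat or a mat is; it only reproduces frequencies. Scaled up enormously, this kind of prediction becomes strikingly fluent, which is precisely what makes the question of whether fluency amounts to cognition so pressing.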
Meanwhile, in recent decades, biology has found that even plants and single-celled organisms exhibit forms of 'proto-cognitive' abilities. Cognition, or at least what is often termed 'basal cognition,' does not require a brain. Remarkably, tiny unicellular creatures can sense their environment, navigate it, target food, avoid toxic substances, change their swimming direction, and even solve problems collaboratively to achieve a common goal.
Defining cognition is challenging, much like defining concepts such as ‘life,’ ‘consciousness,’ ‘mind,’ ‘thoughts,’ or ‘agency.’ The underlying question is: why are these concepts, which seem self-evident and integral to our essence and daily experience, so difficult to define?
Moreover, I would like to emphasize that contemplating questions where the boundaries between biology and psychology, artificial and natural intelligence, or even physics and metaphysics become blurred can be an interesting exercise, because this reflection reveals our often unacknowledged premises and assumptions. Such discussions are essential for understanding how we think.
Before continuing with this reading, it may be helpful to attempt to define cognition on your own. You will soon discover that what appears to be a simple idea is, in fact, difficult to articulate in precise terms. Give it a try…
Experts offer a variety of definitions of cognition, encompassing concepts such as information processing and storage, problem-solving, goal-directedness, decision-making, planning, the ability to interact with and adapt to the environment, and so on. These definitions also include mental processes like perception, memory, attention, reasoning, and language comprehension. For a more in-depth exploration of how scientists define cognition, refer to Mitchell’s post and the references cited within. Mitchell’s own definition reads: “Cognition is using information to solve problems and guide adaptive behavior.”
The numerous definitions highlight the uncertainty and controversy surrounding the concept. My issue with these approaches is that they attempt to define cognition solely in Darwinian evolutionary terms, portraying it as an adaptive function that improves an organism's survival and reproductive success. However, these definitions, grounded in an exclusive third-person perspective, fail to capture the essence of cognition and conflict with our first-person understanding of it. I believe my cognition extends beyond mere adaptive and problem-solving functions. For example, when I watch a sunset, listen to music, or immerse myself in a poem, I do not perceive these experiences as solely purposeful for acquiring data or adapting to my environment. In cognition, there is a qualitative aspect that can't be reduced to mere computation.
Therefore, allow me to take a different approach.
Sometimes, starting with exploring the etymology of words can provide valuable insights. According to the Online Etymology Dictionary, the term 'cognition' originates from a combination of "com," meaning "together," and "gnoscere," which means "to know" (derived from the Proto-Indo-European root “gno,” also meaning "to know"). The root “-scere” is linked to the Latin word “scire,” meaning “to know,” which is the basis for the term “science” as a form of knowing. Interestingly, it is also associated with “sciss,” meaning “to cut off” or “to incise,” as seen in the word “scissor.” Thus, cognition can be understood as the process of "knowing together what is cut into pieces," reflecting the act of integrating and comprehending information into a semantic whole.
The notion that cognition involves the integration of information is a common theme in modern literature. Interestingly, this understanding appears to have been intuitively grasped by people thousands of years ago.
For example, when you look at an apple, you perceive its shape, size, position, orientation, various colors, background, and possibly its motion in space. These features are not recognized as separate elements; instead, they come together in a nearly instantaneous cognitive act, leading you to simply state, “There is an apple.” A vast amount of information is seamlessly integrated into a single, unified conscious experience.
Another well-known example of information integration is illustrated in the figure below. At first glance, the image may seem like a collection of random, meaningless patches. However, it won't take long before a coherent scene almost suddenly takes shape in your mind. This leads to a sudden shift in perception, where a meaningful whole emerges.
This process may appear to us as an obvious and entirely normal cognitive function that requires no explanation. However, the mechanism behind this ‘integration’ or ‘binding’ remains unclear.
Firstly, while we know that the brain processes various features in distinct areas, we don’t know how it manages to combine these features into a single, unified mental object of perception. No ‘center of integration’ has been found. Numerous theories attempt to explain how feature integration occurs at the neurological level, but so far there are no conclusive answers.
Secondly, implementing such an integration or 'binding' in a computational process proves to be an extremely challenging task. It has taken decades of research to develop neural networks that can reliably recognize patterns and categorize them meaningfully. Even today, modern neural networks still do not achieve human-level generalization. While this does not imply they will never reach that level, it highlights how a cognitive activity we often consider trivial is, in fact, quite complex.
This issue is commonly referred to as the ‘binding problem.’
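A toy sketch of why binding is hard (my own illustration, not a model from the literature): suppose color and shape are detected by separate, independent channels, much as they are processed in different cortical areas. The channels' outputs alone do not determine which color belongs with which shape.

```python
from itertools import permutations

# A scene with two objects; each feature channel reports independently.
scene = [("red", "square"), ("green", "circle")]

colors = [color for color, _ in scene]  # color channel reports: ['red', 'green']
shapes = [shape for _, shape in scene]  # shape channel reports: ['square', 'circle']

# Without a binding mechanism, every pairing of detected colors with detected
# shapes is equally consistent with what the two channels report.
for pairing in permutations(shapes):
    print(list(zip(colors, pairing)))
# [('red', 'square'), ('green', 'circle')]   <- the actual scene
# [('red', 'circle'), ('green', 'square')]   <- a misbinding
```

With n objects there are n! pairings consistent with the unbound feature lists, and under attentional load people do occasionally report such misbindings, the 'illusory conjunctions' discussed in Treisman's work cited in the bibliography.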
Furthermore, this binding is not confined to visual experiences; it is a pervasive process throughout all levels of cognition. Feature integration is a fundamental aspect of all our cognitive processes, functioning across a temporal dimension as well. For instance, consider how we combine a series of still frames to create a movie. Similar integration can be shown to take place at the level of auditory and tactile perception as well.
Another example is the meaning of a sentence extending beyond the individual words. The emphasis we place on certain words situates the sentence within a specific context that can change. Even with the same words and sentence structure, entirely different semantic meanings can emerge based on the emphasis. Consider these three sentences, where a different word is emphasized each time (shown in capitals).
1) Why did TOM betray Mary so suddenly?
2) Why did Tom BETRAY Mary so suddenly?
3) Why did Tom betray Mary SO SUDDENLY?
Therefore, the emergence of meaning in our mind is the integration of the apprehension of very diverse features, followed by an almost instantaneous cognitive act of comprehension. Once we bind, integrate, and coalesce all the elements, we ultimately perceive a single, coherent, meaningful whole, despite the presence of numerous words, subjects, objects, predicates, types of emphasis, and so on. This integration of information is at the very basis of our meaning-making.
In hindsight, it is not surprising that creating a theory of semantics that enables AI software to grasp meaning turned out to be an extraordinarily challenging task. Such a theory cannot simply combine individual words; it must also take into account the entire context, which encompasses the broader experiential world of the subject.
The meaning we derive from our sensory perceptions is not inherent in the external objects themselves. Physical information does not uniquely dictate semantic information. From the same set of physical information, one can create two distinct interpretations. Popular examples of this phenomenon include the Necker cube and the Rubin vase. In these cases, you can alternate between different interpretations: your mind can switch between two orientations of the cube or between viewing the vase and the two faces.
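A minimal numerical sketch of this underdetermination (again my own illustration): under an orthographic projection, two different 3D scenes, one with each square of the Necker drawing taken as the near face, collapse onto exactly the same 2D image.

```python
# The classic Necker drawing: two overlapping squares joined at the corners.
square_1 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_2 = [(0.3, 0.3), (1.3, 0.3), (1.3, 1.3), (0.3, 1.3)]

def project(points3d):
    """Orthographic projection onto the picture plane: drop the depth z."""
    return sorted((x, y) for x, y, z in points3d)

# Interpretation A: square_1 is the near face (depth 0), square_2 the far face.
interp_a = [(x, y, 0.0) for x, y in square_1] + [(x, y, 1.0) for x, y in square_2]
# Interpretation B: the depth assignment is reversed.
interp_b = [(x, y, 1.0) for x, y in square_1] + [(x, y, 0.0) for x, y in square_2]

# Two incompatible 3D scenes, one and the same 2D 'retinal' image.
print(project(interp_a) == project(interp_b))  # True
```

The depth that flips when the cube 'switches' is not in the image at all; it is supplied by the perceiver.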
This shows that while the concept of meaning arising from context and a system of relations holds some truth, it is only partially accurate. Meaning is indeed relational and context-dependent, but this does not tell the entire story. Completely different meanings can emerge from the same context and information content. One figure, representation, pattern, or configuration can lead to multiple and mutually exclusive semantic interpretations. There is no reason to assume that our everyday sensory signals and brain activation patterns operate differently.
This should clarify that our mind does not create a 'representation' or a 'map' of the external world based on sensory perceptions; instead, it constructs an interpretation — essentially, a fabricated story.
Some people suggest that our brain generates internal, more or less approximate representations of the environment. However, it is unclear what exactly is meant by 'representation.' Moreover, which of the two cubes more accurately reflects reality? Is the vase or the two faces the better interpretation?
What we cognize is not an "approximation to truth," let alone a "vision of truth." We do not understand the world as it truly is. Instead, we select one of many possible interpretations or models based on appearances, which nonetheless helps us navigate through life. This idea has been recognized by philosophical idealism since the time of Plato. Kant argued that we can only know the phenomenon, not the noumenon. Some modern philosophers and scientists have taken up this idea.1
Thus, cognition extends far beyond merely collecting data. Sensory information alone does not equate to cognition; for instance, a hard drive can store all the physical information about an environment without any actual understanding. Cognition occurs only when information is integrated, comprehended, and experienced as a meaningful whole.
However, modern AI, such as large language models (LLMs), can be quite effective in simulating ‘understanding.’ A conversation with ChatGPT demonstrates how these machines appear to possess a semantic understanding of the world. In what ways does the ‘cognition’ of an LLM differ from that of a human being?
That’s a complex story that we can't fully explore here, but in Part II, we will examine some examples and compare machine cognition with human cognition.
The subscription to Letters for a Post-Material Future is free. However, if you find value in my project and wish to support it, you can make a small financial contribution by buying me one or more coffees! You can also support my work by sharing this article with a friend or on your sites. Thank you in advance!
Bibliography
A. Treisman. “Solutions to the binding problem: Progress through controversy and convergence”. In: Neuron 24 (1999), pp. 105–110.
1. A popular example is cognitive psychologist Donald Hoffman, who proposes an ‘Interface Theory of Perception’ (ITP). Our perceptions do not reflect objective reality but instead function as an evolutionary interface — a simplified, user-friendly representation that helps us survive and reproduce. Much like a computer desktop hides complex code behind icons, our senses present an adaptive "virtual reality" that emphasizes fitness, not truth. According to Hoffman, organisms that perceive reality accurately would be outcompeted by those that see only what they need to thrive, suggesting that what we experience is more about utility than accuracy.
I don’t resonate much with this standpoint. Although it makes sense that natural selection would not favor a truthful perception of reality, this is not the fundamental reason we can’t perceive reality as it truly is. The inaccessibility of the noumenon is principled, no matter what evolution does.