🦸🏻#9: Does AI Remember? The Role of Memory in Agentic Workflows

Community Article Published February 2, 2025

We explore SOAR’s legacy, memory types, generative agent workflows, memory mode in LLMs, and AI’s influence on memory itself

“If we talk for too long, I'll forget how we started. Next time I see you, I'm not gonna remember this conversation. I don't even know if I've met you before.” – Leonard Shelby, “Memento”… or basically any LLM

Memory – or more precisely, memories – is a key building block of an agentic workflow, closely associated with knowledge and profiling. But it deserves its own spotlight because it operates at a different level of granularity and function than “knowledge” and “profile.” While profiling defines how an agent interprets who it is (its character, its “avatar”), what it does (its behavior models), and where it operates (its environment), and while knowledge provides the facts or learned representations that guide decisions, memory is the dynamic record of experience that threads these elements together and actively participates in decision-making.

Memory has been studied for decades, yet we still don’t fully understand how to make LLMs remember things consistently. Current AI systems can retrieve information, summarize past interactions, or even store selective details, but they lack a stable, structured memory that persists reliably over time. Today, we have a lot on our plate: we will explore a forgotten paper that may offer insights from the past, explain the different types of memory and their roles in agentic workflows, learn how the components come together in practice, clarify how models with memory mode “remember” things, and ask ourselves how generative AI is transforming the nature of memory itself. Let’s start.


🔳 Turing Post is on 🤗 Hugging Face as a resident -> click to follow!


What’s in today’s episode?
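  • SOAR’s Legacy in Agentic Memory Systems: A Bridge from Cognitive Models to AI Agents
  • Types of Memory that We Operate with Today
  • Memory and Generative Agents
  • How ChatGPT “Remembers” Things: Understanding Memory Mode
  • Concluding Thoughts: AI Influence on Human Memory
  • Resources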

We apologize for the anthropomorphizing terms scattered throughout this article – let’s agree they all come with implicit quotation marks.

SOAR’s Legacy in Agentic Memory Systems: A Bridge from Cognitive Models to AI Agents


In 1987, Allen Newell, Paul Rosenbloom and John Laird proposed SOAR – an architecture for general intelligence. Today we continue to lock horns over the definition of general intelligence, but the authors of SOAR had a clear view: they meant that general intelligence is the ability of a system to handle a full range of cognitive tasks, employ diverse problem-solving methods, and continuously learn from experience.

When the SOAR architecture was introduced, it was a bold attempt to create a unified theory of cognition, blending problem-solving, learning, and memory into a single framework. SOAR's solution was elegant and introduced a structured approach to memory that resonates with modern AI agent architectures. By distinguishing between working memory (for immediate cognitive tasks) and long-term procedural memory (for learned rules), SOAR anticipated the challenges of building systems that retain, recall, and refine knowledge over time. While modern agentic AI relies more on statistical learning and vector-based retrieval than explicit production rules, the fundamental question of how systems remember and improve remains central – making SOAR a relevant conceptual ancestor to today's AI frameworks.

Image Credit: The original paper

Declarative and Procedural Knowledge

One of SOAR’s key innovations was distinguishing between two types of knowledge. Declarative knowledge, consisting of facts and information, is held in working memory and represents the system’s current understanding of its environment. Procedural knowledge, in contrast, is embedded in long-term memory as production rules that dictate the system’s actions. This clear separation enabled SOAR to manage immediate problem-solving tasks while building a lasting repository of strategies for future use.
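To make the distinction concrete, here is a minimal Python sketch – not SOAR itself, just an illustration of the idea. Declarative facts live in a transient working memory, while procedural knowledge is encoded as condition–action production rules that fire against it:

```python
# Illustrative toy example only – not the actual SOAR implementation.

# Declarative knowledge: facts currently held in working memory.
working_memory = {("door", "status"): "locked", ("agent", "has"): "key"}

# Procedural knowledge: long-term production rules (condition -> action).
production_rules = [
    {
        "name": "unlock-door",
        "condition": lambda wm: wm.get(("door", "status")) == "locked"
        and wm.get(("agent", "has")) == "key",
        "action": lambda wm: wm.update({("door", "status"): "unlocked"}),
    }
]

# A single decision cycle: fire every rule whose condition matches working memory.
for rule in production_rules:
    if rule["condition"](working_memory):
        rule["action"](working_memory)

print(working_memory[("door", "status")])  # -> "unlocked"
```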

Another hugely important feature was →

Chunking

When the system successfully solves a problem, it consolidates that experience into a new production rule. This “chunking” process effectively compresses a complex problem-solving episode into a reusable piece of knowledge, thereby reducing future computational load and enhancing efficiency. By internalizing successful strategies, SOAR continuously refines its problem-solving capabilities, much like how humans learn from repeated experiences.
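Continuing the toy example above, a rough sketch of the idea (a caricature of SOAR’s chunking, not the real mechanism): once an episode has been solved through search, the solution is compressed into a new production rule so the same situation can be handled in a single step next time.

```python
def chunk(problem_state, solution_steps):
    """Compress a solved episode into a reusable production rule (toy version)."""
    snapshot = dict(problem_state)  # the conditions under which the solution worked

    def condition(wm, snapshot=snapshot):
        return all(wm.get(k) == v for k, v in snapshot.items())

    def action(wm, steps=tuple(solution_steps)):
        for step in steps:          # replay the whole solution in one rule firing
            step(wm)

    return {"name": "learned-chunk", "condition": condition, "action": action}

# After solving a problem once, the chunk joins long-term procedural memory:
# production_rules.append(chunk(working_memory, solution_steps))
```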

Subgoaling and Hierarchical Problem-Solving

Another profound aspect of SOAR is its use of automatic subgoaling. When SOAR encounters an impasse – a situation where its current knowledge proves insufficient – it generates a new subgoal to overcome the obstacle. This mechanism of breaking down complex problems into simpler, more manageable parts resembles the hierarchical problem-solving methods seen in human cognition. The concept of subgoaling in SOAR has influenced later developments, particularly in fields like hierarchical reinforcement learning and multi-agent coordination frameworks.
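Again only as a rough sketch (and assuming a hypothetical goal.decompose() helper that splits a goal into simpler subgoals): when no rule applies – an impasse – the system recursively spawns subgoals and attacks them with the same machinery.

```python
def solve(goal, rules, working_memory, depth=0):
    """Toy recursive subgoaling: on an impasse, decompose the goal and recurse."""
    applicable = [r for r in rules if r["condition"](working_memory)]
    if applicable:
        applicable[0]["action"](working_memory)
        return True
    if depth > 5:                       # give up instead of recursing forever
        return False
    # Impasse: no rule fires, so create simpler subgoals and solve them first.
    for subgoal in goal.decompose():    # hypothetical helper, for illustration only
        solve(subgoal, rules, working_memory, depth + 1)
    return solve(goal, rules, working_memory, depth + 1)
```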

SOAR’s Legacy and Its Resonance with Modern AI

The structured approach of SOAR marked a departure from earlier, fragmented models of cognition. By integrating working memory, long-term memory, and learning, it became a cornerstone of cognitive architectures and influenced approaches to general intelligence. Today, AI systems – driven by deep learning, LLMs, and reinforcement learning – face challenges that echo SOAR’s original questions about memory, learning, and problem decomposition.

Certain modern AI techniques resemble aspects of SOAR’s subgoaling mechanism, particularly in hierarchical planning and task decomposition. Similarly, methods like fine-tuning, continual learning, and retrieval-augmented generation share SOAR’s goal of leveraging past experiences to improve performance, though their mechanisms differ.

SOAR’s structured approach to declarative and procedural knowledge foreshadowed modern neuro-symbolic AI, which seeks to combine symbolic reasoning with neural adaptability. This synthesis underscores the enduring relevance of structured memory and dynamic learning in the pursuit of general intelligence.

While deep learning once overshadowed SOAR’s structured approach, AI researchers are now revisiting many of its core ideas. As AI agents struggle with memory, retrieval, and adaptability, SOAR’s architecture appears less like a relic and more like a precursor to the next wave of autonomous AI.

Another important work influenced by Allen Newell was the ACT-R architecture and, building on it, “An Integrated Theory of the Mind” by John R. Anderson and Daniel Bothell. We will not explore it in this article, but you can find a link to that work in the Resources.

Types of Memory that We Operate with Today

The way AI agents handle memory today isn't a monolithic process – it’s a structured system composed of different layers, each serving a unique purpose. Some memories persist over time, shaping long-term behavior, while others are fleeting, used only for the immediate task at hand.

Long-Term Memory: The Foundation of Persistent Knowledge


At the core of long-term memory are two distinct types: explicit (declarative) memory, which involves structured, retrievable knowledge, and implicit (non-declarative) memory, which enables learning from past experiences.

Explicit memory is what allows AI to recall facts, rules, and structured knowledge. Within this category, semantic memory is responsible for storing general truths and common knowledge. This is why an AI system can confidently state, “The Eiffel Tower is in Paris,” or “Dogs are mammals.” This type of memory provides the foundation for knowledge-based AI applications, such as search engines and chatbots.

Then there’s episodic memory, which is more personal – it captures specific events and experiences, allowing an agent to remember context from past interactions. If a customer service AI recalls that a user previously requested a refund, it can tailor its responses accordingly, making interactions feel more intuitive and human-like.

In Memento, Leonard Shelby’s struggle is one of episodic memory loss. He remembers facts about his life before his injury (which relates to semantic memory), but he cannot store new episodic memories, meaning every new interaction or event fades within minutes. His reliance on notes, Polaroid pictures, and tattoos mirrors an externalized, makeshift memory system – attempting to compensate for his inability to encode new personal experiences. Even with memory features, LLMs don’t store true episodic memories – they retrieve patterns and summarize past interactions instead.

Implicit memory, on the other hand, is what allows AI to develop instincts. It’s driven by procedural memory, which helps an agent learn skills without requiring explicit recall. Think of a self-driving car that improves its lane-keeping ability after thousands of miles of training. The car doesn’t need to “remember” every scenario explicitly – it develops an intuitive understanding of how to navigate roads. When you actually experience it, it’s quite incredible.
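One way to picture this taxonomy in an agent framework is as separate stores with different lifetimes and access patterns. This is a minimal, hypothetical sketch, not any particular library’s API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Explicit (declarative) long-term memory
    semantic: dict = field(default_factory=dict)    # general facts about the world
    episodic: list = field(default_factory=list)    # timestamped events from past interactions
    # Implicit (non-declarative) long-term memory: learned skills or policies
    procedural: dict = field(default_factory=dict)  # e.g. {"lane_keeping": policy_weights}

memory = AgentMemory()
memory.semantic["eiffel_tower_location"] = "Paris"
memory.episodic.append({"t": "2025-01-30", "event": "user requested a refund"})
```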

Short-Term Memory: The Power of the Present Moment


While long-term memory enables growth and adaptation, short-term memory ensures agents stay responsive in real-time interactions.

The context window defines how much past input an AI model can retain within a single exchange. This limitation is crucial in LLMs – give an AI a tiny context window, and it will forget what you said just moments ago. Expand that window, and it can maintain continuity over longer conversations, making responses more coherent and natural. A lot of current research focuses on expanding and optimizing context windows. We’ll explore the most intriguing developments in future episodes.
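In practice, staying inside the context window usually means trimming or summarizing older turns before each model call. A rough sketch of the trimming half of that pattern (token counts are approximated by word counts here, purely for illustration):

```python
def fit_to_context(messages, max_tokens=4000):
    """Keep the most recent messages that fit a rough token budget (illustrative only)."""
    kept, used = [], 0
    for msg in reversed(messages):            # walk from newest to oldest
        cost = len(msg["content"].split())    # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break                             # older turns get dropped (or summarized)
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```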

Then there’s working memory, which plays a vital role in multi-step reasoning and decision-making. Just as humans use working memory to hold several ideas in mind at once – like when solving a math problem – AI agents rely on it to process multiple inputs simultaneously. This is especially important for complex tasks like planning, where an agent needs to balance different pieces of information before reaching a decision.
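In agent frameworks, working memory often shows up as a scratchpad that accumulates intermediate results within a single task and is discarded afterwards. A hedged sketch of that pattern, assuming llm is any text-in, text-out callable:

```python
def plan_trip(llm, user_request):
    """Multi-step reasoning with a scratchpad held only for the duration of the task."""
    scratchpad = []  # working memory: filled step by step, cleared after the task
    for step in ["list constraints", "propose options", "pick the best option"]:
        prompt = (
            f"{user_request}\n\nNotes so far:\n"
            + "\n".join(scratchpad)
            + f"\n\nNow: {step}"
        )
        result = llm(prompt)               # intermediate result held, not persisted
        scratchpad.append(f"{step}: {result}")
    return scratchpad[-1]
```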

Bringing It All Together

Image Credit: Richardson Gunde

The interplay between these different types of memory is what makes modern AI agents increasingly effective. Long-term memory allows them to learn from the past, short-term memory ensures they stay engaged in the present, and working memory enables them to process multiple inputs at once. Together, these components shape an AI’s ability to act autonomously, adapt intelligently, and provide more meaningful interactions over time.
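Reusing the toy structures sketched above (AgentMemory and fit_to_context), here is one hypothetical way the three layers might meet inside a single response loop – a sketch of the interplay, not a reference implementation:

```python
def answer(agent_memory, conversation, user_message, llm):
    """Illustrative loop combining long-term, short-term, and working memory."""
    # Long-term: pull relevant facts and recent episodes
    facts = [f"{k}: {v}" for k, v in agent_memory.semantic.items()]
    episodes = [e["event"] for e in agent_memory.episodic[-3:]]
    # Short-term: keep only what fits the context window
    recent = fit_to_context(conversation + [{"role": "user", "content": user_message}])
    # Working memory: assemble everything into the prompt for this single step
    prompt = "\n".join(
        ["Known facts:"] + facts + ["Recent events:"] + episodes
        + [m["content"] for m in recent]
    )
    reply = llm(prompt)
    agent_memory.episodic.append({"t": "now", "event": user_message})  # write back
    return reply
```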

As AI systems continue to evolve, refining how they manage memory will be key to unlocking more advanced agentic workflows – ones that feel more natural, capable, and, ultimately, more intelligent.


Now, let’s be more specific. Understanding how AI agents handle memory is crucial, but how do these components come together in practice?

Memory and Generative Agents

One of the most compelling examples of memory-driven agentic behavior is found in the paper Generative Agents: Interactive Simulacra of Human Behavior by researchers from Google Research and Stanford. In this work, memory plays a crucial role in enabling agents to simulate believable, human-like behavior. The proposed architecture incorporates memory as a dynamic component, allowing agents to observe, store, retrieve, and synthesize experiences over time to guide their interactions and decision-making.

Image Credit: The original paper

The memory system here is structured as a memory stream, where agents continuously log their experiences in natural language. These memories are not static but are periodically retrieved and synthesized into higher-level reflections, allowing agents to draw broader conclusions about themselves, others, and their environment.

Memory retrieval is governed by three key factors:

  • recency (recent memories are more accessible),
  • importance (highly significant events are prioritized),
  • relevance (only contextually relevant information is surfaced for decision-making).
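In the paper, these three signals are combined into a single retrieval score: each component is normalized and summed, and the top-scoring memories are placed into the agent’s context. A simplified sketch of that scoring, with illustrative weights and decay constant:

```python
import math
import time

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieval_score(memory, query_embedding, now=None, alpha=1.0, beta=1.0, gamma=1.0):
    """Score one memory by recency, importance, and relevance (simplified from the paper)."""
    now = now or time.time()
    hours_since_access = (now - memory["last_accessed"]) / 3600
    recency = 0.995 ** hours_since_access        # exponential decay (illustrative constant)
    importance = memory["importance"] / 10       # 1-10 score assigned when the memory is written
    relevance = cosine_similarity(memory["embedding"], query_embedding)
    return alpha * recency + beta * importance + gamma * relevance
```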

Reflection enables agents to generalize from their experiences, forming insights that influence future behavior. For example, an agent repeatedly working on a music composition may develop a self-perception as a passionate musician. This process enhances long-term coherence, helping agents behave in ways that align with their past interactions and evolving relationships.

Planning further integrates memory by allowing agents to anticipate future actions based on their prior experiences. Agents generate daily schedules, which they refine into detailed action sequences, recursively adjusting them based on new observations.

The study demonstrates that this memory-driven approach leads to emergent social behaviors, such as information diffusion, relationship formation, and coordinated activities. However, limitations remain, including potential memory retrieval failures, hallucinations, and biases inherited from the underlying language model.

Ultimately, memory serves as the foundation for agent believability, enabling nuanced, dynamic interactions that go beyond single-time-point language model outputs, making generative agents capable of simulating social behavior in interactive applications. Fascinating!


Speaking about memory in agentic workflows, we can’t miss our modern everyday experience – and more precisely, how in the world does ChatGPT remember that your child is six years old when you never explicitly asked it to remember?! And even if you did, how does it manage to remember, and where is this information stored?

How ChatGPT “Remembers” Things: Understanding Memory Mode

Most AI chat models have the memory of Leonard Shelby in “Memento” – once a conversation ends, everything resets. That’s fine for quick Q&A, but frustrating when you want continuity. Or when you want it to remember your writing style. Memory mode changes this by letting ChatGPT retain key details across sessions, making interactions feel more like talking to an assistant that actually knows you. It’s both creepy and makes you feel heard. It also makes your interactions with the AI assistant shorter.

How It Works

With memory enabled, ChatGPT doesn’t store full conversation logs but instead extracts key facts and patterns. Say you frequently mention that you are writing a book on citizen diplomacy – rather than remembering every instance, it might store “User is interested in citizen diplomacy and is writing a book about it.” That way, next time you bring it up, the model doesn’t start from zero.

It’s also selective – it won’t remember everything you say, just what’s repeated or explicitly confirmed (e.g., “Remember that I’m working on a news digest”). This keeps memory clean and relevant.
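We don’t know OpenAI’s internal implementation, but a purely hypothetical version of this extraction step might simply ask the model itself to distill stable facts from the conversation:

```python
def extract_memories(llm, conversation_turns):
    """Hypothetical sketch: distill durable user facts from a conversation."""
    prompt = (
        "From the conversation below, list only stable facts about the user that are "
        "worth remembering across sessions (preferences, ongoing projects, explicit "
        "'remember this' requests). Ignore one-off details.\n\n"
        + "\n".join(conversation_turns)
    )
    # e.g. ["User is writing a book on citizen diplomacy"]
    return llm(prompt).splitlines()
```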

Where Is Memory Stored?

Not inside the model itself. Instead, summarized data is stored securely on OpenAI’s servers, using vector embeddings – compact numerical representations that can be retrieved efficiently. When you start a new session, the system searches for relevant past data and integrates it into the conversation, giving the illusion of continuity.
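A hedged sketch of that storage-and-retrieval pattern (again, not OpenAI’s actual system), assuming a generic embed() function that maps text to a vector:

```python
import numpy as np

memory_store = []  # each entry: {"text": summarized fact, "embedding": vector}

def remember(embed, fact):
    memory_store.append({"text": fact, "embedding": embed(fact)})

def recall(embed, query, top_k=3):
    """Return the stored facts most relevant to the start of a new session."""
    q = embed(query)
    def cosine(v):
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    ranked = sorted(memory_store, key=lambda m: cosine(m["embedding"]), reverse=True)
    return [m["text"] for m in ranked[:top_k]]

# At the start of a new session, relevant memories are injected into the prompt:
# context = recall(embed, first_user_message)
```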

The developers promise that memory mode isn’t a black box. Sometimes you have doubts about it, and it can feel awkward how much your chat remembers about you. But you can always go and review, update, or delete stored information in your chat. In any case, data is stored in an abstracted form, meaning no full transcripts – just key insights.

Concluding Thoughts: AI Influence on Human Memory

This is not a conclusion, but rather some food for thought. While working on this article, I came across a paper that is not directly connected to memory in agentic systems but offers a fascinating and somewhat unsettling perspective on how generative AI is transforming the nature of memory itself. In AI and Memory, Andrew Hoskins argues that AI does not simply extend human memory or aid in recollection; rather, it untethers memory from its traditional constraints, creating what he calls a "third way of memory." In this paradigm, memory is no longer an act of retrieval but an ongoing process of reconstruction, where pasts that were never actually experienced are generated, modified, and made available as though they were real.

What struck me in his argument was how AI constructs what Hoskins terms a "conversational past" – an ever-evolving digital representation of memory that exists independently of human agency. Through LLMs and AI-driven services, past events are continuously reinterpreted and remixed in ways that blur the boundary between what was once lived and what is now artificially manufactured. This is particularly evident in the rise of AI-generated "deadbots," which allow for interactive experiences with digital versions of the deceased, raising ethical and philosophical questions about consent, authenticity, and the permanence of digital legacies.

Beyond individual memory, Hoskins explores the broader implications of this AI-driven transformation for collective historical narratives. With AI reshaping how societies record, remember, and even forget, traditional markers of memory – such as archival records, personal recollections, and oral histories – are increasingly at risk of being supplanted by AI-generated alternatives that may lack grounding in lived experience. He warns that as AI reconstructs and repurposes the past in ways that were never consented to, human agency over memory is gradually eroded.

While his article focuses on the sociocultural aspects of AI and memory, it raises important questions that resonate with discussions of memory in agentic systems. Just as AI is transforming human memory, it is also redefining how autonomous systems store, retrieve, and utilize knowledge. If AI models are capable of generating memories rather than merely retrieving them, what does this mean for agentic workflows that rely on past interactions to inform future decisions? How do we differentiate between an AI's learned experience and an AI-generated reconstruction of past events? These are critical considerations as we explore the role of memory in AI-driven systems, where the distinction between stored knowledge and dynamically created pasts may become increasingly blurred. This invites further reflection on AI’s growing role in shaping collective memory.

“Memory can change the shape of a room; it can change the color of a car. And memories can be distorted. They're just an interpretation, they're not a record, and they're irrelevant if you have the facts.” – Leonard Shelby, “Memento”… or AI?


📨 If you want to receive our articles straight to your inbox, please subscribe here


Resources that were used to write this article

Sources from Turing Post
