Why your coding agent should remember

April 2026

Close your terminal. Open it again. Ask your coding agent what it was doing five minutes ago.

It has no idea.

It doesn't remember the architecture decision you debated for twenty minutes. It doesn't know that the PostgreSQL migration failed on the users table last session, or that you settled on a particular test pattern three days ago. Every session is session zero. You are perpetually onboarding your own tool.

This is the state of AI coding agents today. They are extraordinarily capable within a single conversation and completely amnesiac across conversations. The moment you close the terminal, everything the agent learned about your project, your preferences, and your codebase evaporates. The next time you open it, you start over.

We've been treating this as normal. It isn't. It's a design choice, and it's the wrong one.

The disposable session problem

Consider what actually happens when you work with a coding agent. The first few minutes of every session are wasted on orientation. You explain the project structure. You point out the relevant files. You mention that this particular module has a quirk, that the tests need a running database, that the API client lives in an unexpected directory.

The agent absorbs all of this, does good work, and then you close the terminal. Tomorrow, you do it again. The same explanations. The same pointing. The same slow ramp-up.

Some agents have made gestures toward solving this. Chat history you can scroll through. Project files that inject context. But these are patches on a fundamentally disposable architecture. The session is still the unit. When it ends, coherence ends with it.

The problem isn't that agents are forgetful. The problem is that forgetting is baked into the architecture. Sessions are designed to be stateless. Context windows are treated as scratch space that gets wiped. There's no mechanism for the agent to carry forward what it learned, because the system never expected it to.

And this creates a ceiling. No matter how smart the model gets, if it can't remember what happened yesterday, it can't build on yesterday's work. Intelligence without continuity is intelligence on a treadmill.

Persistence as a system, not a feature

When we built ag, we didn't add memory as a feature. We built the entire agent around a thesis: persistence is the core capability, not an add-on.

That thesis produced four mechanisms that work as a unified system:

Memory operates in three tiers. Global memory stores your preferences and patterns across all projects. Project memory stores architecture decisions, tech stack context, and conventions for a specific codebase. Session history preserves the raw conversation for replay and debugging. All three are plain Markdown files you can read and edit directly. They're injected into every system prompt, so the agent starts every session knowing what it learned in the last one.

Checkpoints snapshot both conversation state and file state at every turn. This isn't version control—it's undo for the entire interaction. Made a wrong turn three steps back? Rewind to that checkpoint. The files roll back. The conversation rolls back. You don't lose the work that came before, and you don't have to explain what happened. The agent picks up exactly where the checkpoint left off.

Result refs solve a problem that becomes critical in long sessions: context window bloat. When the agent runs a tool that produces a large result—a grep that returns 200 lines, a git diff spanning several files—that output goes into the context window and stays there, taking up space that could hold actual reasoning. Result refs cache these outputs to disk the first time they appear, then replace them with compact summaries on subsequent API calls. If the agent needs the full content later, it pulls it back from the cache. The context window stays clean without losing access to anything.

Turn summaries compress each turn into a structured record: what was attempted, what happened, which files were touched, what errors occurred, what decisions were made. When the conversation gets long, these summaries preserve the signal while discarding the noise. File paths survive. Error messages survive. The reasoning chain survives. What doesn't survive is the verbose tool output and conversational filler that took up 80% of the tokens.

These four mechanisms layer on top of each other. Memory handles cross-session persistence. Checkpoints handle within-session safety. Result refs handle within-turn efficiency. Turn summaries handle cross-turn compression. Together, they form a complete persistence stack.

The compound effect

Something interesting happens when your agent actually remembers.

Session two is faster than session one. The agent already knows your project uses a monorepo with shared types in /packages/shared. It knows your tests need DATABASE_URL set. It knows you prefer explicit error handling over try-catch-swallow patterns.

Session five is faster still. The agent has seen you solve three similar problems. It's watched you refactor a service module and remembers the pattern you converged on. It knows which directories are hot and which are stable.

By session twenty, the agent has a mental model of your project that would take a new team member weeks to build. Not because it was explicitly taught, but because it was there. It accumulated context through the natural process of doing work.

This is the compound effect of persistence. Each session deposits a thin layer of knowledge. Over time, those layers stack into something that feels less like a tool and more like a collaborator who's been on the project since the beginning.

Without persistence, every session is equally expensive. The twentieth session costs the same in ramp-up time as the first. You're not building anything—you're renting. With persistence, you're investing. The agent gets better at your project specifically, not just at coding in general.

What this isn't

ag is not trying to be the smartest model. It's model-agnostic—it works with Claude, GPT-4, Gemini, local models through OpenRouter, or whatever you prefer. The intelligence comes from the model. What ag provides is continuity.

It's not trying to be the prettiest IDE integration. It's a terminal tool. It's approximately 130 kilobytes with zero production dependencies. You can read the entire source in an afternoon.

And it's not trying to be a platform. There's no cloud service, no SaaS subscription, no telemetry phoning home. Your memory files are Markdown on your disk. Your checkpoints are files in your project directory. Everything is local, inspectable, and deletable.

The bet is simple and specific: the bottleneck in AI-assisted coding isn't model intelligence. It's amnesia. The models are smart enough. What they lack is the ability to carry forward what they've learned. Fix that, and everything else—the speed, the accuracy, the usefulness—improves as a consequence.

ag isn't a smaller Claude Code or a bigger pi. It's a different bet about what coding agents are for.

Most tools in this space are optimizing for the single session: faster responses, better tool use, smarter reasoning within one conversation. Those are good things. But they're optimizing the wrong unit. The unit that matters isn't the session. It's the project. And projects unfold over weeks, months, years.

An agent that treats each session as disposable can never truly know your project. An agent that persists can.

ag is open source under Apache 2.0. Install it with npx @elementics/ag and start a session that actually remembers the last one.

Install ag · Read the docs · View on GitHub

← Back to ay-gee.com