LangChain Built a Database Because Agent Traces Stopped Looking Like Logs

Anatoliy Kolodkin

27 May 2026 • 5 min read

LangChain did not build SmithDB because databases are trendy. It built SmithDB because agent traces stopped behaving like normal logs.

That distinction matters. The obvious version of this story is “LangChain made LangSmith faster.” True, but incomplete in the way a passing unit test can still hide a broken architecture. The more important signal is that production agents are generating a kind of operational record that generic observability stores were not designed to serve: deeply nested spans, partially arriving events, long-running executions, huge JSON payloads, multi-modal artifacts, tool calls, evaluator feedback, thread reconstruction, and queries that ask not merely “what happened?” but “which branch of this agent’s behavior poisoned the outcome?”

SmithDB is LangChain’s answer to that workload. According to LangChain’s primary announcement, the new Rust data layer now backs 100% of LangSmith US Cloud ingestion and 100% of tracing UI query traffic. That includes major workflows such as metadata filtering, feedback filtering, text search, tree filters, trace filters, thread filtering, and aggregations over cost, latency, token usage, and evaluator scores. LangChain says core LangSmith experiences are now up to 12x faster than before.

The latency numbers are not decorative. LangChain reports trace tree loads at P50 92ms and P99 595ms; single run loads at P50 71ms and P99 358ms; run filtering at P50 82ms and P99 434ms; trace ingestion at P50 630ms and P99 1.47s; full-text search at P50 400ms and P99 870ms; and thread filtering at P50 131ms and P95 268ms. Those figures matter because agent debugging is often interactive. If every filter takes seconds, the team stops asking good questions and starts screenshotting vibes.

The trace is becoming the product surface

The architectural choices are telling. SmithDB is written in Rust, uses Apache DataFusion as the query engine, builds on the Vortex file toolkit, stores durable trace data in object storage, keeps a small Postgres metastore, and runs stateless ingestion, query, and compaction services. Under the hood, LangChain describes an object-storage-backed LSM, progressive querying over object storage for Top-K-style newest-run queries, direct reads from ingestion-node SSD and memory cache for fresh data, event-sequence modeling for long-running runs, time-tiered compaction, deletion and upgrade vectors for immutable files, late materialization for large JSON fields, and object-storage-optimized inverted indexes for full-text and JSON key-path search.
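To make one of those ideas concrete, here is a minimal sketch of progressive querying for a "newest K runs" request over immutable files, with a deletion vector hiding removed rows. It is an illustration under stated assumptions, not SmithDB's implementation: `ImmutableFile`, `newest_runs`, and the in-memory rows are hypothetical stand-ins for files in object storage.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class ImmutableFile:
    """Hypothetical stand-in for one immutable, non-empty file in object storage."""
    name: str
    rows: list  # list of (timestamp, run_id) tuples
    deleted: set = field(default_factory=set)  # deletion vector: row indexes to skip

    @property
    def max_ts(self):
        # Real systems would keep this in file metadata instead of scanning rows.
        return max(ts for ts, _ in self.rows)


def newest_runs(files, k):
    """Return (k newest live run_ids, number of files actually read)."""
    top = []  # min-heap of (timestamp, run_id), never larger than k
    files_read = 0
    for f in sorted(files, key=lambda f: f.max_ts, reverse=True):
        # Stop early: no row in this file or any older file can displace
        # the current k-th newest result.
        if len(top) == k and f.max_ts <= top[0][0]:
            break
        files_read += 1
        for idx, row in enumerate(f.rows):
            if idx in f.deleted:
                continue  # logically deleted, although the file itself is unchanged
            if len(top) < k:
                heapq.heappush(top, row)
            elif row > top[0]:
                heapq.heapreplace(top, row)
    return [run_id for _, run_id in sorted(top, reverse=True)], files_read


files = [
    ImmutableFile("old", [(1, "a"), (2, "b")]),
    ImmutableFile("mid", [(3, "c"), (4, "d")]),
    ImmutableFile("new", [(5, "e"), (6, "f")], deleted={1}),
]
result, files_read = newest_runs(files, 2)
```

In this toy data the newest row is deleted, so the query returns the two newest live runs and never opens the oldest file. That early exit is the whole appeal of the pattern when each file read is an object-storage round trip.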

That is a lot of database machinery for what many teams still call “logs.” It is also exactly the point. Modern agent traces can contain hundreds of deeply nested spans. A span start event may arrive minutes or hours before the end event. Payloads increasingly include images, audio, long tool outputs, structured objects, and model messages whose meaning depends on where they sit in the execution tree. A normal APM trace is usually a bounded request. An agent trace is closer to a behavioral record of a distributed workflow that happens to speak natural language.
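The gap between a span's start and end is easier to reason about with a small example. The sketch below treats a run as a sequence of events folded into state, so a run with only a start event is still queryable as "running". The event shape (`run_id`, `kind`, `ts`, `outputs`) and the `RunState` class are assumptions made for illustration, not LangSmith's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunState:
    """Hypothetical per-run state built up from partial events."""
    run_id: str
    started_at: Optional[float] = None
    ended_at: Optional[float] = None
    outputs: Optional[dict] = None

    @property
    def status(self) -> str:
        if self.ended_at is not None:
            return "finished"
        return "running" if self.started_at is not None else "unknown"


def fold_events(events):
    """Merge start/end events into per-run state, in whatever order they arrive."""
    runs = {}
    for ev in events:
        run = runs.setdefault(ev["run_id"], RunState(ev["run_id"]))
        if ev["kind"] == "start":
            run.started_at = ev["ts"]
        elif ev["kind"] == "end":
            run.ended_at = ev["ts"]
            run.outputs = ev.get("outputs")
    return runs


events = [
    # The end event for r1 arrives first; its start shows up later.
    {"run_id": "r1", "kind": "end", "ts": 3600.0, "outputs": {"answer": "ok"}},
    {"run_id": "r2", "kind": "start", "ts": 10.0},
    {"run_id": "r1", "kind": "start", "ts": 0.0},
]
runs = fold_events(events)
```

Here `r1` ends an hour after it starts and is reported as finished even though its events arrived out of order, while `r2` stays visible as running. A store that only accepts complete spans would show neither until the end event landed.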

This is where generic observability abstractions start to crack. In a web service, you usually ask which request was slow, which dependency failed, and which deployment caused the regression. In an agent system, you ask which tool call changed the plan, which retry branch burned the budget, which evaluator score dropped after a prompt edit, which human approval resumed the run, which retrieved document caused the hallucination, and whether the same user-visible failure shares a hidden path through the graph. Those are trace-shaped questions, but not traditional tracing questions.
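A question like "which retry branch burned the budget" is a tree query, not a row filter. As a rough sketch, assuming each span carries hypothetical `id`, `parent_id`, and `tokens` fields, the answer comes from summing every subtree and comparing the branches under the root:

```python
from collections import defaultdict


def subtree_tokens(spans):
    """Total tokens for every span, including all of its descendants."""
    children = defaultdict(list)
    own = {}
    for s in spans:
        children[s["parent_id"]].append(s["id"])
        own[s["id"]] = s["tokens"]

    totals = {}

    def total(span_id):
        # Memoized recursion; fine for a sketch, though very deep trees
        # would want an iterative traversal.
        if span_id not in totals:
            totals[span_id] = own[span_id] + sum(total(c) for c in children[span_id])
        return totals[span_id]

    for span_id in own:
        total(span_id)
    return totals, children


def costliest_branch(spans, root_id):
    """Return (span_id, tokens) for the most expensive direct child of the root."""
    totals, children = subtree_tokens(spans)
    return max(((c, totals[c]) for c in children[root_id]), key=lambda pair: pair[1])


spans = [
    {"id": "root", "parent_id": None, "tokens": 0},
    {"id": "plan", "parent_id": "root", "tokens": 100},
    {"id": "retry_a", "parent_id": "root", "tokens": 50},
    {"id": "tool_1", "parent_id": "retry_a", "tokens": 400},
    {"id": "tool_2", "parent_id": "retry_a", "tokens": 300},
]
branch = costliest_branch(spans, "root")
```

The retry span itself looks cheap at 50 tokens. Only the subtree total exposes it as the expensive branch, which is why flat per-span filters tend to miss this kind of failure.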

The customer quotes in LangChain’s announcement point to the same pressure. Clay says it logs “hundreds of millions of agent observability events” to LangSmith every day. Cogent Security says SmithDB showed traces “in seconds instead of minutes” compared with other providers. Unify frames the value around making large tool-call traces easier to query and read across projects. None of that sounds like dashboard polish. It sounds like teams trying to make agent behavior inspectable before it becomes operationally unmanageable.

Framework vendors are moving down-stack

The competitive implication is easy to miss. LangChain started as an application framework. LangSmith became the companion surface for tracing, evaluation, deployment, and debugging. SmithDB moves another layer down: storage and query infrastructure built specifically for agent traces. That is what platform companies do when a generic substrate cannot carry the workload they need to make their higher-level product credible.

This does not mean every team should build a bespoke database for agents. Please do not turn the Monday architecture meeting into a Rust storage-engine pitch unless you enjoy paperwork. The practical lesson is narrower and more useful: agent observability has its own query patterns, and teams should evaluate vendors against those patterns instead of checking a box labeled “tracing.”

If you are adopting an agent platform, ask whether it can reconstruct threads across runs. Ask whether it can search inside large tool payloads without turning the UI into a loading spinner. Ask whether it preserves partial events and replayable state transitions. Ask whether it can aggregate cost and evaluator scores across arbitrary filters. Ask whether it can answer what happened when a span started an hour before it ended. Ask what gets redacted, what gets retained, what can be self-hosted, and what happens when a customer asks for deletion across immutable object-storage files.
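The deletion question deserves a concrete picture. One common pattern for immutable files, sketched here with hypothetical dict-based files and a `customer_id` field, is a two-step delete: mark rows in a deletion vector so queries stop returning them immediately, then let compaction rewrite the data without them. This is an assumption about how such systems generally work, not a description of any vendor's internals.

```python
def request_deletion(files, customer_id):
    """Logical delete: record matching row positions in each file's deletion vector."""
    for f in files:
        for idx, row in enumerate(f["rows"]):
            if row["customer_id"] == customer_id:
                f["deleted"].add(idx)


def visible_rows(f):
    """Rows a query is allowed to see: everything not in the deletion vector."""
    return [row for idx, row in enumerate(f["rows"]) if idx not in f["deleted"]]


def compact(files):
    """Physical delete: build one new file holding only live rows.

    The caller is then responsible for removing the old files from storage.
    """
    return {"rows": [row for f in files for row in visible_rows(f)], "deleted": set()}


files = [
    {"rows": [{"customer_id": "acme", "run": 1}, {"customer_id": "globex", "run": 2}],
     "deleted": set()},
    {"rows": [{"customer_id": "acme", "run": 3}], "deleted": set()},
]
request_deletion(files, "acme")
compacted = compact(files)
```

The useful vendor question hides between the two steps: after the logical delete, the customer's bytes still sit in the old files until compaction runs and those files are removed. Ask how long that window is and whether it is documented.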

The self-hosting question is especially important. LangSmith is an enterprise product, and enterprise customers will care deeply about where traces live, how retention works, and whether observability data can stay in their own cloud. Object storage plus stateless ingestion and query services is a plausible answer because it is more portable than a bespoke local-disk cluster. But LangChain currently says self-hosted SmithDB is coming “soon.” Until that ships and proves boring to operate, the portability story is an architecture promise backed by strong cloud production evidence, not yet a finished enterprise deployment model.

There is also a governance angle. Once traces become the behavioral record of the application, the database behind those traces becomes part of the runtime’s trust boundary. Agent traces contain prompts, user inputs, tool outputs, credential-adjacent metadata, business decisions, intermediate reasoning artifacts, and sometimes customer data that was never meant to become observability exhaust. Fast search is useful. Fast search across sensitive traces that were over-collected in the first place is how debugging turns into discovery risk.

For practitioners, the action item is not “switch to SmithDB.” It is to update the observability checklist. Treat agent traces as production evidence, not debug leftovers. Define retention rules. Make deletion real. Capture tool calls, retrieved context, evaluator outputs, token usage, cost, approvals, and state transitions in a queryable form. Test trace reconstruction as part of release readiness. If your agent cannot explain its own path through the system, you do not have production observability. You have a transcript with ambitions.
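"Test trace reconstruction as part of release readiness" can start as a very small check. The sketch below assumes spans exported as dicts with hypothetical `id`, `parent_id`, and `ended_at` fields, and it only covers the basics: one root, no dangling parents, and no unfinished spans in a trace that should be complete. Cycle detection and payload checks are deliberately left out.

```python
def reconstruction_problems(spans):
    """Return human-readable problems that would prevent rebuilding the trace tree."""
    by_id = {s["id"]: s for s in spans}
    problems = []

    roots = [s["id"] for s in spans if s["parent_id"] is None]
    if len(roots) != 1:
        problems.append(f"expected exactly 1 root span, found {len(roots)}")

    for s in spans:
        parent = s["parent_id"]
        if parent is not None and parent not in by_id:
            problems.append(f"span {s['id']} references missing parent {parent}")
        if s.get("ended_at") is None:
            problems.append(f"span {s['id']} never finished")
    return problems


good_trace = [
    {"id": "root", "parent_id": None, "ended_at": 5.0},
    {"id": "tool", "parent_id": "root", "ended_at": 4.0},
]
broken_trace = [
    {"id": "root", "parent_id": None, "ended_at": 5.0},
    {"id": "tool", "parent_id": "lost", "ended_at": None},
]
good_problems = reconstruction_problems(good_trace)
broken_problems = reconstruction_problems(broken_trace)
```

Run against traces captured from a staging agent, a check like this fails the release when instrumentation silently drops a parent link or an end event, which is usually cheaper than discovering the gap during an incident.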

SmithDB is interesting because it makes the hidden infrastructure requirement explicit. Production agents are forcing new primitives below the framework layer: durable execution, replay, event logs, trace databases, policy-aware tooling, and cost-aware inspection. The industry keeps trying to sell agents as a model upgrade. The teams operating them are discovering they are a systems problem. LangChain just put a database-shaped receipt on the table.

Sources: HackerNoon, LangChain, Apache DataFusion, Vortex
