Context Graph, AI-driven Automation & Hallucinations
There has been a lot of buzz, discussion, and interest around “Context Graph” over the last three to four months. No event, panel discussion, or conference seems complete without it.
What is a Context Graph?
Your enterprise runs on two kinds of knowledge. The first kind lives in your systems of record: Salesforce records the deal. Workday records the raise. SAP records the order. The second kind NEVER gets captured anywhere: what data was pulled, what policy applied, who approved, what exception was granted, and how these were actually implemented. It lives in people’s heads, MS Teams threads, or Zoom calls.
Now AI agents are being deployed to automate real workflows: RFP responses, resolving escalations, tech documentation, retention campaigns, quote approvals, compliance reviews, etc. And 80% of these keep hitting a wall, because the models can’t access that second kind of knowledge and have been trained ONLY on the first kind. They only see the outcome (the discount was 25%), never the reasoning (three critical incidents + a VP approval + a precedent from Q2). So, when forced to act anyway, AI systems end up fabricating plausible sequences to complete the orchestration.
So the solution is to have something that records decisions and the context behind those decisions. That something is precisely the Context Graph. It’s a new layer of enterprise software that sits in the AI agent’s execution path and records every decision as a structured artifact. Over time, these cross-system “decision traces” accumulate into a searchable, auditable graph of how your company actually makes decisions. In a way, the Context Graph is a system of record of decisions and for decisions.
But What Exactly is a Context Graph?
Technically, it’s not one thing. It’s best understood as a combination of three layers working together.
Layer 1 – Event log (immutable records of decisions)
Every time an AI agent (or human) makes a decision, you write an immutable record: what happened, who approved, what data was used, what exception was granted. These are often emitted from agent orchestration layers and stored in logs, event streams (e.g., Kafka), databases (e.g., Postgres), or document stores. This is just structured data sitting in a table or a stream. It’s essentially turning actions into a durable “system of record for decisions” rather than just transactions or final states.
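As a rough sketch of what such an immutable decision record might look like. The field names, the append-only list standing in for a real event stream, and the content-addressed id are all illustrative assumptions, not any vendor’s schema:

```python
import hashlib
import json
import time
from dataclasses import dataclass, field, asdict

# Hypothetical shape of a single decision trace; field names are
# invented for illustration, not taken from any specific product.
@dataclass(frozen=True)  # frozen -> the record cannot be mutated once created
class DecisionTrace:
    actor: str       # agent or human who decided
    action: str      # what happened
    inputs: tuple    # what data was consulted
    approver: str    # who approved
    exception: str   # what exception was granted, if any
    timestamp: float = field(default_factory=time.time)

    def record_id(self) -> str:
        # Content-addressed id: any tampering changes the hash.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

# Append-only log standing in for Kafka/Postgres: we only ever add entries.
decision_log: list[dict] = []

def emit(trace: DecisionTrace) -> str:
    entry = asdict(trace) | {"id": trace.record_id()}
    decision_log.append(entry)
    return entry["id"]

trace = DecisionTrace(
    actor="pricing-agent",
    action="approved 25% discount",
    inputs=("3 critical incidents", "Q2 precedent"),
    approver="VP Sales",
    exception="exceeds 15% standard cap",
)
emit(trace)
```

In a production system the log would live in an event stream or database with write-once semantics enforced at the storage layer; the frozen dataclass and hash here only gesture at that property.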
Layer 2 – Graph database (connecting relationships)
This is where raw logs get transformed into a knowledge graph (or property graph) using tools like Neo4j or Amazon Neptune. Nodes might represent deals, VPs, policies, incidents, or past decisions; edges capture relationships like “overrode,” “linked to,” “influenced by,” or “similar to.” This enables multi-hop traversal and complex queries that SQL struggles with, e.g., finding patterns across entities, time, and exceptions. The graph turns isolated records into traversable, relational institutional memory.
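A minimal sketch of the multi-hop traversal described above, using plain Python dictionaries in place of a real graph database like Neo4j. The node names and edge types are invented for illustration:

```python
from collections import deque

# Toy property graph: adjacency list of (edge_type, target) pairs.
graph = {
    "discount-decision-0042": [("approved_by", "vp-sales"),
                               ("influenced_by", "incident-311"),
                               ("similar_to", "discount-decision-0017")],
    "discount-decision-0017": [("overrode", "policy-max-15pct"),
                               ("linked_to", "deal-acme-q2")],
    "incident-311": [("linked_to", "deal-acme-q2")],
}

def multi_hop(start: str, max_hops: int = 2):
    """Breadth-first walk returning every (edge path, node) reachable
    within max_hops - the kind of query that takes painful self-joins in SQL."""
    seen, results = {start}, []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_hops:
            continue
        for edge, target in graph.get(node, []):
            results.append((path + [edge], target))
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [edge]))
    return results

# Which policies were overridden anywhere within 2 hops of this decision?
hits = [t for path, t in multi_hop("discount-decision-0042")
        if path[-1] == "overrode"]
```

A graph database would express the same question as a declarative pattern query (e.g., a variable-length path match in Cypher) and index it properly; the point here is only that the traversal, not a join, is the natural primitive.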
Layer 3 – Retrieval + Reasoning layer (reasoning substrate)
This is the key differentiator that makes it more than “just a database” or “just a knowledge graph.” When an AI agent is about to make a new decision, it queries the graph using vector search and graph traversal to find relevant past decisions. The retrieved subgraph or context is then fed to the LLM for grounded reasoning. The LLM reasons with that retrieved context, not from scratch. This is what elevates it beyond a database: the agent learns from organizational precedent and takes the right next steps.
Does it Solve Hallucination?
The true value of context graphs lies in their ability to capture the “how” of work, the observable digital trail of actions, collaborations, and decisions, and to infer the “why” from patterns over time. So, a context graph does reduce a specific class of hallucination, by providing this decision-specific context to prevent AI from filling the gap with plausible-sounding fabrication. If the agent can actually look up “what did we do for an Oil & Gas customer in Q2 and why,” it is grounding its recommendation in the decision-fact rather than confabulation.
But decision traces are hybrid: structured metadata around a natural language core. The structured shell is machine-written, by the agent, at decision time. Unambiguous. Queryable. This part is reliable. The natural language interior is where accuracy degrades. That free-text field is the weakest link in the entire concept. It introduces a new hallucination surface.
The agent must now reason abstractly about past decisions described in prose, and LLMs are quite good at finding “support” for whatever conclusion they’re already leaning toward. A bad actor (or a poorly prompted agent) can find a precedent in the graph and use it selectively, or misread the conditions under which it was applied. Or it might confidently compound the wrong behavior using bad precedents.
The deeper issue is that hallucination is partly a confidence-calibration problem, and a Context Graph doesn’t fix that. An agent that retrieves a genuinely relevant precedent will cite it correctly. But an agent that retrieves a superficially similar but actually different precedent may cite it with equal confidence, and now the hallucination is laundered through what looks like institutional memory. That’s going to be harder to catch than a raw hallucination.
Advantage CohGent
There are three points that have come up in our discussions with our customers in this context.
One, enterprises are betting their AI future on institutional memory, but no one is asking whether that memory is worth trusting. Two, the “why” layer of the context graph, where all the governance value lives, is still structurally weak and is (also) captured in natural language. Three, a bad precedent compounded at scale by AI is worse than no precedent at all. CohGent sits exactly at the point where all three of those vulnerabilities converge.
Look, every context graph vendor (if any), every RAG pipeline, every AI agent is making a silent and dangerous assumption: that the knowledge being fed in is coherent, consistent, and reliable. But enterprise documentation, as is, is significantly hallucinogenic. CohGent is the quality gate that sits before every one of those pipelines. It ensures the enterprise knowledge the agent reasons from is coherent before it ever touches the agent. The bigger the AI deployment, the more critical CohGent becomes, because bad knowledge amplified at scale by AI is a liability that compounds with every decision the agent makes.
What Foundation Capital Got Wrong?
The thesis is genuinely compelling. But their core argument, that incumbents cannot succeed in owning the context graph, rests on an assumption of “political neutrality”. This implies that only startups, unburdened by entrenched interests, can earn the trust of enterprises to sit across all systems simultaneously and act as a neutral orchestration layer.
Now, think about what the context graph actually is, once it’s mature. It’s not just logs. It’s the complete, queryable map of how your company actually makes decisions. Every exception pattern, every approval threshold, every place where your employees deviate from policy, every competitive sensitivity embedded in your pricing logic, every pattern of who really has power versus who has the title. That is arguably the single most sensitive asset an enterprise possesses. More sensitive than your customer list. More sensitive than your financials. Because it reveals not just what you do, but how you think and where you’re weak.
Now ask the question plainly:
Would you hand that most valuable asset to a 3-year-old startup with 40 employees, a Series B, and no guarantee they exist in 5 years?
Or would you hand it to SAP or Salesforce, which has been in your data center for 20 years, has signed enterprise agreements with liability clauses, has a legal entity you can sue, has survived multiple recessions, and has a brand its own CEO’s job depends on protecting?
The startup neutrality argument is a VC narrative dressed up as strategy. In typical “standard VC writing” fashion, what Foundation Capital has done is take a compelling macro thesis about a missing enterprise layer and then retrofit their existing portfolio companies as examples of it. It makes sense on a whiteboard, “no competing interests, pure broker”, but it completely ignores how enterprise trust actually works. Understandably, Foundation Capital cannot say this, because it would undermine their entire fund thesis.
