Supermodel Public API Explainer
Five primitives. Four applications. Build the rest.
The Supermodel API has nine public endpoints. Five of them return graphs. Four of them return things you do with graphs.
That split is deliberate. The graphs are the primitive. The analyses are applications of the primitive. Useful in their own right, but also a demonstration of what becomes easy once the graph exists. If you don’t see the thing you want on the analysis side, the graph is already there. Build it yourself.
This post walks through all nine. For each one: what it is, why we ship it, and what it’s good for.
The general shape of every endpoint is the same. You send a zipped repository. You get back a graph or an analysis. Large jobs return a 202 Accepted with a Retry-After and a job handle you can poll; small jobs return the result directly. Authentication is an X-Api-Key header.
Every request also takes an Idempotency-Key header: a string you choose that lets us deduplicate identical calls. Post the same key twice and you get the same job back instead of running it again. We recommend a content hash, usually the git commit SHA plus the endpoint name. For the Supermodel graph on next.js that looks like:
```
Idempotency-Key: nextjs:supermodel:a0376cf
```
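If you'd rather see the request/poll loop as code, here's a minimal TypeScript sketch that submits a zip and waits for the result. The 202 / Retry-After behavior and both headers are described above; the assumption that the job handle arrives as a Location header is ours for illustration, so check the per-endpoint schemas for the real shape.

```ts
import { readFile } from "node:fs/promises";

// Submit a zipped repo to one endpoint and wait for the result.
// Assumes Node 18+ (global fetch, FormData, Blob).
async function submitAndWait(endpoint: string, zipPath: string, idempotencyKey: string) {
  const form = new FormData();
  form.append("file", new Blob([await readFile(zipPath)]), "repo.zip");
  const auth = { "X-Api-Key": process.env.SUPERMODEL_API_KEY! };

  let res = await fetch(`https://api.supermodeltools.com${endpoint}`, {
    method: "POST",
    headers: { ...auth, "Idempotency-Key": idempotencyKey },
    body: form,
  });

  // 202 means a large job: wait Retry-After seconds, then poll the job handle.
  while (res.status === 202) {
    const waitMs = Number(res.headers.get("Retry-After") ?? "5") * 1000;
    const jobUrl = res.headers.get("Location")!; // assumption: handle comes back as a Location header
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    res = await fetch(jobUrl, { headers: auth });
  }
  return res.json();
}
```

Calling submitAndWait("/v1/graphs/supermodel", "/tmp/repo.zip", "nextjs:supermodel:a0376cf") gives you the same artifact as the curl example at the end of this post.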
If you only call one endpoint, make it POST /v1/graphs/supermodel. It bundles every primitive below into a single artifact, and it’s what our own internal tools consume by default. The rest of this page is reference for when you want a specific graph on its own. The full spec, including per-endpoint request/response schemas and an interactive playground, lives at docs.supermodeltools.com.
The primitives
Parse graph: POST /v1/graphs/parse
What it is. The lowest-level view of your code. We parse every source file with tree-sitter and emit the structural relationships: files contain symbols, symbols declare children, types extend other types, functions reference other functions by name. It’s the AST, flattened into a queryable graph instead of a tree you have to walk.
Why we ship it. Every analysis in this API starts here. Parse graphs are what you build on when you want to know not just “what calls what” but “what is what”: every class, function, type, constant, interface, with its position in the file and its relationship to the symbols around it. If you’re writing your own code intelligence tool, you should not be parsing source files yourself. You should be reading this.
What to do with it. Build your own symbol search. Build a custom “find all exports” pass. Layer your own reachability heuristics on top of the declarations we already resolved. Use it as the input to another analysis we haven’t written yet.
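For a flavor of what "build it yourself" looks like, here's a sketch of a find-all-exports pass over the parse graph in TypeScript. The node and edge shapes below (label, name, exported) are illustrative stand-ins, not the documented schema.

```ts
// Illustrative node/edge shapes; the real parse-graph schema is in the docs.
type ParseNode = { id: string; label: string; name: string; file: string; exported?: boolean };
type ParseEdge = { from: string; to: string; type: string };
type ParseGraph = { nodes: ParseNode[]; edges: ParseEdge[] };

// "Find all exports": a filter over declarations we already resolved, no parsing required.
function findExports(graph: ParseGraph): ParseNode[] {
  return graph.nodes.filter(
    (n) => n.exported === true && ["Function", "Class", "Type", "Constant"].includes(n.label)
  );
}
```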
Dependency graph: POST /v1/graphs/dependency
What it is. File-level dependencies. Which file imports which file, across every language in the repo. Follows module resolution conventions per language so the edges actually mean something. The graph distinguishes local dependencies (files inside the repo) from external ones (third-party packages: npm, pip, go modules, crates) by node label: a file that imports lodash gets an edge into an ExternalDependency node; a file that imports ./utils gets an edge into a LocalDependency node pointing at another file. One graph, both worlds, one filter to split them.
Why we ship it. It’s the coarsest useful view of a codebase. Most architectural questions (“are these two subsystems actually separate?”, “what does this module depend on?”, “is this package a leaf or a hub?”) are questions about the dependency graph. The reason they feel hard to answer with grep is that they’re not string-matching questions. They’re graph questions.
What to do with it. Enforce layering rules. Find internal files that everyone depends on (the hubs you can’t change cheaply). Find files that depend on everyone (the integration layers). Filter to ExternalDependency nodes and you have a code-derived SBOM: which of your files actually pulls in which third-party package, not according to your lockfile, but according to the code. Render your architecture diagram from something real instead of something someone drew in 2023.
On next.js packages/ (2,308 files), the dependency graph comes back with 4,928 LocalDependency nodes and 361 ExternalDependency nodes, connected by 4,422 imports edges. Roughly a 14:1 ratio of internal to external dependencies. A number that tells you something real about the shape of the codebase.
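As a sketch of the SBOM idea, here's how the LocalDependency-to-ExternalDependency filter might look in TypeScript. The node labels come from the description above; the field names are assumptions.

```ts
// Illustrative shapes; the LocalDependency / ExternalDependency labels are from the post.
type DepNode = { id: string; label: "LocalDependency" | "ExternalDependency"; name: string };
type DepEdge = { from: string; to: string; type: string };
type DepGraph = { nodes: DepNode[]; edges: DepEdge[] };

// Code-derived SBOM: package -> the files that actually import it.
function sbom(graph: DepGraph): Map<string, string[]> {
  const byId = new Map(graph.nodes.map((n) => [n.id, n]));
  const usage = new Map<string, string[]>();
  for (const e of graph.edges) {
    const from = byId.get(e.from);
    const to = byId.get(e.to);
    if (from?.label === "LocalDependency" && to?.label === "ExternalDependency") {
      usage.set(to.name, [...(usage.get(to.name) ?? []), from.name]);
    }
  }
  return usage;
}
```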
Call graph: POST /v1/graphs/call
What it is. Function-level calls. Every resolved callsite from one function to another, across files and modules. Not just “file A imports file B”. Actually “function foo calls function bar, on this line, with this resolution.”
Why we ship it. The call graph is the thing every AI agent silently wants and doesn’t have. When an agent gets asked to modify a function, the first question it should ask is “who calls this?” The second is “what does this call?” Without a call graph, the agent has to reconstruct the answer with grep, one match at a time, and it will miss the ones that grep can’t see: method dispatch, re-exports, aliased imports. With a call graph, it’s a lookup.
What to do with it. Impact analysis. Dead code detection. Refactoring tools that know where the callers are. Pretty much any question that starts with “if I change this function…” is a call-graph query.
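To make the lookup concrete, here's a minimal caller/callee query over the call graph in TypeScript, with illustrative edge fields.

```ts
// Illustrative call-graph edge; the documented schema may differ.
type CallEdge = { caller: string; callee: string; file: string; line: number };

// "Who calls this?" and "what does this call?" as lookups instead of greps.
const callersOf = (edges: CallEdge[], fn: string) => edges.filter((e) => e.callee === fn);
const calleesOf = (edges: CallEdge[], fn: string) => edges.filter((e) => e.caller === fn);
```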
Domain graph: POST /v1/graphs/domain
What it is. A higher-level grouping of the codebase into domains and subdomains, the bounded contexts that would show up on a whiteboard if you asked the team to draw their architecture. The model is loosely based on C4, which defines four levels: System Context, Container, Component, Code. Supermodel's hierarchy lines up one-for-one with those levels. The whole codebase graph is the System Context. Domains are Containers, cohesive subsystems that could reasonably live in their own deployable. Subdomains are Components, cohesive groupings within a subsystem. Functions and classes from the parse and call graphs are Code. Same four levels C4 uses, computed from the source instead of drawn in a meeting.
The computation is a mix of structural and semantic signal. Our graph algorithms produce candidate groupings of nodes that make up the domains and subdomains. An LLM classification pass then names and describes each group, so you get back ProjectScaffolding and OptimizationService instead of domain_3 and domain_7. The output is hierarchical: domains contain subdomains, and subdomains contain the functions, classes, and files that belong to them. IDs line up across every graph in the API, so “show me the call graph restricted to the Auth domain” is a filter, not a separate request.
Inter-domain edges come back with semantic labels inferred from the code: coordinates_workflow_with, validates_input_for, transforms_data_for, monitors_health_of, with a generic DOMAIN_RELATES as the fallback. The intent is that the domain graph can be read straight into a diagram or straight into a prompt without a humans-only translation step in between.
Why we ship it. A call graph with 40,000 nodes isn’t legible. A domain graph with five to ten nodes is. The domain graph is what you hand to a human, or to an agent that’s about to write documentation, or to a reviewer who needs to know which subsystem a PR touches. It’s the zoomed-out picture, computed from the zoomed-in one so the two always agree.
It also solves the drew-it-once-never-updated problem. Most architecture diagrams live in a slide from 2022. This one regenerates from the code on every request, which means the picture of “what this system is” is always the picture of what it currently is.
What to do with it. Auto-generated architecture diagrams that stay honest. PR labels that say which domain changed, useful for routing review to the right team. Onboarding documents that don’t go stale because they’re regenerated from the code. A domain filter on every other graph query, so you can ask “show me the call graph for Auth” without wading through the rest.
Run it on the packages/ tree of next.js (2,308 files) and you get five domains (NextRuntime, ProjectScaffolding, OptimizationService, QualityControl, DeveloperTools) split across 11 subdomains, each with a description, a responsibility list, and the three or four files most central to it. Same input as the other graphs, legible architecture diagram out the other side.
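Because node IDs line up across graphs, restricting another graph to one domain really is a set filter. A TypeScript sketch, with illustrative field names:

```ts
// Illustrative shapes; the shared-ID property is described above.
type Domain = { name: string; memberIds: string[] }; // functions, classes, files assigned to it
type CallEdge = { caller: string; callee: string };

// "Show me the call graph for Auth" as a filter, not a separate request.
function callGraphForDomain(domain: Domain, edges: CallEdge[]): CallEdge[] {
  const members = new Set(domain.memberIds);
  return edges.filter((e) => members.has(e.caller) && members.has(e.callee));
}
```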
Supermodel graph: POST /v1/graphs/supermodel
What it is. All of the above, bundled. The Supermodel Intermediate Representation (SIR) is a single artifact that contains the parse graph, dependency graph, call graph, and domain graph, cross-referenced and consistent, in one download. This is the endpoint to reach for by default. If you’re not sure which graph you need, you need this one.
Why we ship it. Most real tools want more than one of these at once. A dead code detector needs parse + call + entry points. An architecture doc generator needs domain + dependency. Fetching them separately means you pay for four analyses, you stitch them together yourself, and you hope the node IDs line up. The SIR is the version that’s already stitched.
What to do with it. Build the tool you actually wanted to build. The SIR is what our own internal analyses consume. If you’re doing anything non-trivial, start here.
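A rough sketch of consuming the SIR, reusing the submitAndWait helper from earlier. The top-level keys are assumptions for illustration; the authoritative layout is in the docs.

```ts
// Assumed bundle layout: four cross-referenced graphs in one artifact.
type Graph = { nodes: unknown[]; edges: unknown[] };
type Supermodel = { parse: Graph; dependency: Graph; call: Graph; domain: Graph };

const sir = (await submitAndWait(
  "/v1/graphs/supermodel",
  "/tmp/repo.zip",
  "nextjs:supermodel:a0376cf"
)) as Supermodel;
// Node IDs line up across the four graphs, so cross-referencing is set math,
// not entity resolution.
```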
The applications
These are four analyses we ship because we wanted them ourselves, and because each one is a worked example of what the graph is for. You can reproduce any of them from the graph primitives above. We ship them as endpoints because the common cases deserve a one-call answer.
Dead code analysis: POST /v1/analysis/dead-code
What it is. A ranked list of candidates for deletion. Symbols that are declared in the parse graph but unreachable in the call graph, starting from framework entry points (pages, controllers, route handlers, test files) and walking outward. Each candidate comes with a probability and a reason.
Why it’s an endpoint, not a recipe. Naive dead code detection is a bad experience. A call graph that doesn’t know about Next.js pages will tell you every page is unused. A parser that doesn’t know about barrel re-exports will flag every re-exported type. The endpoint is the version with those edge cases handled, so you get a list that mostly isn’t noise.
What to do with it. Run it in CI. Attach it to a PR bot that says “you added a function; here are three near it that we think nothing calls.” Feed it into an agent that’s about to write documentation, so the agent documents what’s alive.
We wrote about the benchmark results here. The short version: on the repo we measured most carefully, the graph-enabled agent was 30× cheaper in tool calls and 5× better at recall than the same agent with only grep.
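A sketch of the CI usage, in TypeScript. The candidate fields follow the description above (a probability and a reason per candidate); the exact names are assumptions.

```ts
// Illustrative candidate shape; the documented schema may differ.
type Candidate = { symbol: string; file: string; probability: number; reason: string };

// Fail the build when high-confidence dead-code candidates appear.
function gateDeadCode(candidates: Candidate[], threshold = 0.9): void {
  const confident = candidates.filter((c) => c.probability >= threshold);
  if (confident.length > 0) {
    for (const c of confident) console.error(`${c.file}: ${c.symbol} (${c.reason})`);
    process.exit(1);
  }
}
```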
Test coverage map: POST /v1/analysis/test-coverage-map
What it is. For every function in the codebase, whether it’s reachable from a test. Not “is this file imported by a test”. Actually, “does a test ever transitively call this function?” Computed from the call graph, with test files as roots.
Why it’s an endpoint, not a recipe. Coverage reports tell you which lines executed. This tells you which functions could execute from a test entry point. It’s a different question, and it’s more useful when you’re trying to decide what to write tests for, because it’s independent of whether anyone actually ran the suite. A function that’s not reachable from any test has no coverage no matter how high your line-coverage number is.
What to do with it. Prioritize where to add tests. Find the functions your critical paths go through that your tests don’t. Pair it with the impact endpoint to find high-blast-radius, low-coverage code: the parts most likely to ship a regression.
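A sketch of that pairing in TypeScript: rank untested functions by blast radius. Field names are assumptions.

```ts
// Illustrative shapes for the two analysis results.
type Coverage = { functionId: string; testReachable: boolean };
type Impact = { functionId: string; transitiveCallers: number };

// High-blast-radius, low-coverage code: the parts most likely to ship a regression.
function riskiest(coverage: Coverage[], impact: Impact[]): Impact[] {
  const untested = new Set(coverage.filter((c) => !c.testReachable).map((c) => c.functionId));
  return impact
    .filter((i) => untested.has(i.functionId))
    .sort((a, b) => b.transitiveCallers - a.transitiveCallers);
}
```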
Circular dependency detection: POST /v1/analysis/circular-dependencies
What it is. All cycles in the dependency graph, found with Tarjan’s algorithm. Each cycle comes back as an ordered list of files, so you can see exactly which edges to cut to break it.
Why it’s an endpoint, not a recipe. You could run Tarjan’s yourself on the dependency graph. You probably shouldn’t; it’s a five-line function and we already wrote it. More importantly, circular dependencies are the kind of thing that sneaks in while no one is looking, so this is a CI-check endpoint, not an “I wonder if we have any” endpoint.
What to do with it. Fail a build when a new cycle appears. Gate merges on cycle count not increasing. When you do find cycles, treat the output as a list of refactoring targets ranked by how much of the codebase they tangle together.
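The CI gate itself is a few lines. A TypeScript sketch, assuming each cycle comes back as the ordered file list described above:

```ts
// Illustrative cycle shape.
type Cycle = { files: string[] };

// Gate merges on cycle count not increasing.
function checkCycles(current: Cycle[], baselineCount: number): void {
  if (current.length > baselineCount) {
    console.error(`Circular dependencies increased: ${current.length} > ${baselineCount}`);
    for (const c of current) console.error("  " + c.files.join(" -> "));
    process.exit(1);
  }
}
```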
Impact analysis: POST /v1/analysis/impact
What it is. Blast radius. Given a file or function, the transitive set of callers: everything that could break if you change it. Computed from the reverse call graph, with a depth cap and a grouping by domain so the answer is legible.
Why it’s an endpoint, not a recipe. This is the question an agent should be asking before every non-trivial edit, and it’s the question a reviewer should be asking before every approval. “This change touches 3 files” is meaningless. “This change touches 3 files and 127 callers across 4 domains” is the number you actually needed.
What to do with it. Attach it to your PR bot. Show the blast radius as a comment on every PR. Let your agent call it before it proposes a change so it knows whether it’s editing a leaf function or the thing under half the codebase. When an agent confidently refactors a function with 127 callers because it looked at 3 files, this is the endpoint that would have stopped it.
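A sketch of that guard in TypeScript. The result fields (transitive caller count, domains touched) follow the description above; the names are assumptions.

```ts
// Illustrative impact result shape.
type ImpactResult = { target: string; transitiveCallers: number; domains: string[] };

// Leaf functions are safe to edit autonomously; hubs get routed to a human.
function needsHumanReview(impact: ImpactResult, callerLimit = 50): boolean {
  return impact.transitiveCallers > callerLimit || impact.domains.length > 1;
}
```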
The point of the split
If you squint, the primitives and the applications do the same thing: they take your code and give you back a structured view of it. The difference is where the judgment happens.
On the primitive side, we make no decisions for you. We give you the graph as it actually exists in the code. What you do with it is your problem, and that’s the feature. If you disagree with our definition of “dead,” our definition of “blast radius,” our definition of “domain,” you can build your own version out of the graph and skip us entirely on that layer.
On the application side, we make the obvious decisions so you don’t have to. If you want dead code candidates, you want them with framework entry points handled, barrel re-exports handled, generated directories filtered. You don’t want to re-litigate those choices every time. The application endpoints are the version with the defaults that mostly work.
Both layers are real and both layers are supported. We’d rather ship a good application endpoint and a good primitive for the cases the application gets wrong than ship one without the other.
Against next.js
We pointed all nine endpoints at the packages/ tree of vercel/next.js (commit a0376cf, 2,308 files, 16MB zipped). Same zip, same API key, one call each.
| Endpoint | Result |
|---|---|
| POST /v1/graphs/parse | 19,445 nodes / 25,264 edges. 6,927 functions, 2,230 classes, 1,686 types, 361 external packages. |
| POST /v1/graphs/dependency | 7,976 nodes / 4,422 imports. 4,928 LocalDependency : 361 ExternalDependency (≈14:1). |
| POST /v1/graphs/call | 3,668 functions / 5,943 resolved calls. |
| POST /v1/graphs/domain | 5 domains, 11 subdomains. 1,571 files, 6,999 functions, 2,170 classes assigned. |
| POST /v1/graphs/supermodel | 19,463 nodes / 41,791 edges. All of the above, cross-referenced in one artifact. |
| POST /v1/analysis/dead-code | 1,876 candidates across 11,248 declarations (~80s). |
| POST /v1/analysis/test-coverage-map | 12.8% test-reachable coverage. 877 tested functions, 5,973 untested. |
| POST /v1/analysis/circular-dependencies | 9 cycles, 321 files involved, 4 high-severity. |
| POST /v1/analysis/impact (targeted) | Top dependents for packages/next/src/server/next.ts: 71. Repo-wide top is taskfile.js at 145. |
One note on impact: calling it without targets or a diff asks for a global coupling map, and on a repo the size of next.js the response blows past the payload limit. That’s the correct behavior. The useful question on a large repo isn’t “give me the entire blast radius of every file,” it’s “what breaks if I change this?” Scope the call with a diff or a target list and it comes back in about a minute and a half.
Try it
Every endpoint takes the same input: a zipped repository.
```bash
cd /path/to/repo
git archive -o /tmp/repo.zip HEAD

curl -X POST "https://api.supermodeltools.com/v1/graphs/supermodel" \
  -H "X-Api-Key: $SUPERMODEL_API_KEY" \
  -H "Idempotency-Key: $(git rev-parse --short HEAD)" \
  -F "file=@/tmp/repo.zip"
```
Swap /v1/graphs/supermodel for any of the eight other paths above and the call is identical.
Full reference lives at docs.supermodeltools.com. The CLI wraps all of this for the live-update workflow:
```bash
npm install -g @supermodeltools/cli
supermodel watch
```
We maintain the graphs. You build the tools.