Topos

Code quality evaluation

Structural code quality your agents can measure — and optimize toward.

Stop paying agents to rediscover and repair their own structural mess. Topos gives coding agents concrete quality targets before complexity, coupling, and risky data paths compound into expensive context archaeology. Pick a preference ranking and Topos measures program structure — not just syntax — so agents can optimize toward the code shape you want on every pass.


Correctness is expected. Quality is the new currency.

Passing unit tests proves only that your code satisfies a finite set of requirements. Agents have proven exceptional at this and will keep improving. We believe the new currency is the quality of those solutions. Topos provides the structural evaluations that help coding agents find higher-quality solutions.

Installation

Get started with the CLI, MCP server, or build from source.


For Agents

How AI coding agents use Topos to iteratively optimize code and hit quality targets.


CLI Reference

Detailed overview of the Topos command-line interface and available tools.


Preferences

Tell agents how to trade off SIMPLE, COMPOSABLE, and SECURE when GOLD stalls.


Measures

A breakdown of the structural and coupling metrics used to evaluate code quality.


Hint

The scary maths is optional. Topos is grounded in some very abstract fields (category theory and topos theory). Don’t be alarmed! You don’t need to understand (or appreciate) the maths to evaluate code quality with Topos. We find the formalism elegant, but we know it isn’t everyone’s cup of tea. If you’re curious about what we’re building under the hood, check out Concepts.

Beyond Correctness

Assume you passed the tests. How good is your solution?

Current code evaluations focus heavily on correctness — does the code pass the unit tests we created? But passing tests doesn’t guarantee that you’ve written good, secure, or maintainable code.

Topos fills this gap by measuring structural quality, ensuring that your code isn’t just correct, but built to last. It provides principled evaluations of a program’s structure that agents can use to find better solutions.

The Medal Podium

Topos measures each file along three independent quality pillars. Each pillar is pass or fail on its own:

SIMPLE — The code avoids unnecessary complexity.
COMPOSABLE — The module is cleanly decoupled from other modules.
SECURE — The code is free of operations that are known to expose security vulnerabilities.

Run topos evaluate or topos inspect on a file; Topos checks all three pillars and awards a Code Quality Medal based on how many pass. Which pillars pass matters for diagnosis; the medal tier depends only on the count:

Pillars passed	Medal	Example (any combination with this count)
3 of 3	`🥇 GOLD`	SIMPLE + COMPOSABLE + SECURE
2 of 3	`🥈 SILVER`	e.g. SIMPLE + SECURE, or COMPOSABLE + SECURE
1 of 3	`🥉 BRONZE`	e.g. SIMPLE only, or SECURE only
0 of 3	`❌ NONE`	Fails every pillar (or the file could not be parsed)
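The table above reduces to a count of passed pillars. A minimal Python sketch of that mapping follows; the function name `medal` and the plain-text tier names are illustrative assumptions, not part of the Topos API.

```python
# Medal tier keyed by the number of pillars passed (from the table above).
MEDALS = {3: "GOLD", 2: "SILVER", 1: "BRONZE", 0: "NONE"}


def medal(simple: bool, composable: bool, secure: bool) -> str:
    """Return the medal tier for one file from its three pass/fail results.

    Hypothetical helper for illustration only. Which pillars passed is
    ignored here on purpose: the tier depends only on the count.
    """
    passed = sum([simple, composable, secure])
    return MEDALS[passed]


print(medal(simple=True, composable=False, secure=True))  # SILVER
```

Note that a file that cannot be parsed also lands in NONE; this sketch does not model that case separately.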

Manager Priorities & Agent Iteration

In a perfect world, every file would earn a 🥇 GOLD medal. In reality, managers and developers have a finite budget of time and tokens.

Topos lets you set Preferences: an ordering of the three pillars based on your immediate priorities. Coding agents aim for 🥇 GOLD first. If 🥇 GOLD isn’t feasible within the budget, the preference ranking tells the agent exactly how to relax its goals, so it still delivers the highest medal consistent with your priorities.
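One way to picture how a ranking could guide relaxation is sketched below in Python. The ordering rule is an assumption made for illustration (more pillars first, then prefer subsets that keep higher-ranked pillars); the rule Topos actually applies is described on the Preferences page, and `fallback_targets` is a hypothetical name.

```python
from itertools import combinations

PILLARS = ("simple", "composable", "secure")


def fallback_targets(preferences):
    """List every pillar subset in the order an agent might aim for it.

    Assumed rule (not confirmed by Topos): larger subsets come first, and
    ties are broken in favour of subsets that keep higher-ranked pillars.
    """
    if sorted(preferences) != sorted(PILLARS):
        raise ValueError("preferences must rank each pillar exactly once")
    rank = {pillar: i for i, pillar in enumerate(preferences)}
    subsets = [
        frozenset(combo)
        for size in range(len(preferences) + 1)
        for combo in combinations(preferences, size)
    ]
    # Sort by descending size, then by the sorted ranks of the kept pillars.
    return sorted(subsets, key=lambda s: (-len(s), sorted(rank[p] for p in s)))


for target in fallback_targets(["secure", "simple", "composable"]):
    print(sorted(target))
```

With `secure` ranked first, the first fallback below GOLD in this sketch is the SILVER state that keeps `secure` and `simple`.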

Quick look

Pick a preference ranking, then let your agent evaluate and iterate on its own output.

topos evaluate src/ -r --preferences simple,composable,secure
topos inspect module.py --preferences simple,composable,secure
topos coverage src/logic.py --tests tests/test_logic.py
topos compare before.py after.py

Each file gets a separate verdict for each pillar (quality generator), so you always see which one is failing rather than a single blended number.
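If you drive the CLI from a script, the first command above can be assembled as follows. This is a minimal Python sketch that uses only the subcommand and flags shown in this section; the helper names are hypothetical, and the output is returned unparsed because its format is not described here.

```python
import shutil
import subprocess


def evaluate_command(path, preferences, recursive=False):
    """Build the argv list for `topos evaluate` using the flags shown above."""
    cmd = ["topos", "evaluate", path]
    if recursive:
        cmd.append("-r")
    cmd += ["--preferences", ",".join(preferences)]
    return cmd


def run(cmd):
    """Run the command if the executable is on PATH; otherwise return None."""
    if shutil.which(cmd[0]) is None:
        return None
    # The result is returned as-is; no output format is assumed.
    return subprocess.run(cmd, capture_output=True, text=True)


command = evaluate_command("src/", ["simple", "composable", "secure"], recursive=True)
print(" ".join(command))
```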

How it works

Topos measures code along the three independent quality generators and maps them to an 8-element evaluation lattice:

SIMPLE: Built from the abstract syntax tree (AST) and control-flow graph (CFG). We calculate the cyclomatic complexity of the CFG and the entropy of the AST to assess complexity.
COMPOSABLE: Built from the module dependency graph (MDG), using GitNexus, to capture inter-module dependencies. This differs from the usual program dependence graph (PDG), which captures dependencies within a function. We calculate Martin instability and fanning metrics on the MDG to assess coupling.
SECURE: Built from the code property graph (CPG). We calculate dangerous-API reachability and taint paths on the CPG to assess security.
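To make the SIMPLE and COMPOSABLE measures concrete, here is a rough Python sketch of three textbook calculations. It is not the Topos implementation: cyclomatic complexity is approximated by counting decision points in the AST rather than from a real CFG, "entropy of the AST" is assumed to mean Shannon entropy over node types, and instability uses Martin's standard definition, fan-out / (fan-in + fan-out), over a hand-written dependency map. All function names are hypothetical.

```python
import ast
import math
from collections import Counter

# Node types treated as decision points (an assumption for this sketch).
_BRANCHES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as (decision points + 1)."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCHES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches.
            decisions += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # One loop plus one branch per `if` filter.
            decisions += 1 + len(node.ifs)
    return decisions + 1


def node_type_entropy(source: str) -> float:
    """Shannon entropy (in bits) of the AST node-type distribution."""
    counts = Counter(type(n).__name__ for n in ast.walk(ast.parse(source)))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def instability(deps: dict, module: str) -> float:
    """Martin instability: fan-out / (fan-in + fan-out), or 0.0 if isolated."""
    fan_out = len(deps.get(module, set()))
    fan_in = sum(1 for m, targets in deps.items() if m != module and module in targets)
    total = fan_in + fan_out
    return fan_out / total if total else 0.0


SAMPLE = '''
def f(x):
    if x > 0 and x < 10:
        return 1
    for i in range(x):
        pass
    return 0
'''

# Hypothetical module dependency map: module -> modules it depends on.
DEPS = {"app": {"core", "util"}, "core": {"util"}, "util": set()}

print(cyclomatic_complexity(SAMPLE))  # 4
print(instability(DEPS, "core"))  # 0.5
```

In the sample, `util` has instability 0.0 (everything depends on it, it depends on nothing) and `app` has 1.0, which is the usual reading of the metric: low values mark stable modules that are costly to change.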

Figure: AST, CFG, PDG, and MDG graph lenses glued over a shared source-coordinate base. Each lens reads the same source coordinates; Topos amalgamates them into one code property graph, then measures structure over the unified space.

Figure: the Topos quality lattice, with SLOP at the bottom, three single-pillar BRONZE states, three two-pillar SILVER states, and IDEAL (GOLD) at the top. This is the eight-element evaluation lattice. Climbing the order means satisfying more independent quality generators; GOLD is the top element, the join of all three.

Hint

Three Independent Pillars: SIMPLE, COMPOSABLE, and SECURE are pairwise incomparable. A file can satisfy any subset of {S, C, Sc}, and 🥇 GOLD is the case where all three hold at once. The Preferences ranking determines the order in which an agent traverses the lattice as it attempts to earn the highest possible medal.
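The eight elements can be modelled as the subsets of {S, C, Sc} ordered by inclusion. The Python sketch below uses that model, which is an assumption made for illustration rather than the Topos internals, to check the two claims in the hint: the three generators are pairwise incomparable, and the top element combines all three.

```python
from itertools import combinations

GENERATORS = ("S", "C", "Sc")


def lattice_elements():
    """All subsets of the generators: 2**3 = 8 lattice elements."""
    return [
        frozenset(combo)
        for size in range(len(GENERATORS) + 1)
        for combo in combinations(GENERATORS, size)
    ]


def comparable(a, b):
    """Two elements are comparable when one is contained in the other."""
    return a <= b or b <= a


def join(a, b):
    """Least upper bound under inclusion: set union."""
    return a | b


def meet(a, b):
    """Greatest lower bound under inclusion: set intersection."""
    return a & b


elements = lattice_elements()
singles = [frozenset({g}) for g in GENERATORS]

print(len(elements))  # 8
print(any(comparable(a, b) for a, b in combinations(singles, 2)))  # False
```

In this set-inclusion model the top element is the union (join) of the three singletons. Read as predicates on files, the same element is the conjunction "S and C and Sc", which is why it can also be described as requiring all three at once.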