topos.graphs.uast.models

UAST Models

Data structures for the Universal Abstract Syntax Tree. These models define the “Normalized” layer of our “Native-first, Normalized-second” architecture.

class topos.graphs.uast.models.SourceSpan(file: str | None, start_byte: int, end_byte: int, start_line: int, start_column: int, end_line: int, end_column: int)

Bases: object

file
start_byte
end_byte
start_line
start_column
end_line
end_column
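
The fields above pair a byte range with line/column positions. As a rough sketch of how the two relate, the snippet below uses a stand-in dataclass that mirrors the documented fields (it is not the real class) and a hypothetical helper, span_of, that derives positions from byte offsets. It assumes zero-based lines and byte-based columns, as in Tree-sitter's point convention; whether this module follows that convention is not stated here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSpan:
    """Stand-in mirroring the documented fields; not the real class."""

    file: str | None
    start_byte: int
    end_byte: int
    start_line: int
    start_column: int
    end_line: int
    end_column: int


def span_of(source: bytes, start_byte: int, end_byte: int,
            file: str | None = None) -> SourceSpan:
    """Hypothetical helper: derive line/column positions from byte offsets.

    Assumes zero-based lines and columns counted in bytes.
    """

    def point(offset: int) -> tuple[int, int]:
        prefix = source[:offset]
        line = prefix.count(b"\n")
        # rfind returns -1 when there is no newline, so column == offset.
        column = offset - (prefix.rfind(b"\n") + 1)
        return line, column

    start_line, start_column = point(start_byte)
    end_line, end_column = point(end_byte)
    return SourceSpan(file, start_byte, end_byte,
                      start_line, start_column, end_line, end_column)


source = b"x = 1\ndef f():\n    pass\n"
start = source.index(b"def")
span = span_of(source, start, start + len(b"def f():"), file="example.py")
print(span)
```

Slicing the source with source[span.start_byte:span.end_byte] recovers the exact text the span covers, which is why the byte offsets are kept alongside the line/column pairs.
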

class topos.graphs.uast.models.NativeRef(parser: str, parser_version: str, node_kind: str)

Bases: object

parser
parser_version
node_kind

class topos.graphs.uast.models.UASTNode(kind, lang, span, native, attributes=<factory>, children=<factory>, id='')

Bases: object

Language-normalized node carrying provenance and source spans.

The UASTNode acts as a normalization layer over language-specific Concrete Syntax Trees (CSTs) from Tree-sitter. It maps disparate native nodes into unified kind values that follow the industry-standard reference in docs/uast-industry-standards.md.

Although its kind is normalized, each node retains its native provenance and span data, so it stays faithful to what compiler-native ASTs (e.g., Python ast, ESTree, Rust syn, Clang) expect.

id is a deterministic 16-hex-char identifier required by the UNodeBase schema for referential integrity (diffs, refactor links, cross-tool references). It is a blake2b hash of (lang, native.node_kind, span.start_byte, span.end_byte, parent_id). Because each hash includes the parent’s id, the id encodes the full path from the root, which disambiguates identical-span sibling nodes without needing an explicit sibling index. The mapper walker is responsible for populating id; if a node is constructed directly (e.g. in tests) and no id is supplied, it defaults to the empty string.
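
The id scheme described above can be sketched as follows. This is an illustration under assumptions, not the mapper's actual code: the function name node_id, the NUL-separated field encoding, and the use of an empty parent id for the root are all hypothetical. The only details taken from the description are the hashed fields, the use of blake2b, and the 16-hex-char length (an 8-byte digest).

```python
import hashlib


def node_id(lang: str, node_kind: str, start_byte: int, end_byte: int,
            parent_id: str) -> str:
    """Hypothetical sketch of the parent-chained id described above."""
    # Assumption: fields are joined with NUL separators; the real encoding
    # used by the mapper walker may differ.
    payload = "\x00".join(
        [lang, node_kind, str(start_byte), str(end_byte), parent_id]
    ).encode("utf-8")
    # An 8-byte digest renders as 16 hexadecimal characters.
    return hashlib.blake2b(payload, digest_size=8).hexdigest()


# Illustrative node kinds and offsets for the source text "x = 1".
root = node_id("python", "module", 0, 5, "")
statement = node_id("python", "expression_statement", 0, 5, root)
assignment = node_id("python", "assignment", 0, 5, statement)

# The same (lang, kind, span) under a different parent yields another id.
elsewhere = node_id("python", "assignment", 0, 5, root)

print(root, statement, assignment, elsewhere)
```

Since the parent's id is part of the input, two nodes with the same language, native kind, and byte range still receive different ids when their ancestry differs, and rerunning the walk over unchanged source reproduces the same ids.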

kind
lang
span
native
attributes
children
id = ''
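
As an illustration of the defaults listed above, the sketch below builds a small tree from stand-in dataclasses that mirror the documented fields. The stand-ins are not the real classes. The container types (dict for attributes, list for children), the kind values "Module" and "Assignment", and the parser version string are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SourceSpan:
    """Stand-in mirroring the documented SourceSpan fields."""

    file: str | None
    start_byte: int
    end_byte: int
    start_line: int
    start_column: int
    end_line: int
    end_column: int


@dataclass
class NativeRef:
    """Stand-in mirroring the documented NativeRef fields."""

    parser: str
    parser_version: str
    node_kind: str


@dataclass
class UASTNode:
    """Stand-in; attribute and child container types are assumed."""

    kind: str
    lang: str
    span: SourceSpan
    native: NativeRef
    attributes: dict[str, Any] = field(default_factory=dict)
    children: list[UASTNode] = field(default_factory=list)
    id: str = ""


span = SourceSpan("example.py", 0, 5, 0, 0, 0, 5)

# Placeholder kind values and version string, for illustration only.
leaf = UASTNode("Assignment", "python", span,
                NativeRef("tree-sitter", "0.0.0", "assignment"))
module = UASTNode("Module", "python", span,
                  NativeRef("tree-sitter", "0.0.0", "module"),
                  children=[leaf])

print(repr(leaf.id))         # empty until a walker assigns an id
print(len(module.children))  # one child
print(leaf.children)         # each node gets its own fresh list
```

The factory defaults mean every node receives its own attributes and children containers, so mutating one directly constructed node does not leak into another.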