Quickstart

Get from zero to insights in under 10 minutes.

uv pip install pulsar

Prerequisites

  • Python 3.10+

  • For development: Rust toolchain (needed to build the Rust core with maturin)

Option 1: Use a Pre-Built Demo (Fastest)

Run the bundled penguins demo from a checkout of the Pulsar repository:

# Run the penguins demo (no data download needed)
cd /path/to/pulsar
uv sync
uv run maturin develop --release
uv run python -c "
from pulsar.pipeline import ThemaRS
config = {'run': {'name': 'penguins', 'data': 'demos/penguins/penguins.csv'}}
model = ThemaRS.from_dict(config)
model.fit()
print(f'Cosmic graph: {len(model.cosmic_graph.nodes())} nodes, {len(model.cosmic_graph.edges())} edges')
"

Done! You’ve discovered penguin species structure without looking at species labels.

For all demos: Demos

Option 2: Use with Claude AI (No Code)

Let Claude handle the analysis:

  1. Set up Pulsar MCP server (see MCP Server)

  2. Open Claude Desktop

  3. Paste: “Analyze the file at demos/penguins/penguins.csv using Pulsar. Find the hidden structure.”

Claude will orchestrate parameter tuning and generate a statistical dossier.

Option 4: Programmatic Configuration (Full Control)

For maximum control, configure directly in Python:

from pulsar import ThemaRS

model = ThemaRS(
    data="data.csv",
    pca_dims=[2, 5, 10],
    epsilon_range=(0.1, 0.5, 5),
    random_state=42,
)
model.fit()
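
To get a feel for how large a sweep these parameters describe, the sketch below enumerates the grid in plain Python. It assumes that epsilon_range is read as (min, max, number of steps) and that every PCA dimension is paired with every epsilon value; neither is confirmed here, and sweep_grid and linspace are hypothetical helpers, not part of Pulsar.

```python
from itertools import product


def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, inclusive."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    # Round to hide floating-point noise such as 0.30000000000000004.
    return [round(start + i * step, 10) for i in range(num)]


def sweep_grid(pca_dims, epsilon_range):
    """List every (pca_dim, epsilon) pair.

    Assumption: epsilon_range is (min, max, number of steps).
    """
    eps_min, eps_max, n_steps = epsilon_range
    return list(product(pca_dims, linspace(eps_min, eps_max, n_steps)))


grid = sweep_grid([2, 5, 10], (0.1, 0.5, 5))
print(f"{len(grid)} configurations")
for dim, eps in grid[:3]:
    print(f"pca_dim={dim}, epsilon={eps}")
```

Under these assumptions, the example above would build 3 x 5 = 15 configurations, so trimming either list shrinks the sweep proportionally.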

Understanding the Pipeline

Pulsar executes these stages:

  1. Impute: Fill missing values in specified columns

  2. Scale: StandardScaler normalization

  3. PCA sweep: Project data to multiple dimensions

  4. Ball Mapper sweep: Build neighborhood graphs at multiple epsilon values

  5. Pseudo-Laplacians: Compute graph Laplacians for each configuration

  6. Cosmic graph: Aggregate into a weighted similarity graph

  7. Selection: Choose representative configurations via graph distances

# Access intermediate results
print(f"PCA configurations: {len(model.pca_results_)}")
print(f"Ball Mapper graphs: {len(model.ball_mapper_graphs_)}")
print(f"Weighted adjacency shape: {model.weighted_adjacency_.shape}")
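
To make stages 4 and 6 more concrete, here is a minimal, self-contained sketch of the general idea, not Pulsar's implementation. It builds a greedy Ball Mapper cover at several epsilon values and then, as a simplifying assumption, weights each pair of points by the fraction of epsilon values at which they share a ball. Pulsar's own aggregation goes through pseudo-Laplacians and may differ; ball_mapper and co_membership are hypothetical names used only for illustration.

```python
import math


def ball_mapper(points, epsilon):
    """Greedy Ball Mapper: return (balls, edges).

    balls[i] is the set of point indices within epsilon of landmark i;
    an edge joins two balls that share at least one point.
    """
    landmarks = []
    for i, p in enumerate(points):
        # A point becomes a landmark only if no existing landmark covers it.
        if all(math.dist(p, points[l]) > epsilon for l in landmarks):
            landmarks.append(i)
    balls = [
        {j for j, q in enumerate(points) if math.dist(points[l], q) <= epsilon}
        for l in landmarks
    ]
    edges = {
        (a, b)
        for a in range(len(balls))
        for b in range(a + 1, len(balls))
        if balls[a] & balls[b]
    }
    return balls, edges


def co_membership(points, epsilons):
    """Assumed aggregation: fraction of epsilon values at which two points share a ball."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for eps in epsilons:
        balls, _ = ball_mapper(points, eps)
        # Collect each co-occurring pair once per epsilon, even if several balls contain it.
        linked = {(i, j) for ball in balls for i in ball for j in ball if i < j}
        for i, j in linked:
            weights[i][j] += 1 / len(epsilons)
            weights[j][i] += 1 / len(epsilons)
    return weights


# Two well-separated clusters on toy data.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
balls, edges = ball_mapper(points, epsilon=0.15)
weights = co_membership(points, [0.15, 0.5])
print(balls, edges)
print(weights[0][1], weights[0][2], weights[0][3])
```

Points in the same cluster end up with high weights and points in different clusters with zero, which is the kind of signal a weighted similarity graph is meant to capture.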

Performance Tips

Pulsar’s Rust core handles the heavy computation, but runtime still grows with the size of the sweep. For large datasets, start with a coarser sweep:

# Reduce sweep resolution for faster iteration
model = ThemaRS(
    data="large_data.csv",
    pca_dims=[5],           # Single dimension
    epsilon_range=(0.2, 0.4, 3),  # Fewer epsilon steps
)

Next Steps