— project argus — agent factory visualizer

Goal in. Governed agent out.

Four services, one pipeline. Hephaestus forges the agent from a goal spec. Nyx is the blueprint every agent inherits — Strands on AWS AgentCore, runnable locally. Proteus tunes the agent's objective against a dataset, co-creating an adversarial LLM-as-judge when the target cannot be scored with a metric. Panoptes watches everything — one view per run, one view across all experiments.

live factory floor · hephaestus → nyx → proteus → panoptes

— 01 · hephaestus

The Forge

factory orchestrator

Turns a GoalSpec into a governed agent. 14 phases with evaluator gates (pass / retry / fail / escalate). Loads the Nyx blueprint, picks a pattern, generates Strands code, hands a genome to Proteus, then packages the winning config for deployment. Drafts the adversarial LLM-as-judge when objectives are fuzzy.
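The gate mechanics can be sketched in a few lines. This is a minimal illustration, not Hephaestus's actual code: the `Phase`, `Verdict`, and `run_pipeline` names, the retry budget, and the two demo phases are all assumptions. Only the four verdicts come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class Verdict(Enum):
    PASS = "pass"
    RETRY = "retry"
    FAIL = "fail"
    ESCALATE = "escalate"


@dataclass
class Phase:
    name: str
    run: Callable[[dict], dict]      # produces an updated state
    gate: Callable[[dict], Verdict]  # evaluator gate over that state


def run_pipeline(phases: List[Phase], state: dict, max_retries: int = 2) -> Tuple[str, dict]:
    """Run phases in order; stop at the first gate that fails or escalates."""
    for phase in phases:
        attempts = 0
        while True:
            state = phase.run(state)
            verdict = phase.gate(state)
            if verdict is Verdict.PASS:
                break
            if verdict is Verdict.RETRY and attempts < max_retries:
                attempts += 1
                continue
            # FAIL, ESCALATE, or retry budget exhausted.
            outcome = "escalated" if verdict is Verdict.ESCALATE else "failed"
            return f"{outcome} at {phase.name}", state
    return "completed", state


# Hypothetical phases, for illustration only.
def load_blueprint(state: dict) -> dict:
    return {**state, "blueprint": "nyx"}


def pick_pattern(state: dict) -> dict:
    # Succeeds only on the second attempt, to exercise the RETRY path.
    tries = state.get("tries", 0) + 1
    return {**state, "tries": tries, "pattern": "single-agent" if tries >= 2 else None}


phases = [
    Phase("load_blueprint", load_blueprint, lambda s: Verdict.PASS),
    Phase("pick_pattern", pick_pattern,
          lambda s: Verdict.PASS if s["pattern"] else Verdict.RETRY),
]
status, final_state = run_pipeline(phases, {"goal": "triage support tickets"})
```

Each gate sees only the state its phase produced, so a retry reruns one phase rather than the whole pipeline.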

— 02 · nyx

The Blueprint

template + runtime envelope

Every agent inherits Nyx: Strands tools, risk tiers, circuit breakers, gated actions, morning briefings. Each agent targets AWS AgentCore in prod and also runs locally. Local execution is non-negotiable, because Proteus must optimize agents in dev loops.
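Two of the inherited controls, circuit breakers and gated actions, can be sketched generically. This is an illustration under stated assumptions, not the Nyx API: `CircuitBreaker`, `RiskTier`, `gated`, the tier names, and the failure threshold are all hypothetical.

```python
from enum import IntEnum


class RiskTier(IntEnum):
    # Hypothetical tier names; the real tiers are not specified here.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def gated(tier: RiskTier, approved: bool, threshold: RiskTier = RiskTier.HIGH) -> bool:
    """Return True when an action may proceed: below the threshold, or explicitly approved."""
    return tier < threshold or approved


class CircuitOpen(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; a success resets the count."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, tool, *args, **kwargs):
        if self.is_open:
            raise CircuitOpen("too many consecutive failures")
        try:
            result = tool(*args, **kwargs)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0
        return result


def flaky_tool():
    raise TimeoutError("upstream unavailable")


breaker = CircuitBreaker(max_failures=2)
for _ in range(2):
    try:
        breaker.call(flaky_tool)
    except TimeoutError:
        pass

# The third call is rejected without invoking the tool at all.
rejected = False
try:
    breaker.call(flaky_tool)
except CircuitOpen:
    rejected = True
```

Because both controls are plain Python with no cloud dependency, the same checks could run unchanged in a local dev loop.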

— 03 · proteus

The Tuner

genome → winning config

Searches a genome (models, prompts, retrieval knobs) against a versioned dataset. Uses Bayesian optimization and NSGA-II for multi-objective Pareto trade-offs. When no metric can score the objective, Proteus co-trains the LLM-as-judge and runs an adversarial loop until the judge is stable and the candidate converges.
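A Pareto trade-off means no single winner: a config stays in play as long as no other config beats it on every objective. The sketch below shows only that non-dominated filter, which is the building block NSGA-II ranks populations with; it is not NSGA-II itself and not Proteus's code. The config names and the (error rate, cost) numbers are made up for illustration, and both objectives are minimized.

```python
from typing import Dict, List, Tuple

Score = Tuple[float, ...]


def dominates(a: Score, b: Score) -> bool:
    """a dominates b if it is no worse on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Score]) -> List[str]:
    """Return the names of all candidates that no other candidate dominates."""
    return sorted(
        name
        for name, score in candidates.items()
        if not any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )


# Illustrative (error_rate, relative_cost) pairs; not real measurements.
candidates = {
    "small-model/short-prompt": (0.30, 1.0),
    "small-model/long-prompt": (0.22, 1.6),
    "mid-model/long-prompt": (0.25, 2.0),
    "large-model/short-prompt": (0.15, 4.0),
    "large-model/long-prompt": (0.15, 5.5),
}

front = pareto_front(candidates)
```

Here `mid-model/long-prompt` drops out because `small-model/long-prompt` is both cheaper and more accurate, and `large-model/long-prompt` drops out because the short-prompt variant matches its error rate at lower cost. The three survivors are genuine trade-offs.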

— 04 · panoptes

The Hundred Eyes

otel · inventory · compliance

Two views. Single run: an OTel trace of one execution — spans for retrieval, LLM calls, tool invocations, gates, judge verdicts. Inventory: every experiment ever — filter by source (hph / prt / nyx), status, tags, parent. Exports a self-contained audit package for regulators.
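The inventory view amounts to filtering experiment records on the four fields listed. A minimal sketch, assuming a flat in-memory record shape: the `Experiment` fields mirror the filters named above, while the class itself, `filter_inventory`, the status values, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Experiment:
    id: str
    source: str                       # "hph", "prt", or "nyx"
    status: str                       # hypothetical values: "passed", "failed"
    tags: FrozenSet[str] = frozenset()
    parent: Optional[str] = None      # id of the experiment this one derives from


def filter_inventory(
    experiments: Iterable[Experiment],
    source: Optional[str] = None,
    status: Optional[str] = None,
    tag: Optional[str] = None,
    parent: Optional[str] = None,
) -> List[Experiment]:
    """Keep experiments matching every filter that is set; None means 'any'."""
    def keep(e: Experiment) -> bool:
        return (
            (source is None or e.source == source)
            and (status is None or e.status == status)
            and (tag is None or tag in e.tags)
            and (parent is None or e.parent == parent)
        )
    return [e for e in experiments if keep(e)]


# Made-up records, for illustration only.
inventory = [
    Experiment("exp-1", "hph", "passed", frozenset({"baseline"})),
    Experiment("exp-2", "prt", "passed", frozenset({"tuned"}), parent="exp-1"),
    Experiment("exp-3", "prt", "failed", frozenset({"tuned"}), parent="exp-1"),
    Experiment("exp-4", "nyx", "passed"),
]

tuned_ok = filter_inventory(inventory, source="prt", status="passed")
children = filter_inventory(inventory, parent="exp-1")
```

The `parent` field is what lets a tuned config be traced back to the run that produced its genome.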