The Stack

How it was built.
What it learned.
Where it's going.

Interface: 46-tool MCP Server
Routing: Helm (4 brains)
Memory: Wiki + Episodic + Abilities
Perception: Sensorium + Waking Layer
Deliberation: Council
Execution: Orchestrator + HEIMDALL
Reflection: Unconscious

System Architecture
Interface
Claude Code + MCP Server
46 tools + telegram ops bot + web dashboard

Orchestrates everything. Claude Code as primary interface, 46-tool MCP server across 9 domains (orchestrator, memory, vault, sensorium, calendar, council, groq, helm, oracle). Telegram bot for mobile ops. Web dashboard for capture and monitoring.

Routing
Helm Multi-Brain Router
groq (llama 3.3 70b) + ollama (local) + gemini (1M ctx) + antigravity (IDE)

Smart routing by task type, context size, and priority. Code tasks to Groq, long-context to Gemini, local inference to Ollama. Consensus mode queries all brains and compares answers. Trust hierarchy with auto-approval for low-risk operations.
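The routing rules above can be sketched roughly as follows. This is a minimal illustration, not Helm's actual code: the `Task` fields, the `LONG_CONTEXT_CHARS` threshold, the rule that low-priority work stays local, and the majority-vote comparison in `consensus` are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical cutoff; the source only says long-context work goes to Gemini.
LONG_CONTEXT_CHARS = 100_000


@dataclass
class Task:
    kind: str                  # e.g. "code", "chat", "summarize"
    context_chars: int         # size of the context the task carries
    priority: str = "normal"   # "low", "normal", or "high"


def route(task: Task) -> str:
    """Pick a brain by context size, then priority, then task type."""
    if task.context_chars > LONG_CONTEXT_CHARS:
        return "gemini"        # long-context work
    if task.priority == "low":
        return "ollama"        # assumption: low-priority work stays local
    if task.kind == "code":
        return "groq"          # code tasks
    return "ollama"            # default to local inference


def consensus(prompt: str, brains: Dict[str, Callable[[str], str]]) -> str:
    """Ask every brain and return the most common answer (assumed comparison)."""
    answers = [ask(prompt) for ask in brains.values()]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

Checking context size first reflects that a task too large for a local model cannot be routed anywhere else, whatever its type or priority.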

Memory
Memory System + Wiki + Abilities
episodic + semantic + procedural + identity + wiki (52 pages) + abilities (7 domains)

4-type memory with 55x token compression (160K chars to ~500 tokens via local preprocessing + Groq analysis). Wiki: 52 compiled pages across 11 domains. Abilities: 7 domain lenses with 20+ local 0-token tools. Session-end hooks trigger automatic consolidation.
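What "local preprocessing" does is not spelled out here, so the following is only one plausible shape for it: normalize whitespace, drop blank and repeated lines, and keep the most recent lines that fit a character budget before anything is sent to a remote model. The function name `preprocess` and the `budget_chars` default are hypothetical.

```python
def preprocess(transcript: str, budget_chars: int = 8_000) -> str:
    """Shrink a session transcript locally, spending zero LLM tokens."""
    seen = set()
    unique = []
    for line in transcript.splitlines():
        text = " ".join(line.split())      # collapse runs of whitespace
        if text and text not in seen:      # skip blanks and exact repeats
            seen.add(text)
            unique.append(text)

    # Walk backwards so the most recent lines win when the budget is tight.
    kept = []
    used = 0
    for text in reversed(unique):
        cost = len(text) + 1               # +1 for the newline separator
        if used + cost > budget_chars:
            break
        kept.append(text)
        used += cost
    return "\n".join(reversed(kept))
```

A session-end hook could call a step like this and pass only the result on for analysis, which is where most of the reduction would come from in this sketch.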

Perception
Sensorium + Waking Layer + Vision
qwen 2.5 3b (7-min heartbeat) + sonnet daemon + moondream/ocr

Continuous sensory layer via Ollama, on a 7-minute heartbeat. Sleeps 1am-8am. Waking Layer: ambient Sonnet daemon that runs between Opus sessions, with a Telegram interface, thread continuity, and escalation. Vision: local OCR and moondream (0 tokens each) run before Claude vision.
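A small sketch of the two mechanisms described above: the heartbeat with its sleep window, and the cheapest-first vision cascade. The names `is_asleep`, `next_beat`, and `describe_image` are hypothetical, and treating 8:00 as already awake is an assumption.

```python
from datetime import datetime, time, timedelta
from typing import Callable, List, Optional, Tuple

HEARTBEAT = timedelta(minutes=7)
SLEEP_START = time(1, 0)   # 1am
SLEEP_END = time(8, 0)     # 8am (assumed exclusive: awake at exactly 8:00)


def is_asleep(now: datetime) -> bool:
    """True while the sensory layer is inside its nightly sleep window."""
    return SLEEP_START <= now.time() < SLEEP_END


def next_beat(now: datetime) -> datetime:
    """Next heartbeat time, deferred to wake-up if it would land in the window."""
    candidate = now + HEARTBEAT
    if is_asleep(candidate):
        return candidate.replace(hour=SLEEP_END.hour, minute=SLEEP_END.minute,
                                 second=0, microsecond=0)
    return candidate


def describe_image(
    path: str, tiers: List[Tuple[str, Callable[[str], str]]]
) -> Tuple[Optional[str], str]:
    """Try each tier in order and return the first non-empty result."""
    for name, run in tiers:
        result = run(path)
        if result:
            return name, result
    return None, ""
```

With tiers ordered as OCR, then moondream, then Claude vision, a paid call is only made when both local tiers return nothing.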

Deliberation
Council
dynamic agent assembly + debate rounds

An LLM reads the question, picks 3-5 specialist agents from the available pool, runs structured debate rounds, and synthesizes a final decision. Agents are selected by role, with backend diversity. Used for architectural decisions and complex trade-offs.
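The assembly-and-debate flow might look something like this in outline. Everything named here (`Agent`, `pick_agents`, `debate`) is hypothetical. The list of roles stands in for the LLM's selection step, and the final synthesis call is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    role: str                                   # e.g. "security", "performance"
    backend: str                                # e.g. "groq", "ollama", "gemini"
    respond: Callable[[str, List[str]], str]    # (question, transcript) -> reply


def pick_agents(pool: List[Agent], roles: List[str], max_agents: int = 5) -> List[Agent]:
    """One agent per requested role, preferring backends not yet used."""
    chosen: List[Agent] = []
    used_backends = set()
    for role in roles[:max_agents]:
        candidates = [a for a in pool if a.role == role]
        if not candidates:
            continue
        # False sorts before True, so unused backends come first.
        candidates.sort(key=lambda a: a.backend in used_backends)
        chosen.append(candidates[0])
        used_backends.add(candidates[0].backend)
    return chosen


def debate(question: str, agents: List[Agent], rounds: int = 2) -> List[str]:
    """Run fixed rounds; each agent sees the transcript as of the round's start."""
    transcript: List[str] = []
    for _ in range(rounds):
        snapshot = list(transcript)
        for agent in agents:
            transcript.append(f"{agent.role}: {agent.respond(question, snapshot)}")
    return transcript
```

Giving every agent the same snapshot within a round keeps speaking order from favoring whoever goes last. That is a design choice of this sketch and not something the Council is stated to do.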

Execution
Orchestrator + HEIMDALL + Abilities Tools
task queue + PTY exec + pressure monitoring + 0-token local tools

Worker agent loop: model generates bash commands, PTY executes, output feeds back. HEIMDALL monitors CPU/memory pressure across nodes, triggers RAM choreography. Abilities tools execute locally with zero LLM tokens (lint, compile check, test runner, audit, explore).
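The worker loop described above reduces to a generate, execute, feed-back cycle. The sketch below is an approximation: it runs commands through `subprocess` instead of a real PTY, and `worker_loop`, `run_bash`, and the `next_command` callback (standing in for the model) are hypothetical names.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]   # (command, combined output) pairs


def run_bash(command: str, timeout: int = 30) -> str:
    """Run one command in bash and return stdout plus stderr (no PTY here)."""
    proc = subprocess.run(
        ["bash", "-c", command],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout + proc.stderr


def worker_loop(
    goal: str,
    next_command: Callable[[str, History], Optional[str]],
    run: Callable[[str], str] = run_bash,
    max_steps: int = 5,
) -> History:
    """Ask for a command, execute it, feed the output back, and repeat."""
    history: History = []
    for _ in range(max_steps):
        command = next_command(goal, history)   # the model sees prior output
        if command is None:                     # the model signals it is done
            break
        history.append((command, run(command)))
    return history
```

The `max_steps` cap and per-command timeout are guard rails added for the example. A loop that executes model-written shell commands needs some bound of this kind, whatever form it takes in the real orchestrator.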

Reflection
Unconscious
llama 3.3 70b + groq + weekly symbolic processing

Separate architecture = genuine otherness. Weekly: reads all episode summaries, writes free-form observations. "Notice what is avoided. Name what repeats." Produces statements of becoming. Intentionally different voice from primary system.

Infrastructure

3 nodes, 29 LaunchAgents/services, Tailscale mesh, Syncthing replication

M5 Pro 64GB: Primary workstation (100.85.207.121)
MLX server (:10000, Qwen2.5-32B), Open WebUI (:8080), Claude Code sessions, ComfyUI (migration pending)

M1 Mac Mini: Always-on infrastructure (100.92.5.93)
MCP HTTP server (:7892), Web dashboard (:8000), Telegram ops bot, Waking Layer daemon, Sensorium, Ollama (3B models), Vision daemon, Discord bot, ComfyUI, MediaMTX, Venture Lab sports

Pi 5: Coordinator + edge (100.104.183.114)
Health endpoint (:8090), IPC watcher, Venture Lab data scrapers, Syncthing, Docker: n8n, navidrome, portainer

Design Principles

Local computing first: LLM tokens for design and decisions. Local scripts for execution. Python, bash, direct API calls before LLM processing.
Zero-token tools: Abilities system runs lint, compile checks, test suites, audits, and file operations without spending a single LLM token.
Separate architectures: Unconscious layer uses a different model on a different schedule. Genuine otherness by design.
Memory as infrastructure: Session-end hooks trigger consolidation. Episode narratives compress to facts. Facts compile to wiki pages. Pages load on session start.
Ship imperfect: Done and experimental over polished and safe. The system is the practice.