Context
Per the stateless daemon architecture (ADR 004, PR #52, PR #117): agentctl is a stateless orchestration layer. Ground truth lives in adapters. No shadow state.
daemon-env.json is shadow state — a point-in-time snapshot of the shell environment taken at daemon start. It goes stale, it gets corrupted (currently 0 bytes or missing critical keys), and it creates a mystery gap between the user's actual environment and what child processes receive.
The Problem
buildSpawnEnv() loads from daemon-env.json. If that file is stale, empty, or corrupt, child processes (Claude Code, OpenCode, etc.) launch without critical env vars such as ANTHROPIC_API_KEY and OPENAI_API_KEY. This causes silent 502 failures.
Fix: Delete the Snapshot, Derive at Spawn Time
Following the same inversion principle as PR #52/#117:
Before: daemon start → snapshot env → save JSON → load JSON at spawn → pray it's still valid
After: spawn time → source ~/.zshenv + process.env → done
Implementation
- Delete saveEnvironment() from daemon/server.ts: no more env snapshot at daemon start
- Rewrite buildSpawnEnv() to derive env at spawn time:
  - Start with process.env (whatever the daemon/CLI has)
  - Source ~/.zshenv via child_process.execSync('zsh -c "source ~/.zshenv && env"') and parse the output
  - ~/.zshenv is the file zsh reads on every invocation, including non-interactive shells (it is zsh-specific, not a POSIX standard), so it's where API keys belong
  - Apply opts.env overrides (per-launch extras, same as today)
  - ~10ms overhead per launch, negligible
- Delete daemon-env.json: it's shadow state, same as state.json sessions were
- Delete loadSavedEnvironment(): no more loading stale snapshots
- Keep getCommonBinDirs(): PATH augmentation is still useful
Benefits
- API key rotation takes effect immediately (no daemon restart needed)
- No stale snapshot corruption
- No 0-byte file failures
- Daemon is more stateless (fewer files to manage)
- Follows ADR 004 principle: derive from ground truth, don't cache
Note on Future Key Governance
The current Ohm architecture (V4 spec) envisions a Control Plane → Data Plane push model where API keys are part of pushed TenantContexts, not local env vars. The concept of 'server keys taking precedence over client keys' will eventually be replaced by the Control Plane pushing the authoritative key configuration. This ~/.zshenv approach is correct for the current phase (self-managed, local proxy) and will naturally be superseded when the Control Plane exists.
Testing
- Verify Claude Code launches successfully with API keys from ~/.zshenv
- Verify OpenCode launches successfully
- Verify opts.env overrides still work
- Verify PATH augmentation still works
- Verify graceful behavior when ~/.zshenv doesn't exist (fall back to process.env only)
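Some of these checks can be expressed without a real shell by separating the merge order from the code that reads ~/.zshenv. The following is a minimal sketch under that assumption; mergeSpawnEnv and sourcedOrEmpty are hypothetical helpers, not existing agentctl functions.

```typescript
import { strict as assert } from "node:assert";

type Env = Record<string, string>;

// Hypothetical helper: merge the three sources in the order the issue lists them.
function mergeSpawnEnv(base: Env, sourced: Env, overrides: Env = {}): Env {
  return { ...base, ...sourced, ...overrides };
}

// Hypothetical helper: wrap whatever reads ~/.zshenv; any failure yields {}.
function sourcedOrEmpty(read: () => Env): Env {
  try {
    return read();
  } catch {
    return {};
  }
}

// opts.env overrides win over both other sources.
assert.deepEqual(
  mergeSpawnEnv({ KEY: "daemon" }, { KEY: "zshenv" }, { KEY: "override" }),
  { KEY: "override" },
);

// A value from ~/.zshenv replaces the stale one the daemon inherited.
assert.equal(mergeSpawnEnv({ KEY: "old" }, { KEY: "rotated" }).KEY, "rotated");

// Missing ~/.zshenv: the reader throws, so only the base environment remains.
const missing = sourcedOrEmpty(() => {
  throw new Error("ENOENT");
});
assert.deepEqual(mergeSpawnEnv({ PATH: "/usr/bin" }, missing), { PATH: "/usr/bin" });
```

The launch checks for Claude Code and OpenCode and the PATH augmentation check still need an end-to-end run, since they depend on the real binaries and on getCommonBinDirs().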