feat: stateless daemon core (ADR 004)#52

Merged
c-h- merged 1 commit into main from
fix/stateless-daemon-core
Feb 24, 2026
@c-h- c-h- commented Feb 24, 2026

Summary

Implements ADR 004: Stateless Daemon Core — the daemon stops being a session database and becomes a stateless multiplexer.

Root cause: The daemon maintained a state.json session registry that mirrored every session discovered by every adapter. Over 26 hours, this accumulated 394 "active" sessions because OpenClaw sessions (no PID) had no exit path. Pruning was a band-aid on a fundamental design flaw.

Fix: Adapters own session truth. Daemon owns what it launched.

Changes

Phase 1: Fix session accumulation bug

  • reconcileAndEnrich() detects sessions that disappear from adapter discover() results and marks them stopped + autoUnlock
  • 30-second grace period for recently-launched sessions to avoid false positives from adapter discovery latency
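
The disappearance check with a grace period could be sketched roughly as follows. This is not the PR's actual `reconcileAndEnrich()` code: the `TrackedSession` shape, the `findDisappeared` helper, and the `launchedAt` field are hypothetical names chosen for illustration. Only the 30-second grace period comes from the description above.

```typescript
// Hypothetical shape; the real daemon types are not shown in this PR.
interface TrackedSession {
  id: string;
  status: "running" | "stopped";
  launchedAt: number; // epoch milliseconds
}

// 30-second grace period, as stated in the PR description.
const GRACE_PERIOD_MS = 30_000;

// Returns the IDs of running sessions that no adapter reported and that
// are old enough to be past the grace period. The caller would then mark
// these as stopped and release their locks.
function findDisappeared(
  tracked: TrackedSession[],
  discoveredIds: Set<string>,
  now: number,
): string[] {
  return tracked
    .filter((s) => s.status === "running")
    .filter((s) => !discoveredIds.has(s.id))
    .filter((s) => now - s.launchedAt >= GRACE_PERIOD_MS)
    .map((s) => s.id);
}
```

A session launched five seconds ago that has not yet shown up in `discover()` is left alone. The same session missing a minute after launch is reported as disappeared.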

Phase 2: Make session.list adapter-first

  • session.list handler fans out discover() to all adapters in parallel with 5s per-adapter timeouts
  • Merges results and enriches with daemon launch metadata (prompt, group, spec, cwd)
  • session.status also fans out to adapters for fresh data
  • Graceful degradation: failed adapters are skipped, partial results returned
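
A minimal sketch of the fan-out with per-adapter timeouts and partial results, assuming each adapter exposes an async `discover()`. The `SessionAdapter`, `AdapterSession`, and `FanOutResult` types and the `withTimeout` and `fanOutDiscover` helpers are assumptions for illustration, not the handler's real code.

```typescript
// Hypothetical types; the PR does not show the real adapter interface.
interface AdapterSession {
  id: string;
  adapter: string;
}

interface SessionAdapter {
  name: string;
  discover(): Promise<AdapterSession[]>;
}

interface FanOutResult {
  sessions: AdapterSession[];
  failedAdapters: string[];
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Calls every adapter in parallel. A failing or slow adapter is recorded
// in `failedAdapters` and skipped, so the caller still gets partial results.
async function fanOutDiscover(
  adapters: SessionAdapter[],
  timeoutMs: number = 5_000, // 5s per-adapter timeout from the PR description
): Promise<FanOutResult> {
  const outcomes = await Promise.all(
    adapters.map(
      async (adapter): Promise<{ name: string; sessions: AdapterSession[] | null }> => {
        try {
          return { name: adapter.name, sessions: await withTimeout(adapter.discover(), timeoutMs) };
        } catch {
          return { name: adapter.name, sessions: null };
        }
      },
    ),
  );

  const result: FanOutResult = { sessions: [], failedAdapters: [] };
  for (const outcome of outcomes) {
    if (outcome.sessions === null) result.failedAdapters.push(outcome.name);
    else result.sessions.push(...outcome.sessions);
  }
  return result;
}
```

Because the timeout is applied per adapter rather than to the whole batch, one hung adapter cannot delay the others beyond its own limit.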

Phase 3: Clean up

  • Removed: SessionTracker.poll(), reapStaleEntries(), validateAllSessions(), pruneDeadSessions(), pruneOldSessions(), listSessions(), activeCount(), startPolling(), stopPolling()
  • Removed: 5-second polling interval and all background state reconciliation
  • Added: lightweight 30s PID liveness check for lock cleanup (startLaunchCleanup)
  • Decoupled MetricsRegistry from SessionTracker; active session count updated on session.list calls
  • session.prune kept for backward compatibility; it now only runs the PID liveness cleanup
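
For illustration, a PID liveness check in Node.js is commonly built on `process.kill(pid, 0)`, which sends no signal and only tests whether the process exists. The sketch below assumes that approach. Whether `startLaunchCleanup` works this way is not shown in the PR, and `LaunchedProcess`, `isPidAlive`, and `findDeadLaunches` are hypothetical names.

```typescript
// Hypothetical record of something the daemon launched.
interface LaunchedProcess {
  sessionId: string;
  pid: number;
}

// Signal 0 performs an existence check without delivering a signal.
// EPERM means the process exists but belongs to another user, so it
// still counts as alive; any other error (e.g. ESRCH) means it is gone.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    return (err as { code?: string }).code === "EPERM";
  }
}

// Returns the launches whose process is no longer running. The liveness
// probe is injectable so the logic can be tested without real PIDs.
function findDeadLaunches(
  launches: LaunchedProcess[],
  isAlive: (pid: number) => boolean = isPidAlive,
): LaunchedProcess[] {
  return launches.filter((launch) => !isAlive(launch.pid));
}
```

A cleanup loop would call `findDeadLaunches` on a 30-second timer and release the locks held by each returned entry. This needs only one syscall per launched process, with no adapter calls.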

Architecture

Before: CLI → daemon → StateManager (stale cache of all sessions)
                         ↑ SessionTracker.poll() every 5s

After:  CLI → daemon → fan-out adapter.discover() → merge → return
                ↓
        StateManager (minimal: launch metadata, locks, fuses only)
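
The "merge" step in the After flow could look something like this sketch, where adapter results are the source of truth and the daemon only layers its own launch metadata on top. The field names `prompt`, `group`, `spec`, and `cwd` come from the description above. Their types, and the `DiscoveredSession` and `enrichWithLaunchMetadata` names, are assumptions.

```typescript
// Launch metadata the daemon keeps for sessions it started itself.
// Field names are from the PR description; the types are assumed.
interface LaunchMetadata {
  prompt?: string;
  group?: string;
  spec?: string;
  cwd?: string;
}

// Hypothetical minimal shape of a session reported by an adapter.
interface DiscoveredSession {
  id: string;
  status: string;
}

type EnrichedSession = DiscoveredSession & LaunchMetadata;

// Adds launch metadata to discovered sessions. Adapter fields are spread
// last so adapter-reported values win on any overlapping key.
function enrichWithLaunchMetadata(
  discovered: DiscoveredSession[],
  launches: Map<string, LaunchMetadata>,
): EnrichedSession[] {
  return discovered.map((session) => ({
    ...launches.get(session.id),
    ...session,
  }));
}
```

Sessions the daemon did not launch simply pass through without extra fields, which is what keeps the daemon from needing a registry of every session.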

Testing

  • All 388 tests pass (386 original + 2 new for active session metrics)
  • Rewrote session-tracker.test.ts for new stateless behavior (22 tests)
  • npm test, npm run typecheck, npx biome check all clean

Fixes #51

Implements ADR 004: Stateless Daemon Core.

## Phase 1: Fix session accumulation bug
- reconcileAndEnrich() detects sessions that disappear from adapter
  discover() results and marks them stopped + autoUnlock
- 30-second grace period for recently-launched sessions to avoid
  false positives from adapter discovery latency

## Phase 2: Make session.list adapter-first
- session.list handler now fans out discover() to all adapters in
  parallel with 5s per-adapter timeouts
- Merges results and enriches with daemon launch metadata (prompt,
  group, spec, cwd)
- session.status also fans out to adapters for fresh data
- Graceful degradation: failed adapters are skipped, partial results
  returned. Sessions from failed adapters fall back to launch metadata.
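
One way to picture the fall-back to launch metadata is the sketch below: when an adapter fails, sessions the daemon launched through it are still listed from stored metadata. The `LaunchEntry` and `ListedSession` shapes, the `withLaunchFallback` helper, and the `"unknown"` status value are all assumptions, not taken from the commit.

```typescript
// Hypothetical stored launch record.
interface LaunchEntry {
  id: string;
  adapter: string;
  prompt: string;
  cwd: string;
}

// Hypothetical shape returned by session.list.
interface ListedSession {
  id: string;
  adapter: string;
  status: string;
  prompt?: string;
  cwd?: string;
}

// Appends launch-metadata-only entries for sessions whose adapter failed
// and which therefore did not appear in the live results.
function withLaunchFallback(
  live: ListedSession[],
  launches: LaunchEntry[],
  failedAdapters: Set<string>,
): ListedSession[] {
  const seen = new Set(live.map((s) => s.id));
  const fallback: ListedSession[] = launches
    .filter((l) => failedAdapters.has(l.adapter) && !seen.has(l.id))
    .map((l) => ({
      id: l.id,
      adapter: l.adapter,
      status: "unknown", // assumed placeholder: the adapter could not confirm state
      prompt: l.prompt,
      cwd: l.cwd,
    }));
  return [...live, ...fallback];
}
```

Launches belonging to healthy adapters are deliberately not added back. If a healthy adapter no longer reports a session, that absence is treated as real.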

## Phase 3: Clean up
- Removed SessionTracker.poll(), reapStaleEntries(),
  validateAllSessions(), pruneDeadSessions(), pruneOldSessions(),
  listSessions(), activeCount(), startPolling(), stopPolling()
- Removed 5-second polling interval and all background state
  reconciliation
- Simplified StateManager usage to only persist launch metadata,
  locks, and fuses
- Added lightweight 30s PID liveness check for lock cleanup
  (startLaunchCleanup) — much cheaper than full adapter fan-out
- MetricsRegistry decoupled from SessionTracker; active session count
  updated on session.list calls
- session.prune kept for backward compat but now just runs PID
  liveness cleanup

## Key design decisions
- Adapters own session truth. Daemon owns what it launched.
- session.list = fan out adapter.discover() → merge → return
- No daemon-side session registry for listing
- Handle adapter failures gracefully (partial results, not errors)

Fixes #51

Co-Authored-By: Charlie Hulcher <charlie@kindo.ai>
@c-h- c-h- merged commit 527f03f into main Feb 24, 2026
1 check passed
c-h- commented Feb 24, 2026

Reviewed — clean implementation of ADR 004. Fan-out + reconcile is much simpler than the polling/pruning tower. Grace period for recently-launched sessions is a nice touch. Tests comprehensive. ⚡


Linked issue: Architecture: Daemon should be stateless multiplexer, not session database