Arch violation: Auto-lock lifecycle coupled to daemon's shadow session state instead of PID liveness #115

@c-h-

What state is duplicated

Auto-locks are managed by LockManager but their lifecycle is driven entirely by the daemon's session tracking, not by actual process state:

  • session.launch → lockManager.autoLock(cwd, session.id) (tied to daemon session ID)
  • session.stop → lockManager.autoUnlock(sessionId) (daemon decides session stopped)
  • reconcileAndEnrich() → marks sessions stopped → lockManager.autoUnlock(id) (daemon inferred death)
  • cleanupDeadLaunches() → checks PID liveness in daemon state → lockManager.autoUnlock(id) (daemon state, not adapter)

Files:

  • lock-manager.ts: Lock CRUD
  • session-tracker.ts: Triggers lock cleanup based on daemon session state
  • server.ts: Wires lock cleanup to session lifecycle events
  • state.ts: Locks persisted in locks.json

Where is the ground truth?

A directory is "in use" if and only if a coding agent process is running with that CWD. Ground truth: kill(pid, 0) for process liveness, plus lsof to verify that the process's CWD is the locked directory.
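
A minimal sketch of that probe in TypeScript on Node.js, for illustration only. The function names (isPidAlive, cwdOfPid, isDirectoryInUse) are hypothetical and not taken from the codebase, and treating "alive but CWD unknown" as still in use is an assumption made here to avoid premature unlocks:

```typescript
import { execFileSync } from "node:child_process";

/** True if a process with this PID exists. Signal 0 only probes; nothing is delivered. */
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but is owned by another user: still alive.
    // ESRCH (or anything else) is treated as "not running".
    return (err as { code?: string }).code === "EPERM";
  }
}

/** CWD of a process as reported by lsof, or null if it cannot be determined. */
function cwdOfPid(pid: number): string | null {
  try {
    // -a ANDs the filters, -p selects the PID, -d cwd selects the CWD entry,
    // -Fn prints machine-readable output with the path on a line starting with "n".
    const out = execFileSync(
      "lsof",
      ["-a", "-p", String(pid), "-d", "cwd", "-Fn"],
      { encoding: "utf8", stdio: ["ignore", "pipe", "ignore"] },
    );
    const line = out.split("\n").find((l) => l.startsWith("n"));
    return line ? line.slice(1) : null;
  } catch {
    return null; // lsof missing, PID gone, or permission denied
  }
}

/** Hypothetical ground-truth check: is `pid` alive and running in `directory`? */
function isDirectoryInUse(pid: number, directory: string): boolean {
  if (!isPidAlive(pid)) return false;
  const cwd = cwdOfPid(pid);
  // Assumption: if the CWD cannot be read, keep the lock rather than risk a premature unlock.
  // Paths are compared as plain strings; symlinked directories would need realpath first.
  return cwd === null || cwd === directory;
}
```

The CWD comparison also guards against PID reuse: a recycled PID is unlikely to be running in the locked directory.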

How does it desync?

  1. Premature unlock: Daemon marks session stopped (via #111 wrapper exit or #113 adapter absence) → lock released → another agent can launch in the same directory while the first is still running → concurrent writes, corruption.
  2. Stuck locks: pending-* sessions that never resolve (#114) have locks that can't be auto-cleaned because the daemon doesn't know their status.
  3. ID mismatch: When pending-* IDs are resolved to real UUIDs, updateAutoLockSessionId must be called, but if the timing is wrong, the lock references a dead ID.

User-visible symptom

  • Lock released while agent still running (potential concurrent directory access)
  • Stuck locks requiring manual intervention via agentctl lock release
  • --force required to override locks on pending-* sessions

Proposed fix

Decouple locks from session IDs. Instead:

  1. Lock by PID+CWD: Store { directory, pid, lockedAt } instead of { directory, sessionId }
  2. PID-based cleanup: Check if the locking PID is alive. If dead, release the lock. No need to go through session state.
  3. Adapter-independent: Locks work regardless of whether session-tracker has correct state, pending-* resolution succeeds, etc.

This makes locks self-healing and independent of the session tracking layer's correctness.
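
The proposal could look roughly like the following TypeScript sketch. It is not the existing LockManager API: PidLockTable, acquire, reap, and holder are hypothetical names, the table is in-memory only (persistence to locks.json is left out), and the liveness probe is injectable so the logic can be exercised without real processes:

```typescript
/** Lock record keyed by directory, as proposed: no session ID involved. */
interface DirectoryLock {
  directory: string;
  pid: number;
  lockedAt: string; // ISO 8601 timestamp
}

type LivenessProbe = (pid: number) => boolean;

/** Default probe: signal 0 checks existence without delivering a signal. */
function defaultProbe(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: process exists but is owned by another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

/** Hypothetical in-memory lock table whose cleanup depends only on PID liveness. */
class PidLockTable {
  private readonly locks = new Map<string, DirectoryLock>();
  private readonly alive: LivenessProbe;

  constructor(alive: LivenessProbe = defaultProbe) {
    this.alive = alive;
  }

  /** Take the lock unless a different, still-living PID holds it. */
  acquire(directory: string, pid: number): boolean {
    const existing = this.locks.get(directory);
    if (existing && existing.pid !== pid && this.alive(existing.pid)) {
      return false;
    }
    this.locks.set(directory, { directory, pid, lockedAt: new Date().toISOString() });
    return true;
  }

  /** Release every lock whose owning PID is dead; returns the freed directories. */
  reap(): string[] {
    const released: string[] = [];
    for (const [directory, lock] of this.locks) {
      if (!this.alive(lock.pid)) {
        this.locks.delete(directory);
        released.push(directory);
      }
    }
    return released;
  }

  holder(directory: string): DirectoryLock | undefined {
    return this.locks.get(directory);
  }
}
```

Because acquire re-checks the holder's liveness itself, a stale lock never blocks a new launch even if reap has not run yet. A real implementation would likely pair the PID check with CWD verification to cover PID reuse.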

Related: #110, #111, #113, #114
