Description
Problem
When OpenCode hangs (for example, on an empty provider response or a stuck Ohm circuit breaker), the wrapper shell stays alive while the inner process sits idle in kevent64. agentctl reports the session as 'running' because the wrapper PID still exists. No completion webhook fires, the supervisor never triggers, and the PR is abandoned.
Proposed Solution
1. Configurable max session runtime
Add a --timeout <seconds> flag. When the timeout elapses, agentctl kills the session and fires the completion webhook with a timeout status.
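A minimal sketch of the timeout behavior, written in Python for illustration only. It is not agentctl's actual implementation: the function name run_with_timeout, the result dictionary, and the notify callback (a stand-in for firing the completion webhook) are all hypothetical. It assumes a POSIX system, and it starts the session in its own process group so that the wrapper shell and the inner process are killed together.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s, notify):
    """Run cmd; kill its whole process group if it exceeds timeout_s.

    notify is a hypothetical stand-in for the completion webhook. It is
    called exactly once with the result dictionary.
    """
    # start_new_session=True puts the child in a new session and process
    # group, so killing the group also reaches processes it spawned.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        exit_code = proc.wait(timeout=timeout_s)
        result = {"status": "completed", "exit_code": exit_code}
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited between the timeout and the kill
        proc.wait()  # reap the child so it does not linger as a zombie
        result = {"status": "timeout", "exit_code": None}
    notify(result)
    return result
```

The point of the sketch is that the callback fires on both paths, so a supervisor always hears back even when the session never finishes on its own.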
2. Output-based health probe
If a session produces no output for more than N minutes (configurable, default 15 minutes), consider it stalled. Then either fire a warning event or kill the session and fire the callback.
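The stall check could be sketched as a small watchdog that records when output was last seen. This is an illustration in Python, not agentctl code: the class OutputWatchdog and its methods are hypothetical names, and the 15-minute default is taken from the proposal above. The clock is injectable so the logic can be exercised without waiting.

```python
import time


class OutputWatchdog:
    """Tracks how long a session has been silent (hypothetical helper)."""

    def __init__(self, max_silence_s=15 * 60, clock=time.monotonic):
        self.max_silence_s = max_silence_s
        self._clock = clock
        # Treat session start as the first "output" so a session that
        # never prints anything is still detected.
        self._last_output_at = clock()

    def record_output(self):
        """Call whenever the session writes to stdout or stderr."""
        self._last_output_at = self._clock()

    def silent_for(self):
        """Seconds elapsed since the last recorded output."""
        return self._clock() - self._last_output_at

    def is_stalled(self):
        """True once the silence exceeds the configured threshold."""
        return self.silent_for() > self.max_silence_s
```

A monotonic clock is used by default so that wall-clock adjustments on the host cannot trigger or mask a stall.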
3. PID vs process health
Don't just check whether the wrapper PID exists. Also check whether the inner OpenCode process has open network connections or has produced output recently.
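One way to read this proposal is as a decision over several signals rather than a single PID check. The Python sketch below is an assumption-laden illustration: pid_alive and classify_session are hypothetical names, the three status strings are invented for the example, and the caller is assumed to supply the inner process's open connection count from some external source (for example a tool such as lsof), which is not shown here. The liveness check relies on POSIX signal 0 semantics.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def classify_session(wrapper_alive, inner_alive, open_connections,
                     silent_for_s, max_silence_s=900):
    """Combine liveness, network, and output signals into one status.

    Returns "exited", "stalled", or "running" (hypothetical status names).
    """
    if not wrapper_alive or not inner_alive:
        return "exited"
    # Following the proposal, a session counts as healthy if it has open
    # connections OR recent output; it is stalled only when both are absent.
    if open_connections == 0 and silent_for_s > max_silence_s:
        return "stalled"
    return "running"
```

Under this reading, the case from the problem statement (both PIDs present, no connections, no output for longer than the threshold) is reported as "stalled" instead of "running".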
Impact
This is the backstop for all silent agent failures. Without it, any provider outage or model error causes sessions to hang forever.