feat: File-based checkpoint persistence for connectors (default) #107

@warren-t-c

Problem

Connector poll state (cursors, ETags, last-seen timestamps, PR cache) is held only in memory. On service restart, connectors lose their position and may:

  • Re-process events in the lookback window (causing duplicate route firings)
  • Miss events created between the last poll and the restart
  • Reset the PR cache, causing re-evaluation of unchanged PRs

Proposal

Make file-based checkpoint persistence the default for all connectors.

Requirements

  • Connectors persist their poll checkpoint to a file after each successful poll
  • On startup, connectors read the checkpoint file and resume from where they left off
  • Checkpoint files live in a configurable directory (default: .orgloop/checkpoints/ relative to the module directory)
  • Each source gets its own checkpoint file: <source-id>.json
  • If no checkpoint file exists (first run), the connector falls back to the initial_lookback config (existing behavior)
  • File writes are atomic (write to temp, rename) to prevent corruption on crash
  • This should be the default — no config required to enable
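
The requirements above could be sketched roughly as follows. This is an illustrative Python sketch, not orgloop's actual implementation: `FileCheckpointStore`, `resume_point`, the `last_seen` field, and treating `initial_lookback` as a number of seconds are all assumptions made for the example. It also assumes source IDs are safe to use as file names.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class FileCheckpointStore:
    """Hypothetical store: one JSON checkpoint file per source."""

    def __init__(self, directory: str = ".orgloop/checkpoints") -> None:
        self.directory = Path(directory)

    def path_for(self, source_id: str) -> Path:
        # Each source gets its own file: <source-id>.json
        return self.directory / f"{source_id}.json"

    def load(self, source_id: str) -> Optional[Dict[str, Any]]:
        """Return the saved checkpoint, or None if no file exists (first run)."""
        try:
            with open(self.path_for(source_id), encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return None

    def save(self, source_id: str, checkpoint: Dict[str, Any]) -> None:
        """Write to a temp file in the same directory, then rename over the target."""
        self.directory.mkdir(parents=True, exist_ok=True)
        fd, tmp_path = tempfile.mkstemp(
            dir=self.directory, prefix=f".{source_id}.", suffix=".tmp"
        )
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(checkpoint, f)
                f.flush()
                os.fsync(f.fileno())  # push data to disk before the rename
            # os.replace swaps the file in one step, so readers see either
            # the old checkpoint or the new one, never a partial write.
            os.replace(tmp_path, self.path_for(source_id))
        except BaseException:
            # Clean up the temp file if anything failed before the rename.
            try:
                os.unlink(tmp_path)
            except FileNotFoundError:
                pass
            raise


def resume_point(
    store: FileCheckpointStore,
    source_id: str,
    initial_lookback_seconds: float,
    now: float,
) -> float:
    """Resume from the checkpoint, or fall back to the lookback on first run."""
    checkpoint = store.load(source_id)
    if checkpoint is None:
        return now - initial_lookback_seconds
    return checkpoint["last_seen"]  # assumed field name
```

A connector would call `save` only after a poll completes successfully, so a crash mid-poll leaves the previous checkpoint in place. The temp file is created in the checkpoint directory itself because a rename is only atomic within a single filesystem.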

Configuration (optional overrides)

defaults:
  checkpoint:
    store: file          # default (was: memory)
    dir: .orgloop/checkpoints
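
One way the optional overrides might be resolved, as a hedged Python sketch: `resolve_checkpoint_settings` is a hypothetical helper, the config is assumed to be already parsed into a dictionary, and `file` and `memory` are assumed to be the only valid `store` values. Passing absolute `dir` values through unchanged is also an assumption; the proposal only specifies the relative default.

```python
from pathlib import Path
from typing import Any, Mapping, Tuple

DEFAULT_STORE = "file"                # proposed default (was: memory)
DEFAULT_DIR = ".orgloop/checkpoints"  # relative to the module directory


def resolve_checkpoint_settings(
    config: Mapping[str, Any], module_dir: str
) -> Tuple[str, Path]:
    """Return (store, directory), applying defaults when keys are missing."""
    checkpoint = (config.get("defaults") or {}).get("checkpoint") or {}
    store = checkpoint.get("store", DEFAULT_STORE)
    if store not in ("file", "memory"):
        raise ValueError(f"unknown checkpoint store: {store!r}")
    directory = Path(checkpoint.get("dir", DEFAULT_DIR))
    if not directory.is_absolute():
        # Relative paths are resolved against the module directory.
        directory = Path(module_dir) / directory
    return store, directory
```

With an empty config this yields the `file` store and `.orgloop/checkpoints` under the module directory, which matches the "no config required" requirement.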

Context

After restarting orgloop to switch repos, the GitHub connector's in-memory checkpoint was lost. Combined with ETag caching behavior, this prevented new events from being picked up until the connector's lookback window expired.
