[Feature] #59

@SankarGaneshb

Description

Problem

In multi‑agent setups using gitagent, some agents act as orchestrators and regularly delegate work to other agents (for example, a planner delegating to a code‑reviewer or a research agent delegating to a summarizer). Today, gitagent standardizes identity, rules, and compliance in git, but there is no git‑native way for agents to track each other’s performance over time. This makes it harder to:

choose between multiple agents that can do the same task

detect degraded or misbehaving agents early

tie governance decisions (like requiring human review) to observed behavior rather than static config.

In practice, this means I’m often “flying blind” about which agent actually performs better, even though the interaction data exists in the runtime.

Proposed Solution

Standardize an optional, git‑native mechanism for Agent KPIs (Key Performance Indicators) driven by peer feedback between agents:

Feedback log artifact

Path: .gitagent/feedback.log (runtime‑generated, append‑only).

Each entry represents one interaction between a caller agent and a callee agent, with a minimal schema like:

timestamp (ISO8601)

caller_agent, callee_agent (name or repo URL)

interaction_id

task_type (e.g., code_review, research, planning)

accuracy_score (numeric)

completion_status (success, partial, failed, escalated)

latency_ms

policy_violations (count).
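
Under that schema, a single entry in `.gitagent/feedback.log` could look like the following (one JSON object per line, append-only; all field values are illustrative):

```json
{"timestamp": "2025-06-01T12:34:56Z", "caller_agent": "planner", "callee_agent": "code-reviewer", "interaction_id": "a1b2c3", "task_type": "code_review", "accuracy_score": 0.9, "completion_status": "success", "latency_ms": 1430, "policy_violations": 0}
```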

Aggregated KPI artifact

Path: .gitagent/kpi.json (runtime‑generated summary per agent).

Example shape:

```json
{
  "agent": "code-reviewer",
  "window": "30d",
  "metrics": {
    "total_calls": 124,
    "success_rate": 0.91,
    "escalation_rate": 0.05,
    "avg_accuracy_score": 0.87,
    "avg_latency_ms": 1520,
    "violation_rate": 0.02
  }
}
```
Orchestrators and governance layers can read this to route tasks and enforce guardrails.
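
As a sketch of what that routing could look like, here is a minimal, hypothetical orchestrator helper that compares `kpi.json` summaries and picks a callee. The field names follow the example shape above; the ranking heuristic (highest success rate, subject to a sample floor) is an illustrative assumption, not part of the spec:

```python
# Hypothetical sketch: route a task to the best-performing candidate agent
# based on their .gitagent/kpi.json summaries. The ranking heuristic here
# (max success_rate, with a minimum-sample floor) is illustrative only.
def pick_agent(kpi_summaries, min_calls=20):
    """Return the agent name with the highest success rate among
    candidates that have enough recorded calls to be meaningful."""
    eligible = [
        k for k in kpi_summaries
        if k["metrics"]["total_calls"] >= min_calls
    ]
    if not eligible:
        return None  # caller falls back to default routing
    best = max(eligible, key=lambda k: k["metrics"]["success_rate"])
    return best["agent"]


summaries = [
    {"agent": "code-reviewer", "window": "30d",
     "metrics": {"total_calls": 124, "success_rate": 0.91}},
    {"agent": "code-reviewer-v2", "window": "30d",
     "metrics": {"total_calls": 8, "success_rate": 1.0}},
]
print(pick_agent(summaries))  # code-reviewer (v2 lacks samples)
```

Note the sample floor: without it, a new agent with one lucky call would always win the routing decision.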

Optional kpi_policy in agent.yaml

Let agents declare how they want to be monitored:

```yaml
kpi_policy:
  metrics:
    - name: accuracy_score
      min_samples: 20
      alert_below: 0.8
    - name: escalation_rate
      min_samples: 20
      alert_above: 0.15
    - name: violation_rate
      alert_above: 0.05
  windows:
    default: 30d
  actions:
    on_alert:
      - type: advisory
      - type: require_human_review
```
The spec defines the schema; runtimes decide what “advisory” or “require_human_review” means in practice.
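
To make the threshold semantics concrete, a runtime's policy check could be as simple as the following sketch. The rule names mirror the YAML above; the evaluation logic (skip rules below their sample floor, then compare against `alert_below`/`alert_above`) is an assumption about how a runtime might implement it:

```python
# Hypothetical sketch: evaluate kpi_policy thresholds against the metrics
# in kpi.json. Returns the names of metrics that trip an alert; what the
# runtime does with alerts (advisory, human review) is out of scope here.
def evaluate_kpi_policy(policy, metrics, total_calls):
    """Return the list of metric names whose values breach their thresholds."""
    alerts = []
    for rule in policy["metrics"]:
        name = rule["name"]
        if name not in metrics:
            continue
        # Skip rules whose sample floor has not been reached yet.
        if total_calls < rule.get("min_samples", 0):
            continue
        value = metrics[name]
        if "alert_below" in rule and value < rule["alert_below"]:
            alerts.append(name)
        elif "alert_above" in rule and value > rule["alert_above"]:
            alerts.append(name)
    return alerts


policy = {
    "metrics": [
        {"name": "accuracy_score", "min_samples": 20, "alert_below": 0.8},
        {"name": "escalation_rate", "min_samples": 20, "alert_above": 0.15},
        {"name": "violation_rate", "alert_above": 0.05},
    ]
}
healthy = {"accuracy_score": 0.87, "escalation_rate": 0.05, "violation_rate": 0.02}
print(evaluate_kpi_policy(policy, healthy, total_calls=124))  # []
```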

Standard feedback tool definition

A tool such as tools/agent-feedback.yaml that other agents/runtimes can call after delegation, e.g.:

```yaml
name: submit_agent_feedback
description: Submit post-task feedback about another agent for KPI tracking.
input_schema:
  type: object
  properties:
    callee_agent:
      type: string
    interaction_id:
      type: string
    task_type:
      type: string
    accuracy_score:
      type: number
    completion_status:
      type: string
      enum: [success, partial, failed, escalated]
    latency_ms:
      type: integer
    policy_violations:
      type: integer
  required: [callee_agent, interaction_id, completion_status]
```
The runtime implementation appends to .gitagent/feedback.log and updates .gitagent/kpi.json according to kpi_policy.
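
A runtime-side handler for that tool could be sketched as follows. This is a hypothetical implementation, not part of the proposal: the aggregation (recomputing from the full log, a fixed `30d` label, two example metrics) stands in for whatever windowing and weighting a real runtime would apply per `kpi_policy`:

```python
import json
import os
import tempfile
import time


def record_feedback(log_path, kpi_path, event):
    """Append a feedback event (JSON Lines) and refresh the callee's KPI summary."""
    event.setdefault(
        "timestamp", time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    )
    with open(log_path, "a") as log:
        log.write(json.dumps(event) + "\n")

    # Recompute the callee's aggregate from the full log. A real runtime
    # would filter by timestamp window and honor kpi_policy here.
    with open(log_path) as log:
        rows = [json.loads(line) for line in log if line.strip()]
    mine = [r for r in rows if r["callee_agent"] == event["callee_agent"]]

    total = len(mine)
    summary = {
        "agent": event["callee_agent"],
        "window": "30d",  # illustrative; not derived from the log here
        "metrics": {
            "total_calls": total,
            "success_rate": sum(r["completion_status"] == "success" for r in mine) / total,
            "escalation_rate": sum(r["completion_status"] == "escalated" for r in mine) / total,
        },
    }
    with open(kpi_path, "w") as out:
        json.dump(summary, out, indent=2)
    return summary


# Demo against a throwaway directory (a real runtime would use .gitagent/).
workdir = tempfile.mkdtemp()
summary = record_feedback(
    os.path.join(workdir, "feedback.log"),
    os.path.join(workdir, "kpi.json"),
    {"caller_agent": "planner", "callee_agent": "code-reviewer",
     "interaction_id": "a1b2c3", "completion_status": "success"},
)
print(summary["metrics"])
```

Because the log is append-only JSON Lines, the summary can always be rebuilt from scratch, which keeps `kpi.json` a derived, disposable artifact.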

This keeps gitagent framework‑agnostic while giving a common, git‑native way to capture and use agent performance data.

Alternatives Considered

Runtime‑only metrics (no git artifacts):
Metrics could live solely in the orchestration layer (Prometheus, logs, etc.), but then they are not part of the agent’s git‑native identity and governance story, and can’t be versioned or audited alongside rules and compliance.

Ad‑hoc, per‑framework feedback formats:
Each framework (LangGraph, CrewAI, custom orchestrators) could invent its own feedback format, but this fragments the ecosystem and makes it harder to reuse agents across runtimes. A minimal standard in gitagent would give everyone a common denominator.

Static labels / manual ratings in agent.yaml:
Manually adding “quality” labels or hand‑curated scores in agent.yaml does not evolve with real‑world behavior and can quickly go stale. It also misses the opportunity to let agents effectively “peer‑review” each other based on actual interactions.

Additional Context

This proposal is meant to be optional and backwards‑compatible: repos can ignore KPIs entirely, and runtimes can choose to implement only parts of it.

I’d be happy to draft:

JSON Schemas under spec/schemas/ for agent-kpi-policy and feedback-event.

A short “Agent KPIs & Peer Feedback” section in docs.md that explains the artifacts and shows how they complement existing Compliance and SOD patterns.

A small example where one agent delegates to another and calls submit_agent_feedback, with sample feedback.log and kpi.json.

Questions for gitagent maintainers:

Does standardizing KPI artifacts like this fit your vision for gitagent’s core spec, or would you prefer it as a documented pattern/extension?

Are .gitagent/feedback.log and .gitagent/kpi.json acceptable paths, or would you prefer a different structure?

Are there existing commands (e.g., gitagent audit) or planned governance features you’d like these KPIs to integrate with?
