[Feature] #59

@SankarGaneshb

Description

Problem

In multi‑agent setups using gitagent, some agents act as orchestrators and regularly delegate work to other agents (for example, a planner delegating to a code‑reviewer or a research agent delegating to a summarizer). Today, gitagent standardizes identity, rules, and compliance in git, but there is no git‑native way for agents to track each other’s performance over time. This makes it harder to:

choose between multiple agents that can do the same task

detect degraded or misbehaving agents early

tie governance decisions (like requiring human review) to observed behavior rather than static config.

In practice, this means I’m often “flying blind” about which agent actually performs better, even though the interaction data exists in the runtime.

Proposed Solution

Standardize an optional, git‑native mechanism for Agent KPIs (Key Performance Indicators) driven by peer feedback between agents:

Feedback log artifact

Path: .gitagent/feedback.log (runtime‑generated, append‑only).

Each entry represents one interaction between a caller agent and a callee agent, with a minimal schema like:

timestamp (ISO8601)

caller_agent, callee_agent (name or repo URL)

interaction_id

task_type (e.g., code_review, research, planning)

accuracy_score (numeric)

completion_status (success, partial, failed, escalated)

latency_ms

policy_violations (count).
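
Under that schema, a single entry in `.gitagent/feedback.log` could look like the following (one JSON object per line, append-only; all field values are illustrative):

```json
{"timestamp": "2025-06-01T12:34:56Z", "caller_agent": "planner", "callee_agent": "code-reviewer", "interaction_id": "a1b2c3", "task_type": "code_review", "accuracy_score": 0.9, "completion_status": "success", "latency_ms": 1430, "policy_violations": 0}
```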

Aggregated KPI artifact

Path: .gitagent/kpi.json (runtime‑generated summary per agent).

Example shape:

```json
{
  "agent": "code-reviewer",
  "window": "30d",
  "metrics": {
    "total_calls": 124,
    "success_rate": 0.91,
    "escalation_rate": 0.05,
    "avg_accuracy_score": 0.87,
    "avg_latency_ms": 1520,
    "violation_rate": 0.02
  }
}
```
Orchestrators and governance layers can read this to route tasks and enforce guardrails.
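
As a sketch of what that routing could look like, here is a minimal, hypothetical orchestrator helper that compares `kpi.json` summaries and picks a callee. The field names follow the example shape above; the ranking heuristic (highest success rate, subject to a sample floor) is an illustrative assumption, not part of the spec:

```python
# Hypothetical sketch: route a task to the best-performing candidate agent
# based on their .gitagent/kpi.json summaries. The ranking heuristic here
# (max success_rate, with a minimum-sample floor) is illustrative only.
def pick_agent(kpi_summaries, min_calls=20):
    """Return the agent name with the highest success rate among
    candidates that have enough recorded calls to be meaningful."""
    eligible = [
        k for k in kpi_summaries
        if k["metrics"]["total_calls"] >= min_calls
    ]
    if not eligible:
        return None  # caller falls back to default routing
    best = max(eligible, key=lambda k: k["metrics"]["success_rate"])
    return best["agent"]


summaries = [
    {"agent": "code-reviewer", "window": "30d",
     "metrics": {"total_calls": 124, "success_rate": 0.91}},
    {"agent": "code-reviewer-v2", "window": "30d",
     "metrics": {"total_calls": 8, "success_rate": 1.0}},
]
print(pick_agent(summaries))  # code-reviewer (v2 lacks samples)
```

Note the sample floor: without it, a new agent with one lucky call would always win the routing decision.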

Optional kpi_policy in agent.yaml

Let agents declare how they want to be monitored:

```yaml
kpi_policy:
  metrics:
    - name: accuracy_score
      min_samples: 20
      alert_below: 0.8
    - name: escalation_rate
      min_samples: 20
      alert_above: 0.15
    - name: violation_rate
      alert_above: 0.05
  windows:
    default: 30d
  actions:
    on_alert:
      - type: advisory
      - type: require_human_review
```
The spec defines the schema; runtimes decide what “advisory” or “require_human_review” means in practice.
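
To make the threshold semantics concrete, a runtime's policy check could be as simple as the following sketch. The rule names mirror the YAML above; the evaluation logic (skip rules below their sample floor, then compare against `alert_below`/`alert_above`) is an assumption about how a runtime might implement it:

```python
# Hypothetical sketch: evaluate kpi_policy thresholds against the metrics
# in kpi.json. Returns the names of metrics that trip an alert; what the
# runtime does with alerts (advisory, human review) is out of scope here.
def evaluate_kpi_policy(policy, metrics, total_calls):
    """Return the list of metric names whose values breach their thresholds."""
    alerts = []
    for rule in policy["metrics"]:
        name = rule["name"]
        if name not in metrics:
            continue
        # Skip rules whose sample floor has not been reached yet.
        if total_calls < rule.get("min_samples", 0):
            continue
        value = metrics[name]
        if "alert_below" in rule and value < rule["alert_below"]:
            alerts.append(name)
        elif "alert_above" in rule and value > rule["alert_above"]:
            alerts.append(name)
    return alerts


policy = {
    "metrics": [
        {"name": "accuracy_score", "min_samples": 20, "alert_below": 0.8},
        {"name": "escalation_rate", "min_samples": 20, "alert_above": 0.15},
        {"name": "violation_rate", "alert_above": 0.05},
    ]
}
healthy = {"accuracy_score": 0.87, "escalation_rate": 0.05, "violation_rate": 0.02}
print(evaluate_kpi_policy(policy, healthy, total_calls=124))  # []
```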

Standard feedback tool definition

A tool such as tools/agent-feedback.yaml that other agents/runtimes can call after delegation, e.g.:

```yaml
name: submit_agent_feedback
description: Submit post-task feedback about another agent for KPI tracking.
input_schema:
  type: object
  properties:
    callee_agent:
      type: string
    interaction_id:
      type: string
    task_type:
      type: string
    accuracy_score:
      type: number
    completion_status:
      type: string
      enum: [success, partial, failed, escalated]
    latency_ms:
      type: integer
    policy_violations:
      type: integer
  required: [callee_agent, interaction_id, completion_status]
```
The runtime implementation appends to .gitagent/feedback.log and updates .gitagent/kpi.json according to kpi_policy.
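
A runtime-side handler for that tool could be sketched as follows. This is a hypothetical implementation, not part of the proposal: the aggregation (recomputing from the full log, a fixed `30d` label, two example metrics) stands in for whatever windowing and weighting a real runtime would apply per `kpi_policy`:

```python
import json
import os
import tempfile
import time


def record_feedback(log_path, kpi_path, event):
    """Append a feedback event (JSON Lines) and refresh the callee's KPI summary."""
    event.setdefault(
        "timestamp", time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    )
    with open(log_path, "a") as log:
        log.write(json.dumps(event) + "\n")

    # Recompute the callee's aggregate from the full log. A real runtime
    # would filter by timestamp window and honor kpi_policy here.
    with open(log_path) as log:
        rows = [json.loads(line) for line in log if line.strip()]
    mine = [r for r in rows if r["callee_agent"] == event["callee_agent"]]

    total = len(mine)
    summary = {
        "agent": event["callee_agent"],
        "window": "30d",  # illustrative; not derived from the log here
        "metrics": {
            "total_calls": total,
            "success_rate": sum(r["completion_status"] == "success" for r in mine) / total,
            "escalation_rate": sum(r["completion_status"] == "escalated" for r in mine) / total,
        },
    }
    with open(kpi_path, "w") as out:
        json.dump(summary, out, indent=2)
    return summary


# Demo against a throwaway directory (a real runtime would use .gitagent/).
workdir = tempfile.mkdtemp()
summary = record_feedback(
    os.path.join(workdir, "feedback.log"),
    os.path.join(workdir, "kpi.json"),
    {"caller_agent": "planner", "callee_agent": "code-reviewer",
     "interaction_id": "a1b2c3", "completion_status": "success"},
)
print(summary["metrics"])
```

Because the log is append-only JSON Lines, the summary can always be rebuilt from scratch, which keeps `kpi.json` a derived, disposable artifact.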

This keeps gitagent framework‑agnostic while giving a common, git‑native way to capture and use agent performance data.

Alternatives Considered

Runtime‑only metrics (no git artifacts):
Metrics could live solely in the orchestration layer (Prometheus, logs, etc.), but then they are not part of the agent’s git‑native identity and governance story, and can’t be versioned or audited alongside rules and compliance.

Ad‑hoc, per‑framework feedback formats:
Each framework (LangGraph, CrewAI, custom orchestrators) could invent its own feedback format, but this fragments the ecosystem and makes it harder to reuse agents across runtimes. A minimal standard in gitagent would give everyone a common denominator.

Static labels / manual ratings in agent.yaml:
Manually adding “quality” labels or hand‑curated scores in agent.yaml does not evolve with real‑world behavior and can quickly go stale. It also misses the opportunity to let agents effectively “peer‑review” each other based on actual interactions.

Additional Context

This proposal is meant to be optional and backwards‑compatible: repos can ignore KPIs entirely, and runtimes can choose to implement only parts of it.

I’d be happy to draft:

JSON Schemas under spec/schemas/ for agent-kpi-policy and feedback-event.

A short “Agent KPIs & Peer Feedback” section in docs.md that explains the artifacts and shows how they complement existing Compliance and SOD patterns.

A small example where one agent delegates to another and calls submit_agent_feedback, with sample feedback.log and kpi.json.

Questions for gitagent maintainers:

Does standardizing KPI artifacts like this fit your vision for gitagent’s core spec, or would you prefer it as a documented pattern/extension?

Are .gitagent/feedback.log and .gitagent/kpi.json acceptable paths, or would you prefer a different structure?

Are there existing commands (e.g., gitagent audit) or planned governance features you’d like these KPIs to integrate with?
