Multi-model routing: triage with cheap models, review with frontier #26

## Problem

DiffScope uses a single model for the entire review. All three competitors route different tasks to different models:

| Task | Greptile | CodeRabbit | Qodo |
|------|----------|------------|------|
| Triage/classification | — | Cheap model | `model_weak` |
| Summarization | GPT-5-nano / GPT-4o-mini | Light model | `model_weak` |
| Embeddings | text-embedding-3-small | Unknown | Qodo Embed-1 |
| Review | Claude Sonnet 4 | GPT-4/o1/Claude | Primary model |
| Verification | — | Separate agent | — |
| Self-reflection | — | — | `model_reasoning` |
| Complex tasks | Claude Opus 4 | — | — |
| Lightweight | Claude Haiku 4.5 | — | — |

Using frontier models for summarization and triage wastes tokens and money. Using cheap models for review misses bugs.

## Proposed Solution

### Model Hierarchy
```yaml
models:
  primary: claude-sonnet-4-6        # review, verification
  weak: claude-haiku-4-5            # triage, summarization, NL translation
  reasoning: claude-opus-4-6        # complex analysis, self-reflection
  embedding: text-embedding-3-small  # RAG indexing
  fallback:
    - claude-sonnet-4-6
    - gpt-4o
```

### Task Routing
| Task | Model | Rationale |
|------|-------|-----------|
| File triage (NEEDS_REVIEW vs cosmetic) | weak | Simple classification |
| Per-file summarization | weak | Cheap, high-volume |
| NL translation of code chunks | weak | Indexing task |
| Embedding generation | embedding | Specialized |
| Code review | primary | Quality matters |
| Verification pass | primary | Accuracy matters |
| Complex cross-file analysis | reasoning | Needs deep reasoning |
| Commit message generation | weak | Simple task |
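
The routing table can be encoded as a total `match`, so that adding a task without assigning it a role becomes a compile error. This is a minimal sketch, not DiffScope's actual code: the `Task` enum and its variant names are hypothetical, and `ModelRole` is repeated here only to keep the block self-contained.

```rust
/// Roles from the model hierarchy (mirrors the `ModelRole` enum proposed in this issue).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelRole {
    Primary,
    Weak,
    Reasoning,
    Embedding,
}

/// Hypothetical task enum: one variant per row of the routing table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Task {
    FileTriage,
    FileSummarization,
    NlTranslation,
    EmbeddingGeneration,
    CodeReview,
    Verification,
    CrossFileAnalysis,
    CommitMessage,
}

impl Task {
    /// Map a task to the role that serves it, row for row with the table.
    pub fn role(self) -> ModelRole {
        match self {
            Task::FileTriage
            | Task::FileSummarization
            | Task::NlTranslation
            | Task::CommitMessage => ModelRole::Weak,
            Task::EmbeddingGeneration => ModelRole::Embedding,
            Task::CodeReview | Task::Verification => ModelRole::Primary,
            Task::CrossFileAnalysis => ModelRole::Reasoning,
        }
    }
}
```

A call site would then resolve a model with something like `config.model_for_role(task.role())`, keeping the routing policy in one place.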

### Implementation
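```rust
/// Role a task is routed to; each role resolves to one configured model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelRole {
    Primary,
    Weak,
    Reasoning,
    Embedding,
}

impl Config {
    /// Resolve the model for a role. Optional roles that are not configured
    /// fall back to the primary model, so a single-model config keeps working.
    fn model_for_role(&self, role: ModelRole) -> &ModelConfig {
        match role {
            ModelRole::Primary => &self.primary_model,
            ModelRole::Weak => self.weak_model.as_ref().unwrap_or(&self.primary_model),
            ModelRole::Reasoning => self.reasoning_model.as_ref().unwrap_or(&self.primary_model),
            // Caveat: a chat model cannot serve embedding requests, so callers
            // should treat an unset embedding model as "RAG indexing disabled"
            // rather than rely on this fallback.
            ModelRole::Embedding => self.embedding_model.as_ref().unwrap_or(&self.primary_model),
        }
    }
}
```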

### Fallback Chain
Same approach as Qodo: call the primary model; if the call fails, retry with each entry in the `fallback` list in order and return the first success. Simple and robust.
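
A rough sketch of that loop, assuming the model call is passed in as a closure so the helper stays independent of any provider client. The name `call_with_fallback` and its signature are hypothetical, not an existing DiffScope or Qodo API.

```rust
/// Try each model in order and return the first success together with the
/// name of the model that produced it. If every attempt fails (or the list
/// is empty), return the collected errors so the caller can report them.
pub fn call_with_fallback<T, E, F>(
    models: &[&str],
    mut call: F,
) -> Result<(String, T), Vec<(String, E)>>
where
    F: FnMut(&str) -> Result<T, E>,
{
    let mut errors = Vec::new();
    for &model in models {
        match call(model) {
            // First success wins; later models are never contacted.
            Ok(value) => return Ok((model.to_string(), value)),
            // Remember the failure and move on to the next model.
            Err(err) => errors.push((model.to_string(), err)),
        }
    }
    Err(errors)
}
```

The `models` slice would be the role's model followed by the configured `fallback` entries. Returning the name of the model that answered lets the caller log when a review was served by a fallback.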

## Expected Impact

- **Cost reduction**: 60-80% for summarization/triage tasks
- **Speed**: Cheap models respond 3-5x faster
- **Quality**: Frontier models focused where they matter (review + verification)
- Greptile reports 75% lower inference costs despite 3x more context tokens, primarily through model routing + prompt caching
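
The 60-80% figure applies only to the tasks that move to the weak model, so the overall saving depends on how much of the spend those tasks represent. A back-of-the-envelope sketch, where `routed_share` and `price_ratio` are illustrative inputs rather than measured DiffScope numbers:

```rust
/// Fraction of total spend saved when `routed_share` of the current
/// frontier-model spend moves to a model that costs `price_ratio` times
/// the frontier price for the same work (0.25 means a quarter of the cost).
pub fn blended_savings(routed_share: f64, price_ratio: f64) -> f64 {
    routed_share * (1.0 - price_ratio)
}
```

For example, if triage and summarization were half of today's spend and the weak model cost a quarter as much for that work (a 75% reduction on those tasks), `blended_savings(0.5, 0.25)` gives 0.375, i.e. 37.5% off the total bill.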

## Priority

**Medium — cost optimization + quality improvement.** Becomes critical at scale.
