Implement hybrid retrieval pipeline with Snowflake Arctic embeddings + BM25 + reranker by chaman56 · Pull Request #13 · devrev/devrev-search-bench

chaman56 · 2026-03-31T18:21:20Z

Overview

Optimized hybrid retrieval pipeline for the DevRev Search challenge.

Pipeline: Dense (Snowflake Arctic Embed L v2.0) + BM25 → Weighted RRF (BM25 2x weight) → Article-sibling expansion →
BGE-reranker-v2-m3 (top-150 candidates, 768-char context)
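For readers unfamiliar with the fusion step, here is a minimal sketch of weighted reciprocal rank fusion with BM25 counted twice. The function name `weighted_rrf`, the smoothing constant `k=60` (a common default, not stated in this PR), and the toy rankings are assumptions for illustration; the notebooks may implement this differently.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of doc IDs with weighted reciprocal rank fusion.

    rankings: dict mapping retriever name -> list of doc IDs, best first.
    weights:  dict mapping retriever name -> weight (missing names default to 1.0).
    k:        smoothing constant; 60 is a conventional choice (assumption).
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        w = weights.get(name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each retriever contributes w / (k + rank) for every doc it returns.
            scores[doc_id] += w / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy example (hypothetical doc IDs): BM25 gets 2x the weight of dense.
dense = ["a", "b", "c"]
bm25 = ["c", "a", "d"]
fused = weighted_rrf({"dense": dense, "bm25": bm25}, {"dense": 1.0, "bm25": 2.0})
print(fused)
```

With equal weights the top result in this toy example would be `a`; doubling the BM25 weight moves `c`, BM25's first hit, to the top.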

Results (291 annotated queries):

R@10	P@10	R@50	P@50	MRR@10
22.27%	15.70%	32.99%	6.05%	0.4378
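As a reference for how numbers like these are commonly computed, the sketch below uses the standard per-query definitions of recall@k, precision@k, and MRR@k, which would then be averaged over all queries. This is an assumption: the thread does not state the exact definitions used by the benchmark's evaluation pipeline, and the function names are hypothetical.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant docs that appear in the top-k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k slots filled by relevant docs (denominator is k)."""
    relevant = set(relevant)
    hits = len(set(retrieved[:k]) & relevant)
    return hits / k


def mrr_at_k(retrieved, relevant, k):
    """Reciprocal rank of the first relevant doc within the top-k, else 0."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Toy query (hypothetical IDs): two of three relevant docs are retrieved.
retrieved = ["x", "a", "y", "b"]
relevant = {"a", "b", "c"}
print(recall_at_k(retrieved, relevant, 3))
print(precision_at_k(retrieved, relevant, 4))
print(mrr_at_k(retrieved, relevant, 10))
```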

Submission name: snowflake-arctic-embed-l-v2.0-bm25-bge-reranker-v2-m3

Ablation Results (291 queries)

Method	R@10	P@10	R@50	P@50	MRR@10
Dense Only	4.85%	3.92%	9.16%	2.48%	0.096
BM25 Only	14.63%	10.17%	28.38%	4.65%	0.276
Dense + BM25 (RRF)	12.56%	9.45%	27.67%	4.77%	0.233
Dense + BM25 + Reranker	20.66%	14.81%	31.23%	5.88%	0.427

Systems Used

Component	Type	Open Source
Snowflake Arctic Embed L v2.0	Dense embedding (1024d)	Yes, Apache 2.0
BM25 Okapi	Sparse retrieval	Yes, Apache 2.0
FAISS	Vector index	Yes, MIT
BGE-reranker-v2-m3	Cross-encoder reranker	Yes, Apache 2.0
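The reranking stage can be sketched independently of any model library. Below, `score_pairs` is a hypothetical callable that takes a list of `(query, passage)` pairs and returns one relevance score per pair; in the real pipeline a cross-encoder such as BGE-reranker-v2-m3 would presumably fill that role. The defaults mirror the top-150 candidate pool and 768-character context from the overview; the function name and everything else are assumptions.

```python
def rerank(query, candidates, texts, score_pairs, top_n=150, max_chars=768):
    """Rerank the first top_n candidates with a pairwise scorer.

    candidates:  doc IDs from the fusion stage, best first.
    texts:       dict mapping doc ID -> passage text.
    score_pairs: hypothetical callable, list[(query, passage)] -> list[float].
    """
    pool = candidates[:top_n]
    # Truncate each passage so the scorer sees at most max_chars characters.
    pairs = [(query, texts[doc_id][:max_chars]) for doc_id in pool]
    scores = score_pairs(pairs)
    # Stable sort: ties keep their original (fused) order.
    order = sorted(range(len(pool)), key=lambda i: -scores[i])
    reranked = [pool[i] for i in order]
    # Candidates beyond the pool are left in their original order.
    return reranked + candidates[top_n:]


def toy_overlap_scorer(pairs):
    """Stand-in scorer for the example: counts shared words."""
    return [float(len(set(q.split()) & set(p.split()))) for q, p in pairs]


texts = {
    "d1": "reset password steps",
    "d2": "billing invoice",
    "d3": "password reset for admin",
}
result = rerank("reset password", ["d2", "d1", "d3"], texts, toy_overlap_scorer)
print(result)
```

Keeping the scorer injectable makes the stage easy to test without downloading a model, and lets the candidate pool size and context length be tuned separately from the model choice.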

work-item: ISS-1

- snowflake_embedding.py: Embedding pipeline using snowflake-arctic-embed-l-v2.0
- hybrid_retrieval.ipynb: Dense + BM25 + RRF + reranker with ablation study
- optimized_retrieval.ipynb: Weighted RRF, sibling expansion, tuned params
- ablation_metrics.json: Evaluation results on 291 annotated queries
- Updated .gitignore for large binary files
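The PR does not spell out how sibling expansion works. One plausible reading, sketched here as an assumption, is that for each retrieved chunk, other chunks from the same article are added to the candidate pool before reranking. The mappings `article_of` and `chunks_in`, the `max_siblings` cap, and the function name are all hypothetical.

```python
def expand_with_siblings(ranked_ids, article_of, chunks_in, max_siblings=2):
    """Append unseen chunks from the same article as each retrieved chunk.

    ranked_ids:   chunk IDs from the fusion stage, best first.
    article_of:   dict mapping chunk ID -> article ID (hypothetical).
    chunks_in:    dict mapping article ID -> list of its chunk IDs (hypothetical).
    max_siblings: cap on siblings added per retrieved chunk (assumption).
    """
    seen = set(ranked_ids)
    expanded = list(ranked_ids)
    for chunk_id in ranked_ids:
        added = 0
        for sibling in chunks_in[article_of[chunk_id]]:
            if added >= max_siblings:
                break
            if sibling not in seen:
                seen.add(sibling)
                # Siblings go to the end so the fused order is preserved.
                expanded.append(sibling)
                added += 1
    return expanded


# Toy corpus: article A has three chunks, article B has one.
article_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B"}
chunks_in = {"A": ["a1", "a2", "a3"], "B": ["b1"]}
expanded = expand_with_siblings(["a2", "b1"], article_of, chunks_in)
print(expanded)
```

Appending siblings after the original candidates leaves the final ordering to the reranker rather than to the expansion step.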

chaman56 · 2026-03-31T18:42:32Z

@nimit2801 @prakhar7651, I've made the submission. Please help with the evaluation; the file to be evaluated is test_queries_results.json.

prakhar7651 · 2026-04-01T17:33:07Z

Hey!
There were some errors in the evaluation pipeline. These are your scores.
Recall: 0.2513
Precision: 0.2000

prakhar7651 · 2026-04-01T17:46:23Z

Given the quality of the submissions and how eager folks are to contribute, we're extending the deadline to April 7th. Evaluations will continue in the meantime. Please keep contributing.

chaman56 self-assigned this Mar 31, 2026

chaman56 requested a review from nimit2801 March 31, 2026 18:34

chaman56 wants to merge 1 commit into devrev:main from chaman56:main


2 participants
