
DevRev Search Challenge Submission #4

Open
shrey2003 wants to merge 9 commits into devrev:main from shrey2003:main

Conversation


@shrey2003 shrey2003 commented Mar 11, 2026

I built a custom hybrid retrieval pipeline using SOTA models rather than the baseline FAISS approach.

System Details:
System Description: Hybrid Search pipeline combining dense embeddings (VoyageAI voyage-4-large 2048-dim) via Alibaba's Zvec database, with sparse lexical retrieval (BM25), fused together and passed through a cross-encoder reranker (VoyageAI Rerank-2.5).
System Type: Hybrid / RAG Retriever
Open Source: Not open source (rerankers and the embedding models are closed source, vector db and sparse retrieval is open source)

Looking forward to seeing the results on the leaderboard!

https://app.devrev.ai/devrev/works/ISS-269621/

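The description above says the dense and BM25 rankings are "fused together" but doesn't name the fusion method. A minimal sketch of one common choice, reciprocal rank fusion (RRF); the doc IDs and the k=60 constant are illustrative assumptions, not details from this submission:

```python
# Reciprocal rank fusion (RRF): merge several ranked lists of doc IDs
# into one list by summing 1 / (k + rank) per list (rank is 1-based).
# k=60 is the constant from the original RRF paper, not this pipeline's.
def rrf_fuse(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; this list would then go to the reranker.
    return sorted(scores, key=scores.get, reverse=True)

dense_top = ["d3", "d1", "d7", "d2"]  # hypothetical dense-retrieval ranking
bm25_top = ["d1", "d9", "d3", "d4"]   # hypothetical BM25 ranking
fused = rrf_fuse([dense_top, bm25_top])
print(fused[:3])  # → ['d1', 'd3', 'd9']
```

Documents ranked highly by both retrievers (d1, d3) float to the top, which is the behaviour that makes a fused candidate list a good input for a cross-encoder reranker.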
@shrey2003 (Author)

@nimit2801 @prakhar7651 The description validation workflow is failing, and I can't find a suitable template in the README. Do I have to follow an official template for the submission?

@nimit2801 (Collaborator)

hey @shrey2003

Kindly add this in your PR description: https://app.devrev.ai/devrev/works/ISS-269621

shrey2003 changed the title to DevRev Search Challenge Submission (Mar 12, 2026)
@prakhar7651 (Contributor)

Hey Shreya!
Thanks for the submission, your results look promising on our benchmarks!
Can you confirm your topK is set to something greater than 50? We want to evaluate recall@k for various k values.

@shrey2003 (Author)

Hey @prakhar7651 thanks for your evaluation!
Top k is currently set to 10 for generating the test query results. Let me know if you have any further questions.
(screenshot attached)

@shrey2003 (Author)

shrey2003 commented Mar 26, 2026

As per the instructions this json contains my results: test_queries_results.json @prakhar7651 @nimit2801

@prakhar7651 (Contributor)

Hey!
These are your scores.
Recall@10: 0.2771
Precision@10: 0.3076
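For context on these numbers, recall@k and precision@k are typically computed per query and then averaged; a minimal sketch with hypothetical toy data (the actual evaluation script isn't shown in this thread). Note that precision@k divides by a fixed k, and evaluating recall@k for several k values requires at least max(k) results per query, which is why topK needs to exceed the largest k evaluated:

```python
def recall_precision_at_k(results, relevant, k=10):
    """Mean recall@k and precision@k over queries.

    results:  {query_id: ranked list of retrieved doc IDs}
    relevant: {query_id: set of relevant (gold) doc IDs}
    """
    recalls, precisions = [], []
    for qid, ranked in results.items():
        rel = relevant[qid]
        hits = sum(1 for doc in ranked[:k] if doc in rel)
        recalls.append(hits / len(rel) if rel else 0.0)
        precisions.append(hits / k)
    n = len(results)
    return sum(recalls) / n, sum(precisions) / n

# Hypothetical toy data; k=2 for brevity.
results = {"q1": ["d1", "d2", "d3"], "q2": ["d9", "d4"]}
relevant = {"q1": {"d1", "d5"}, "q2": {"d4"}}
r, p = recall_precision_at_k(results, relevant, k=2)
print(r, p)  # → 0.75 0.5
```

Recall and precision at the same k can diverge like this whenever queries have more (or fewer) relevant documents than k.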

@shrey2003 (Author)

@prakhar7651 Thanks for the evaluation. I've improved my script and rerun the results; I found a few bugs that were affecting them. Can I resubmit after improving?

@prakhar7651 (Contributor)

Yes, you can. Let me know when you're done and tell me which file to evaluate.

@shrey2003 (Author)

test_queries_results_new.json @prakhar7651 this is the corrected file. Please evaluate it. Thanks!

@prakhar7651 (Contributor)

In this commit - 41725c563d3f5690747ed7d57d88a579405efd57, you didn't make any changes in the notebook and submitted a new file. Is this expected?

@shrey2003 (Author)

shrey2003 commented Mar 31, 2026

@prakhar7651 Yes, I ran the results with a corrected Python script. I'll update the notebook if the results turn out better, which is why I didn't upload it yet. Can you check how the new result is performing?

@shrey2003 (Author)

shrey2003 commented Mar 31, 2026

@prakhar7651 Can you evaluate this now? I think today is the last day for submission.

@shrey2003 (Author)

@nimit2801

@prateekjain2606

@shrey2003 can you please update your code before we release the results for the latest submission, so we can verify the code and ensure the results are reproducible?

@shrey2003 (Author)

@prateekjain2606 I have already pushed the run_submission.py file, please check.

@shrey2003 (Author)

shrey2003 commented Apr 1, 2026

@prakhar7651 Can you evaluate this too? I have already added my submission file and code here. If you can, please tell me the scores of both files; I just want to check whether there is any improvement:
test_queries_results_new.json (this is with my updated method) and test_queries_results.json

@prakhar7651 (Contributor)

Here are your updated scores,

Recall: 0.4497
Precision: 0.2935

The scores we previously posted for your old submission were incorrect (an error on our side); these are the true values for your old submission:

Recall: 0.1822
Precision: 0.1315

@prakhar7651 (Contributor)

Looking at the quality of submissions and folks' eagerness to contribute, we're extending the deadline to April 7th. Evaluations will still be ongoing. Please keep contributing.

@shrey2003 (Author)

shrey2003 commented Apr 1, 2026

@prakhar7651 Now this looks good! I was quite surprised earlier to see it perform so badly, since across almost every benchmark and my own tests, Voyage has the best reranker and embedding models out there; no one can compete with them. There was an issue in my earlier code. I'll try more combinations and see if I can improve the scores further.
