Semantic code search using local embeddings. Find functions, classes, and methods by describing what they do in natural language.
code_search enables semantic search over codebases by:
- Indexing code signatures (functions, classes, methods) using the code_skim AST parser
- Generating embeddings locally (currently using the all-MiniLM-L6-v2 model via hugot)
- Storing vectors in chromem-go with persistence
- Searching by natural language query
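To make the search step concrete, here is a minimal sketch of ranking by vector similarity. It assumes cosine similarity is the metric and uses toy 3-dimensional vectors in place of real embeddings; the function names and values are hypothetical and are not taken from the tool's source.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank(query_vec, items):
    """Return (name, similarity) pairs, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
toy_index = {
    "MakeRequest": [0.9, 0.1, 0.0],
    "hashPassword": [0.0, 0.2, 0.9],
    "parseConfig": [0.1, 0.9, 0.1],
}
toy_query = [1.0, 0.0, 0.0]
ranked = rank(toy_query, toy_index)
```

In this toy data the query vector points in nearly the same direction as the MakeRequest vector, so that entry ranks first.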
EXPERIMENTAL - This tool is in early development and may have limitations or bugs.
🔒 Disabled by default - Enable with ENABLE_ADDITIONAL_TOOLS=code_search
On first use, the tool downloads:
- Embedding model (~90MB) from Hugging Face to ~/.mcp-devtools/models/
The binary size increase is minimal (~6MB) as heavy dependencies are downloaded on demand.
Index a codebase for semantic search. May take a few minutes for large codebases depending on your hardware.
When indexing starts, a notifications/message notification is sent to the client to inform the user that indexing is in progress.
{
"action": "index",
"source": ["/path/to/project"]
}
Response:
{
"indexed_files": 298,
"indexed_items": 1217
}
Find code by natural language description.
{
"action": "search",
"query": "function that handles HTTP requests"
}
Response:
{
"results": [
{
"path": "/project/internal/oauth/validation/validator.go",
"name": "isLocalhostRequest",
"type": "function",
"signature": "func isLocalhostRequest(r *http.Request) bool",
"similarity": 0.47,
"line": 272
},
{
"path": "/project/internal/oauth/server/server.go",
"name": "RequireScope",
"type": "function",
"signature": "func RequireScope(scope string) func(http.Handler) http.Handler",
"similarity": 0.46,
"line": 232
},
{
"path": "/project/internal/tools/packageversions/utils.go",
"name": "MakeRequest",
"type": "function",
"signature": "func MakeRequest(client HTTPClient, method, url string, headers map[string]string) ([]byte, error)",
"similarity": 0.44,
"line": 99
}
],
"last_indexed": "2025-12-17 06:30"
}
When results are truncated:
{
"results": [...],
"total_matches": 43,
"limit_applied": 10,
"last_indexed": "2025-12-17 06:30"
}
Check index status.
{
"action": "status"
}
Response:
{
"indexed": true,
"total_files": 251,
"total_items": 1217,
"model_loaded": true,
"runtime_loaded": true,
"runtime_version": "gomlx"
}
Clear the index (optionally for specific paths).
{
"action": "clear",
"source": ["/path/to/project"]
}
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | Yes | - | One of: search, index, status, clear |
| source | array | For index | - | Paths to index or filter (recursively walks directories) |
| query | string | For search | - | Natural language search query |
| limit | number | No | 10 | Maximum results to return |
| threshold | number | No | 0.3 | Minimum similarity score (0-1) |
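The interaction between limit and threshold can be illustrated with a short sketch. It assumes results at or above the threshold are kept, sorted best-first, and then capped at the limit, with total_matches and limit_applied added only when the cap removes something. The function name is hypothetical, and the inclusive comparison is an assumption.

```python
def apply_threshold_and_limit(scored, limit=10, threshold=0.3):
    """Filter by minimum similarity, sort best-first, then cap at `limit`."""
    # Assumption: a score exactly equal to the threshold is kept.
    matches = [item for item in scored if item["similarity"] >= threshold]
    matches.sort(key=lambda item: item["similarity"], reverse=True)
    response = {"results": matches[:limit]}
    if len(matches) > limit:
        # Mirrors the fields of the truncated response example.
        response["total_matches"] = len(matches)
        response["limit_applied"] = limit
    return response


scored = [
    {"name": "isLocalhostRequest", "similarity": 0.47},
    {"name": "RequireScope", "similarity": 0.46},
    {"name": "MakeRequest", "similarity": 0.44},
    {"name": "unrelatedHelper", "similarity": 0.12},
]
full = apply_threshold_and_limit(scored)
truncated = apply_threshold_and_limit(scored, limit=2)
```

With the defaults, unrelatedHelper falls below 0.3 and is dropped. With limit=2, the response also reports that three matches existed.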
The tool matches natural language descriptions to function signatures semantically:
| Query | Typical matches |
|---|---|
| "function that handles HTTP requests" | MakeRequest, isLocalhostRequest, HTTP middleware |
| "parse JSON configuration" | Config parsers, JSON unmarshalers |
| "validate user input" | Input validators, sanitisers |
| "retry with backoff" | Retry helpers, exponential backoff implementations |
| "create database connection" | DB initialisers, connection pool factories |
| "hash password" | Password hashers, bcrypt wrappers |
| "send email notification" | Email senders, notification handlers |
Tips for effective queries:
- Describe what the code does, not what it's called
- Include domain terms (e.g., "HTTP", "database", "authentication")
- Be specific about the operation (e.g., "validate" vs "process")
Indexes the same languages as code_skim:
- Go, Python, JavaScript/TypeScript, Rust, Java, Swift, C/C++
- Index location: ~/.mcp-devtools/embeddings/
- Model location: ~/.mcp-devtools/models/
- File tracking: ~/.mcp-devtools/embeddings/file_tracker.json
- Incremental indexing: skips already-indexed files
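As a rough illustration of incremental indexing, the sketch below skips any path already recorded in a tracker file. The tracker layout used here (a JSON object mapping each path to the time it was indexed) is purely an assumption for the example; the actual format of file_tracker.json is not documented here, and the function names are hypothetical.

```python
import json
import os
import tempfile
import time


def load_tracker(tracker_path):
    """Load the tracker, or start empty if it does not exist yet."""
    if not os.path.exists(tracker_path):
        return {}
    with open(tracker_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def index_incrementally(paths, tracker_path):
    """Index only paths missing from the tracker; return the newly indexed ones."""
    tracker = load_tracker(tracker_path)
    newly_indexed = []
    for path in paths:
        if path in tracker:
            continue  # already indexed, so skip it
        # A real indexer would parse and embed the file at this point.
        tracker[path] = time.time()
        newly_indexed.append(path)
    with open(tracker_path, "w", encoding="utf-8") as handle:
        json.dump(tracker, handle)
    return newly_indexed


# Demo against a throwaway tracker file (not the real one).
with tempfile.TemporaryDirectory() as workdir:
    demo_tracker = os.path.join(workdir, "file_tracker.json")
    first_run = index_incrementally(["a.go", "b.go"], demo_tracker)
    second_run = index_incrementally(["a.go", "b.go", "c.go"], demo_tracker)
```

The second run only processes c.go, because a.go and b.go are already recorded in the tracker.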
By default, the index is not automatically updated when files change. To enable automatic reindexing of modified files before each search, set the CODE_SEARCH_STALE_THRESHOLD environment variable:
CODE_SEARCH_STALE_THRESHOLD=30s # Reindex files modified more than 30 seconds ago
CODE_SEARCH_STALE_THRESHOLD=1m # Reindex files modified more than 1 minute ago
CODE_SEARCH_STALE_THRESHOLD=5m # Reindex files modified more than 5 minutes ago
How it works:
- Before each search, checks if any indexed files have been modified since indexing
- If a file was modified AND the modification is older than the threshold, it's reindexed
- The threshold prevents reindexing files during active editing sessions
- Reindexing happens transparently - the search response is unchanged
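The staleness rule above can be sketched as a small predicate. This is an illustration under assumptions, not the tool's implementation: the helper names are hypothetical, timestamps are plain seconds, and the duration parser handles only single-unit values such as 30s, 1m, or 5m.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text):
    """Parse a single-unit duration such as '30s', '1m', or '5m' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def needs_reindex(file_mtime, indexed_at, now, threshold_seconds):
    """True when the file changed after indexing and the change has settled."""
    modified_since_index = file_mtime > indexed_at
    # Files edited very recently are left alone until the threshold passes.
    settled = (now - file_mtime) >= threshold_seconds
    return modified_since_index and settled


threshold = parse_duration("30s")
stale = needs_reindex(file_mtime=100, indexed_at=50, now=200, threshold_seconds=threshold)
still_editing = needs_reindex(file_mtime=190, indexed_at=50, now=200, threshold_seconds=threshold)
```

A file changed 100 seconds ago qualifies for reindexing, while one changed 10 seconds ago is treated as still being edited and is skipped for now.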
When disabled (default): No automatic reindexing occurs. Use clear + index to manually refresh the index.
# 1. Index your project
{"action": "index", "source": ["/path/to/myproject"]}
# 2. Search for relevant code
{"action": "search", "query": "parse JSON configuration file"}
# 3. Check what's indexed
{"action": "status"}
# 4. Re-index after changes (clears and rebuilds)
{"action": "clear"}
{"action": "index", "source": ["/path/to/myproject"]}
Well suited to:
- Discovering unfamiliar codebases
- Finding code when you know what it does but not what it's called
- Exploring implementations of concepts (e.g., "retry with exponential backoff")
Less well suited to:
- Exact name lookups: use grep or glob instead
- Finding specific syntax: use regex search
- Small codebases where manual exploration is faster
Requires CGO and is available on:
- macOS (darwin)
- Linux (amd64)