refactor: migrate chain-index from QMDB to SQLite (#10)
Conversation
Replace PersistentChainIndex's QMDB KV storage with a SQLite database.
Key changes:
- Remove evolve_storage dependency from chain-index crate; add rusqlite
with bundled feature to workspace and chain-index Cargo.toml
- Rewrite PersistentChainIndex in src/index.rs:
- No longer generic over S: Storage; owns a Mutex<Connection>
- Constructor: PersistentChainIndex::new(db_path) for on-disk
- Constructor: PersistentChainIndex::in_memory() for tests
- Schema: proper relational tables (blocks, transactions, receipts,
logs, metadata) with indexes; WAL mode for concurrency
- store_block uses a single SQLite transaction for atomicity
- Read methods use SQL queries instead of KV lookups
- Re-indexing same block number correctly deletes old rows first
- Eliminate TX_HASH_CHUNK_SIZE / LOG_CHUNK_SIZE chunking workaround
- Add Sqlite(String) variant + From<rusqlite::Error> to error.rs
- Update tests to use PersistentChainIndex::in_memory() instead of
MockStorage; add tempfile dev-dependency for persistence test
- Update node/src/lib.rs (2 sites): derive chain-index.sqlite path from
data_dir; change type annotations from PersistentChainIndex<S> to
PersistentChainIndex
- Update bin/evd/src/main.rs (1 site): same pattern
- All 13 chain-index tests pass; `just quality` (fmt + clippy) passes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
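The single-transaction store with delete-then-reinsert described above can be sketched in Python's stdlib sqlite3 (a stand-in for the Rust/rusqlite code; the table and column names here are illustrative, not the PR's exact schema):

```python
import sqlite3

def store_block(conn, number, block_hash, tx_hashes):
    """Replace any previously indexed rows for this height atomically."""
    with conn:  # one SQLite transaction: commits on success, rolls back on error
        # Re-indexing the same block number deletes the old child rows first.
        conn.execute("DELETE FROM transactions WHERE block_number = ?", (number,))
        conn.execute(
            "INSERT OR REPLACE INTO blocks (number, hash) VALUES (?, ?)",
            (number, block_hash),
        )
        conn.executemany(
            "INSERT INTO transactions (hash, block_number, tx_index) VALUES (?, ?, ?)",
            [(h, number, i) for i, h in enumerate(tx_hashes)],
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blocks (number INTEGER PRIMARY KEY, hash BLOB NOT NULL);
    CREATE TABLE transactions (hash BLOB PRIMARY KEY, block_number INTEGER, tx_index INTEGER);
    CREATE INDEX idx_tx_block ON transactions(block_number);
""")

store_block(conn, 7, b"\xaa" * 32, [b"\x01" * 32, b"\x02" * 32])
store_block(conn, 7, b"\xbb" * 32, [b"\x03" * 32])  # re-index same height
```

Because the delete and inserts share one transaction, a crash mid-store never leaves a block with a mix of old and new rows.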
📝 Walkthrough

Replaces the generic Storage-backed chain index with a SQLite-backed PersistentChainIndex, adds rusqlite (v0.38) and related dependencies, updates initialization paths to use a file-based DB (data_dir/chain-index.sqlite), and makes the chain index optional where not required. New SQLite error mapping added.
Sequence Diagram

```mermaid
sequenceDiagram
    participant App as Application / Node
    participant RPC as RPC Server
    participant Chain as PersistentChainIndex
    participant DB as SQLite DB
    App->>Chain: new(db_path) / in_memory()
    activate Chain
    Chain->>DB: open writer conn & setup read pool
    Chain->>DB: init_schema() (if empty)
    Chain->>Chain: build cache from metadata
    deactivate Chain
    App->>RPC: start (requires chain index)
    RPC->>Chain: request state / get_block(...)
    activate Chain
    Chain->>Chain: check cache
    alt cache miss
        Chain->>DB: SELECT blocks/txs/receipts/logs
        DB-->>Chain: rows
        Chain->>Chain: map rows -> Stored structs
    end
    Chain-->>RPC: response
    deactivate Chain
    App->>Chain: store_block(block) [on execution]
    activate Chain
    Chain->>DB: BEGIN
    Chain->>DB: INSERT/UPDATE blocks, transactions, receipts, logs
    Chain->>DB: COMMIT
    Chain->>Chain: update cache
    deactivate Chain
```
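The cache-miss path in the diagram is a read-through cache in front of the SQL queries. A minimal sketch of that pattern, again in Python sqlite3 as a stand-in (the class shape and names are hypothetical, not the PR's API):

```python
import sqlite3

class ChainIndex:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE blocks (number INTEGER PRIMARY KEY, hash BLOB)")
        self.cache = {}  # number -> hash; filled on store and on read

    def store_block(self, number, block_hash):
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO blocks (number, hash) VALUES (?, ?)",
                (number, block_hash),
            )
        self.cache[number] = block_hash  # update cache only after commit

    def get_block(self, number):
        if number in self.cache:  # warm path: no SQL at all
            return self.cache[number]
        row = self.conn.execute(
            "SELECT hash FROM blocks WHERE number = ?", (number,)
        ).fetchone()
        if row is None:
            return None
        self.cache[number] = row[0]  # fill cache on miss
        return row[0]

idx = ChainIndex()
idx.store_block(1, b"\x11" * 32)
idx.cache.clear()  # simulate eviction -> next read takes the cold DB path
```

Updating the cache only after the commit keeps the warm path from ever serving data the DB could lose on rollback.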
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Code Review
The migration of chain-index from QMDB to SQLite is a significant improvement, providing a proper relational schema and eliminating the need for manual value chunking. The new implementation is cleaner and leverages SQLite's native capabilities for indexing and WAL mode. I have identified a few missing indexes that could lead to performance issues as the chain grows, and I recommend using standard transactions instead of unchecked_transaction for better safety and error reporting.
```sql
    FOREIGN KEY (block_number) REFERENCES blocks(number)
);
CREATE INDEX IF NOT EXISTS idx_logs_block ON logs(block_number);
CREATE INDEX IF NOT EXISTS idx_logs_address ON logs(address);
```
The logs table is missing an index on transaction_hash. This will cause load_logs_for_tx (used by get_receipt) to perform a full table scan, which will become increasingly slow as the database grows.
Suggested change:

```diff
 CREATE INDEX IF NOT EXISTS idx_logs_address ON logs(address);
+CREATE INDEX IF NOT EXISTS idx_logs_tx ON logs(transaction_hash);
```
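The full-scan claim is easy to confirm with SQLite's EXPLAIN QUERY PLAN, shown here via Python's stdlib sqlite3 (the index name matches the suggestion; the table is a minimal stand-in for the PR's logs schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (transaction_hash BLOB, block_number INTEGER, address BLOB)")

def plan(sql, args):
    # The fourth column of EXPLAIN QUERY PLAN output describes the access path.
    return conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchone()[3]

query = "SELECT * FROM logs WHERE transaction_hash = ?"
before = plan(query, (b"\x00" * 32,))  # no index yet -> full table scan

conn.execute("CREATE INDEX idx_logs_tx ON logs(transaction_hash)")
after = plan(query, (b"\x00" * 32,))   # now an index search
```

The `before` plan reports a SCAN of logs; `after` reports a SEARCH using idx_logs_tx, which is exactly the difference the review comment is worried about as the table grows.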
```sql
    status INTEGER NOT NULL,
    tx_type INTEGER NOT NULL,
    FOREIGN KEY (block_number) REFERENCES blocks(number)
);
```
The receipts table is missing an index on block_number. This will cause the DELETE FROM receipts WHERE block_number = ? operation in store_block to perform a full table scan.
Suggested change:

```diff
     FOREIGN KEY (block_number) REFERENCES blocks(number)
 );
+CREATE INDEX IF NOT EXISTS idx_receipts_block ON receipts(block_number);
```
crates/rpc/chain-index/src/index.rs (outdated)

```rust
    })?,
});
let conn = self.connection.lock().unwrap();
let tx = conn.unchecked_transaction()?;
```
Using unchecked_transaction() is generally discouraged unless you have a specific performance reason to skip the autocommit check. Since the connection is already protected by a Mutex, using the standard transaction() method is safer and provides better error reporting if a transaction is unexpectedly already active.
Suggested change:

```diff
-let tx = conn.unchecked_transaction()?;
+let mut conn = self.connection.lock().unwrap();
+let tx = conn.transaction()?;
```
crates/rpc/chain-index/src/index.rs (outdated)

```diff
-for receipt in &receipts {
-    ops.push(evolve_storage::Operation::Set {
-        key: keys::tx_receipt_key(&receipt.transaction_hash),
-        value: serde_json::to_vec(receipt)?,
-    });
-}
-
-// Store logs by block in chunked format to avoid oversize values.
-if !all_logs.is_empty() {
-    let log_chunks: Vec<Vec<StoredLog>> = all_logs
-        .chunks(LOG_CHUNK_SIZE)
-        .map(|c| c.to_vec())
-        .collect();
-    ops.push(evolve_storage::Operation::Set {
-        key: keys::logs_by_block_chunk_count_key(block_number),
-        value: (log_chunks.len() as u32).to_be_bytes().to_vec(),
-    });
-    for (idx, chunk) in log_chunks.iter().enumerate() {
-        ops.push(evolve_storage::Operation::Set {
-            key: keys::logs_by_block_chunk_key(block_number, idx as u32),
-            value: serde_json::to_vec(chunk)?,
-        });
+insert_receipt(&tx, receipt)?;
```
For consistency with how transactions are stored, receipts should also use the loop index as the authoritative transaction_index. Currently, insert_transaction uses the loop index while insert_receipt uses the field from the struct, which could lead to inconsistencies if the input vectors are not perfectly aligned or if the struct field is incorrect.
```rust
// Insert receipts and logs
for (idx, receipt) in receipts.iter().enumerate() {
    insert_receipt(&tx, receipt, idx as i64)?;
```
crates/rpc/chain-index/src/index.rs (outdated)

```rust
    data: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
    commit_count: RwLock<u64>,
}

fn insert_receipt(tx: &rusqlite::Transaction<'_>, receipt: &StoredReceipt) -> ChainIndexResult<()> {
```
crates/rpc/chain-index/src/index.rs (outdated)

```rust
    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    params![
        receipt.transaction_hash.as_slice(),
        receipt.transaction_index as i64,
```
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bin/evd/src/main.rs (1)
17-18: ⚠️ Potential issue | 🟡 Minor: Stale architecture diagram still references "QMDB Storage" for the chain index.

The ASCII diagram shows the chain index backed by "QMDB Storage", but it now uses a SQLite file (chain-index.sqlite). QMDB is only the STF storage.

📝 Suggested update

```diff
-//! │ │  QMDB  │ │
-//! │ │Storage │ │
+//! │ │ SQLite │ │
+//! │ │ Index  │ │
```
🧹 Nitpick comments (5)
crates/rpc/chain-index/src/index.rs (4)
481-483: unchecked_transaction: add a safety comment explaining the choice.

Without context, this looks like a shortcut. The reason is that MutexGuard<Connection> only gives &Connection, making the normal transaction() (which requires &mut Connection) inaccessible; unchecked_transaction is the correct workaround and is safe here because conn and tx share the same stack frame.

📝 Suggested comment

```diff
-let tx = conn.unchecked_transaction()?;
+// `unchecked_transaction` is required here because `conn` is a `MutexGuard<Connection>`
+// which only provides `&Connection`. The safety invariant (transaction does not outlive
+// the connection) is upheld because both `conn` and `tx` live in this stack frame.
+let tx = conn.unchecked_transaction()?;
```
104-181: Missing index on receipts.block_number: full-table scan during re-indexing.

DELETE FROM receipts WHERE block_number = ? (called in store_block on every re-indexed block) has no index to cover the block_number predicate. For large receipt tables this degrades to a full scan. All other child tables (transactions, logs) have idx_tx_block / idx_logs_block covering this exact pattern.

⚡ Add the missing index

```diff
 CREATE TABLE IF NOT EXISTS receipts (
     ...
+    FOREIGN KEY (block_number) REFERENCES blocks(number)
 );
+CREATE INDEX IF NOT EXISTS idx_receipts_block ON receipts(block_number);
```
104-105: lock().unwrap() propagates mutex-poison panics everywhere.

std::sync::Mutex poisons on thread panic. Any panic while the lock is held (e.g., from a B256::from_slice panic identified above) renders every subsequent lock().unwrap() call a panic too, making the chain index permanently unusable without a node restart. parking_lot::Mutex is already a direct dependency and never poisons; replacing the one import is the simplest fix.

♻️ Swap to parking_lot::Mutex

```diff
-use std::sync::Mutex;
+use parking_lot::Mutex;
```

parking_lot::Mutex::lock() returns the guard directly (no Result), so all call sites simply become:

```diff
-let conn = self.connection.lock().unwrap();
+let conn = self.connection.lock();
```

Also applies to: 184, 314, 351, 366, 386, 406, 428, 452, 481
365-379: get_block_transactions is the only read method that never consults the cache.

All other hot-path readers (get_block, get_transaction, get_receipt) return cached results; get_block_transactions always acquires the mutex and hits SQLite, even for blocks that were just stored and are fully in-memory. Consider checking the block cache (or a dedicated tx-hashes cache entry) before falling through to the DB.

Cargo.toml (1)
95: Operational note: the bundled feature increases compile time and binary size.

bundled causes an embedded, up-to-date SQLite to be compiled from source, which avoids system-library drift; the rusqlite docs call it "the right choice for most programs that control their own SQLite databases", while also noting it's not ideal for generic libraries. For this application binary that's the correct call, but CI/CD cold-build times will increase noticeably since it compiles the full SQLite C library.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/evd/src/main.rs`:
- Around line 219-228: The PersistentChainIndex::new(...) call and chain_index
initialization should be performed only when the chain index is needed (i.e.,
when config.rpc.enabled or enable_block_indexing is true) to avoid creating the
SQLite file unnecessarily; move the construction and the
chain_index.initialize() call inside the guard that checks config.rpc.enabled ||
enable_block_indexing, make chain_index an Option<Arc<PersistentChainIndex>> (or
otherwise conditionally create it) so downstream code accesses it via if let
Some(chain_index) = chain_index { ... }, and remove the unconditional creation
at the top so the DB file is only created when RPC or block indexing is enabled.
In `@Cargo.toml`:
- Line 95: The Cargo.toml currently pins rusqlite = "0.32" which jumps several
breaking releases; either keep the existing 0.32.x pin or upgrade to 0.38.0
intentionally and then update code to accommodate breaking changes: audit all
uses of rusqlite::Connection::{execute, prepare} to split trailing/multiple
statements, adjust any code relying on u64/usize ToSql/FromSql (enable the
fallible_uint feature if you want automatic impls), remove or replace any use of
the removed release_memory feature, and update hook registration sites to pass
owned Connection where required; also add or adjust Cargo feature flags (e.g.,
fallible_uint) in the rusqlite dependency specification as needed.
In `@crates/rpc/chain-index/src/index.rs`:
- Around line 197-298: The row mappers (row_to_stored_block,
row_to_stored_transaction, row_to_stored_receipt) call B256::from_slice and
Address::from_slice unconditionally and will panic on wrong-length blobs; fix by
validating byte vector lengths before calling those constructors (e.g., ensure
32 bytes for B256 fields and 20 bytes for Address fields), and if a length is
invalid return a rusqlite::Result::Err with a descriptive rusqlite::Error
(propagating a recoverable error instead of panicking) so invalid DB rows are
handled safely.
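The length-validation concern above applies to any fixed-width field read back from a BLOB column. A Python stand-in for a b256_from_row-style helper (the helper name and table are illustrative, mirroring the review's suggestion rather than the PR's actual code):

```python
import sqlite3

def b256_from_blob(value):
    """Return a 32-byte hash, raising a recoverable error instead of panicking."""
    if not isinstance(value, bytes) or len(value) != 32:
        raise ValueError(f"expected 32-byte hash BLOB, got {value!r}")
    return value

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocks (number INTEGER PRIMARY KEY, hash BLOB)")
# Row 1 holds a well-formed 32-byte hash; row 2 simulates a corrupt 5-byte BLOB.
conn.execute("INSERT INTO blocks VALUES (1, ?), (2, ?)", (b"\xab" * 32, b"\xab" * 5))

ok = b256_from_blob(conn.execute("SELECT hash FROM blocks WHERE number = 1").fetchone()[0])
try:
    b256_from_blob(conn.execute("SELECT hash FROM blocks WHERE number = 2").fetchone()[0])
    corrupt_rejected = False
except ValueError:
    corrupt_rejected = True
```

The point is the control flow: a corrupt row surfaces as an error the caller can handle, rather than an unconditional-constructor panic that would poison a shared lock.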
- Add missing indexes: idx_logs_tx on logs(transaction_hash), idx_receipts_block on receipts(block_number)
- Use transaction() instead of unchecked_transaction() for safer transaction handling
- Pass array_index to insert_receipt for consistent transaction_index with insert_transaction
- Add length-validated b256_from_row/address_from_row helpers to prevent panics on corrupt BLOB data
- Conditionally create chain index only when RPC or block indexing is enabled
- Upgrade rusqlite from 0.32 to 0.38

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The single Mutex serialized all reads behind the write lock, defeating SQLite WAL mode's concurrent reader capability. Split into:
- r2d2 read pool (4 connections): concurrent RPC queries
- Dedicated writer Mutex<Connection>: serialized store_block

In-memory tests use a SQLite shared-cache URI so the reader pool and writer see the same data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
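The shared-cache in-memory arrangement can be exercised directly. This sketch uses Python's stdlib sqlite3 rather than rusqlite/r2d2 (the URI database name is arbitrary): two independent connections to a `file:...?mode=memory&cache=shared` URI see one database, so a reader connection observes the writer's commits.

```python
import sqlite3

uri = "file:chain_index_test?mode=memory&cache=shared"

# "Writer" connection: must stay open, or the shared in-memory DB is dropped.
writer = sqlite3.connect(uri, uri=True)
writer.execute("CREATE TABLE blocks (number INTEGER PRIMARY KEY, hash BLOB)")
with writer:  # commit the insert so other connections can see it
    writer.execute("INSERT INTO blocks VALUES (1, ?)", (b"\xcd" * 32,))

# Independent "reader pool" connection to the same URI sees the same data,
# unlike two plain ":memory:" connections, which would each get a private DB.
reader = sqlite3.connect(uri, uri=True)
row = reader.execute("SELECT hash FROM blocks WHERE number = 1").fetchone()
```

This is why the tests switch to the shared-cache URI: plain `:memory:` would give the read pool an empty, unrelated database.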
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/rpc/chain-index/src/index.rs (1)
584-611: ⚠️ Potential issue | 🟡 Minor: Cache and DB can disagree on transaction_index when the input field differs from array position.

insert_transaction and insert_receipt correctly store array_index as transaction_index in the DB (the intended authoritative value), but self.cache.insert_block_with_data(block, transactions, receipts) stores the original structs, which carry whatever transaction_index was set on them before the call.

A caller that hits the cache (warm) observes transaction_index = original field value; after a cache eviction (cold) the DB-loaded struct contains transaction_index = array position. If build_index_data ever produces transactions/receipts where those two values differ, get_transaction(..).transaction_index becomes time-dependent. get_transaction_location always queries the DB and is immune, but get_transaction / get_receipt are not.

Consider normalizing the transaction_index field in the structs before inserting them into the cache:

♻️ Proposed fix

```diff
+// Normalise transaction_index to array position before caching, matching what
+// the DB persists via insert_transaction / insert_receipt.
+let transactions: Vec<StoredTransaction> = transactions
+    .into_iter()
+    .enumerate()
+    .map(|(i, mut tx)| { tx.transaction_index = i as u32; tx })
+    .collect();
+let receipts: Vec<StoredReceipt> = receipts
+    .into_iter()
+    .enumerate()
+    .map(|(i, mut r)| { r.transaction_index = i as u32; r })
+    .collect();
+
 tx.commit()?;
-self.cache.insert_block_with_data(block, transactions, receipts);
+self.cache.insert_block_with_data(block, transactions, receipts);
```
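The proposed normalization is just an enumerate-and-overwrite before caching. A small Python stand-in (the dataclass is hypothetical; field names mirror the review comment):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class StoredReceipt:
    transaction_hash: bytes
    transaction_index: int

def normalize(receipts):
    # Array position is authoritative, matching what insert_receipt persists to the DB.
    return [replace(r, transaction_index=i) for i, r in enumerate(receipts)]

receipts = [
    StoredReceipt(b"\x01" * 32, transaction_index=9),  # stale field value
    StoredReceipt(b"\x02" * 32, transaction_index=9),
]
normalized = normalize(receipts)
```

After normalization, warm (cache) and cold (DB) reads report the same transaction_index regardless of what the input structs carried.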
🧹 Nitpick comments (3)
bin/evd/src/main.rs (2)
319-343: build_index_data is always called even when indexing is disabled.

BlockBuilder::new()...build() and build_index_data(...) run unconditionally on every committed block, even when chain_index_for_callback is None (neither RPC nor indexing enabled) or callback_indexing_enabled is false. The computed data is then discarded by the if let Some / if callback_indexing_enabled guards. This wastes CPU on every block execution.

♻️ Proposed fix: guard block data construction

```diff
-let block = BlockBuilder::<TxContext>::new()
-    .number(info.height)
-    .timestamp(info.timestamp)
-    .transactions(info.transactions)
-    .build();
-
-let (stored_block, stored_txs, stored_receipts) =
-    build_index_data(&block, &info.block_result, &metadata);
-
-if let Some(ref chain_index) = chain_index_for_callback {
-    if callback_indexing_enabled {
-        if let Err(e) =
-            chain_index.store_block(stored_block, stored_txs, stored_receipts)
-        {
-            tracing::warn!("Failed to index block {}: {:?}", info.height, e);
-        } else {
-            tracing::debug!(
-                "Indexed block {} (hash={}, state_root={})",
-                info.height,
-                block_hash,
-                state_root
-            );
-        }
-    }
-}
+if let Some(ref chain_index) = chain_index_for_callback {
+    if callback_indexing_enabled {
+        let block = BlockBuilder::<TxContext>::new()
+            .number(info.height)
+            .timestamp(info.timestamp)
+            .transactions(info.transactions)
+            .build();
+        let (stored_block, stored_txs, stored_receipts) =
+            build_index_data(&block, &info.block_result, &metadata);
+        if let Err(e) =
+            chain_index.store_block(stored_block, stored_txs, stored_receipts)
+        {
+            tracing::warn!("Failed to index block {}: {:?}", info.height, e);
+        } else {
+            tracing::debug!(
+                "Indexed block {} (hash={}, state_root={})",
+                info.height,
+                block_hash,
+                state_root
+            );
+        }
+    }
+}
```
10-21: Module diagram omits the new SQLite chain-index component.

The ASCII-art architecture diagram still only shows the QMDB storage component. Now that a SQLite-backed chain index is a separate subsystem, the diagram is incomplete for new contributors trying to understand the node layout.

crates/rpc/chain-index/src/index.rs (1)
1306-1322: prop_duplicate_block_storage doesn't actually test hash replacement.

Both block1 and block2 are produced by make_test_stored_block(block_number), which sets hash: B256::repeat_byte(number as u8), identical for the same block_number. So block2_hash == block1.hash, and prop_assert_eq!(retrieved.hash, block2_hash) is trivially satisfied regardless of whether the second write replaced the first. The useful part of the test is the transaction count check; the hash check is a no-op.

Consider mutating block2.hash to something distinct (e.g. B256::repeat_byte(0xFFu8 ^ number as u8)) to actually exercise the INSERT OR REPLACE INTO blocks overwrite path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/rpc/chain-index/src/index.rs`:
- Around line 584-611: The cache is being populated with the original
transaction/receipt structs (via self.cache.insert_block_with_data(block,
transactions, receipts)) which can contain a different transaction_index than
the authoritative array index stored in the DB by
insert_transaction/insert_receipt, causing cache/DB discrepancies for
get_transaction/get_receipt; fix by normalizing the transaction_index on the
structs before caching: when you enumerate transactions and receipts (the same
idx used for insert_transaction/insert_receipt), produce new/updated transaction
and receipt objects with transaction_index set to idx as i64 and pass those
normalized vectors to self.cache.insert_block_with_data instead of the original
transactions/receipts so cached data matches DB.
---
Duplicate comments:
In `@bin/evd/src/main.rs`:
- Around line 219-233: This change correctly makes chain index creation
conditional (see the chain_index block, PersistentChainIndex::new and
index.initialize usage) so no code changes are required; approve and merge, and
remove the duplicate review metadata/comment tag ([duplicate_comment]) from the
PR comment to avoid clutter.
---
Nitpick comments:
In `@bin/evd/src/main.rs`:
- Around line 319-343: The block construction and indexing data creation are
done unconditionally; to avoid wasted CPU, only build the Block (via
BlockBuilder::<TxContext>::new()...build()) and call build_index_data(...) when
indexing is actually enabled: i.e., when chain_index_for_callback.is_some() and
callback_indexing_enabled is true. Move the BlockBuilder/Block creation and the
subsequent let (stored_block, stored_txs, stored_receipts) =
build_index_data(...) inside that same guard, then call
chain_index.store_block(...) and keep the existing tracing::warn /
tracing::debug paths inside that branch so you don’t reference variables (block,
stored_*) outside their scope. Ensure block_hash and state_root used in the
debug log are available inside the guarded block or recomputed there.
Summary
- Replaced `PersistentChainIndex`'s QMDB KV backend with SQLite (WAL mode, proper relational schema)
- `PersistentChainIndex` is no longer generic over `S: Storage` — now a concrete struct with a `Mutex<Connection>`
- Chunking workarounds (`TX_HASH_CHUNK_SIZE`, `LOG_CHUNK_SIZE`) eliminated — SQLite handles large values natively
- `ChainIndex` trait interface unchanged — all consumers compile without modification

Files changed
- `Cargo.toml` — added rusqlite to workspace deps
- `crates/rpc/chain-index/Cargo.toml` — added rusqlite (bundled), removed evolve_storage
- `crates/rpc/chain-index/src/index.rs` — full rewrite of `PersistentChainIndex`
- `crates/rpc/chain-index/src/error.rs` — added `Sqlite` error variant
- `crates/app/node/src/lib.rs` — updated construction + type annotations
- `bin/evd/src/main.rs` — updated construction

Test plan
- `just quality` passes (fmt + clippy zero warnings)
- `just check` passes workspace-wide

🤖 Generated with Claude Code