Simplify block production (leanSpec PR #464) #251
Conversation
Rewrite `build_block()` to filter and select proofs directly from a payload map keyed by `data_root`, eliminating the round-trip of extracting individual attestations from aggregated payloads, filtering them, then re-aggregating. This matches the simplified approach in leanSpec.

Key changes:
- Add `PayloadBuffer::grouped_by_data_root()` for proof deduplication
- Replace `aggregate_attestations_by_data()` + `select_aggregated_proofs()` with `extend_proofs_greedily()` greedy set-cover selection
- Simplify `produce_block_with_signatures()` to resolve `AttestationData` per `data_root` instead of reconstructing per-validator attestations
- Route own aggregation output directly to `known_payloads` (immediately usable for block building) instead of going through `new -> known`
🤖 Codex Code Review
I did not run the test suite.
Automated review by OpenAI Codex · gpt-5.4 · custom prompt
🤖 Claude Code Review
Review: Simplify block production (leanSpec PR #464)
The refactoring is well-motivated and the logic is substantially cleaner. The core ideas are sound. A few observations below. Non-determinism in
🤖 Kimi Code Review
File: Lines 948–1004 (
Lines 1009–1100 (
Lines 206–209 (
Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt
🤖 Claude Code Review
Now I have enough context for a complete review. Here is my assessment:
Review: Simplify block production (leanSpec PR #464)
The refactoring is well-motivated and cleaner overall. A few correctness and determinism concerns worth addressing before merge.

Correctness: same validator may appear in multiple data roots
The old path called In practice, In

```rust
let votes = justifications.entry(target.root).or_insert_with(|| ...);
votes[validator_id] = true;
```

A validator in two included attestations with different targets would contribute to both targets' vote counts, which is equivocation. Whether leanSpec PR #464 explicitly permits this needs clarification. If not, the fix is to maintain a global

Non-determinism in
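If equivocation across targets must be prevented, the global-set fix suggested in the review can be sketched as follows. This is a hypothetical minimal version, not the crate's actual types: `tally_votes`, the `(target_root, participants)` tuples, and the bare `u64` ids are all stand-ins for illustration.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical tally: each attestation is (target_root, participant_ids).
// The single `counted` set is global across all targets, so a validator
// contributes to at most one target's vote count.
fn tally_votes(attestations: &[(u64, Vec<u64>)]) -> HashMap<u64, usize> {
    let mut counted: HashSet<u64> = HashSet::new(); // global, not per-target
    let mut votes: HashMap<u64, usize> = HashMap::new();
    for (target, participants) in attestations {
        for &vid in participants {
            // insert() returns false if the validator was already counted
            if counted.insert(vid) {
                *votes.entry(*target).or_insert(0) += 1;
            }
        }
    }
    votes
}
```

With a per-target set instead of the global one, validator 11 below would be counted toward both targets.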
🤖 Codex Code Review
Findings
Testing: I could not run
Automated review by OpenAI Codex · gpt-5.4 · custom prompt
Greptile Summary
This PR implements leanSpec #464, replacing the old
Confidence Score: 4/5
Safe to merge; no correctness or security issues found. All flagged items are non-blocking P2 efficiency/reproducibility suggestions. The refactoring is logically equivalent to the original for correct inputs, all 107 tests pass, and the wire format and public API are unchanged. The three comments (redundant state transition, non-deterministic sort secondary key, stale
Minor attention to
| Filename | Overview |
|---|---|
| crates/blockchain/src/store.rs | Core block production rewrite: new extend_proofs_greedily set-cover algorithm, simplified build_block that works directly on proof maps, routing change in aggregate_committee_signatures (own proofs now bypass new→known promotion). Logic is correct; one redundant state transition and non-deterministic secondary sort within same-slot entries are minor efficiency/reproducibility concerns. |
| crates/storage/src/store.rs | Adds PayloadBuffer::grouped_by_data_root() and the public Store::known_payloads_by_data_root() wrapper; deduplication by SSZ-bytes of participants is sound. Also imports AggregatedSignatureProof. |
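The participant-byte deduplication described for `PayloadBuffer::grouped_by_data_root()` can be illustrated with stand-in types. Everything here is a sketch under assumptions: a `u64` data root and raw `Vec<u8>` participant bytes replace the real `H256` and `AggregatedSignatureProof`.

```rust
use std::collections::{HashMap, HashSet};

// Group proofs by data_root, keeping one proof per distinct
// participant byte pattern within each root (dedup by bytes).
fn group_by_data_root(proofs: &[(u64, Vec<u8>)]) -> HashMap<u64, Vec<Vec<u8>>> {
    let mut grouped: HashMap<u64, Vec<Vec<u8>>> = HashMap::new();
    let mut seen: HashSet<(u64, Vec<u8>)> = HashSet::new();
    for (root, participants) in proofs {
        // insert() is false for an already-seen (root, bytes) pair
        if seen.insert((*root, participants.clone())) {
            grouped.entry(*root).or_default().push(participants.clone());
        }
    }
    grouped
}
```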
Sequence Diagram
```mermaid
sequenceDiagram
    participant P as Proposer
    participant BC as blockchain::store
    participant ST as storage::store
    P->>BC: produce_block_with_signatures(slot, validator_index)
    BC->>ST: known_payloads_by_data_root()
    ST-->>BC: HashMap<H256, Vec<AggregatedSignatureProof>>
    BC->>ST: get_attestation_data_by_root(data_root) [per entry]
    ST-->>BC: Option<AttestationData>
    Note over BC: Build HashMap<H256, (AttestationData, Vec<Proof>)>
    BC->>BC: build_block(head_state, slot, ...)
    Note over BC: Sort entries by target.slot
    loop Until no new found OR justification stable
        Note over BC: Filter by known_block_roots & source == current_justified
        BC->>BC: extend_proofs_greedily(proofs, ...)
        alt found_new
            BC->>BC: process_slots + process_block (candidate)
            alt justification advanced
                Note over BC: Update current_justified, continue
            else no advance
                Note over BC: break
            end
        else not found_new
            Note over BC: break immediately
        end
    end
    BC->>BC: process_slots + process_block (final block, for state_root)
    BC-->>P: (Block, Vec<AggregatedSignatureProof>)
```
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1079-1104
Comment:
**Redundant final state transition**
When the loop exits via the `"justification didn't advance"` branch (line 1083), the inner-loop state transition (lines 1075–1077) and the unconditional final transition at lines 1099–1101 are applied to an identical block: `aggregated_attestations` is not mutated between that break and the final computation. This means `process_slots` + `process_block` are called twice on the same inputs in that common case.
Consider caching the `post_state` from the last loop iteration and reusing it when the block contents didn't change after breaking. This is not a correctness issue, but it doubles the cost of the most common execution path (a single round of attestations with no justification advance).
How can I resolve this? If you propose a fix, please make it concise.
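The caching this comment asks for can be sketched in isolation. Everything below is a stand-in invented for illustration (`transition` replaces `process_slots` + `process_block`, `State` and `build` are hypothetical); the call counter shows the duplicate transition disappearing when the block contents did not change after the break.

```rust
// Stand-in for the post-state type.
#[derive(Clone, Debug)]
struct State { root: u64 }

// Stand-in for process_slots + process_block; counts invocations.
fn transition(block_len: usize, calls: &mut usize) -> State {
    *calls += 1;
    State { root: block_len as u64 }
}

// Returns (final_state, number_of_transitions_performed).
// `rounds_that_change_block` simulates loop iterations that add attestations.
fn build(rounds_that_change_block: usize) -> (State, usize) {
    let mut calls = 0;
    let mut block_len = 0;
    let mut cached: Option<State> = None;
    loop {
        let post = transition(block_len, &mut calls);
        if block_len < rounds_that_change_block {
            block_len += 1;  // new attestations found: block changed
            cached = None;   // cache invalidated
            continue;
        }
        cached = Some(post); // block unchanged after this transition
        break;
    }
    // Reuse the cached post-state instead of re-running the transition.
    let final_state = cached.unwrap_or_else(|| transition(block_len, &mut calls));
    (final_state, calls)
}
```

Without the cache, the final transition would always run once more on identical inputs.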
---
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1030-1032
Comment:
**Non-deterministic ordering within same `target.slot`**
`aggregated_payloads` is a `HashMap<H256, …>`, so `aggregated_payloads.iter()` yields entries in arbitrary order. The sort on `data.target.slot` provides partial determinism, but entries that share the same `target.slot` retain their relative HashMap iteration order, which varies across runs.
Adding a secondary sort key (e.g., `data_root`) would make block composition fully deterministic and improve test reproducibility:
```suggestion
sorted_entries.sort_by_key(|(data_root, (data, _))| (data.target.slot, *data_root));
```
How can I resolve this? If you propose a fix, please make it concise.
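The suggested fix relies on Rust tuples comparing lexicographically: `sort_by_key` with a `(slot, root)` key sorts by slot first and breaks ties by root. A standalone sketch with stand-in `u64` fields rather than the real `Slot` and `H256` types:

```rust
// Entries are (data_root, target_slot) pairs; both fields are u64 stand-ins.
fn sort_entries(entries: &mut [(u64, u64)]) {
    // Tuple keys compare lexicographically: slot first, root as tie-breaker,
    // making the order independent of HashMap iteration order.
    entries.sort_by_key(|&(root, slot)| (slot, root));
}
```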
---
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 958-993
Comment:
**`remaining_indices` is never pruned of zero-coverage entries**
After the best proof is selected and its covered validators added to `covered`, all other indices that are now fully subsumed remain in `remaining_indices`. On the next iteration the inner `.map()` re-evaluates their coverage (all zero), `max_by_key` picks an arbitrary zero-coverage one, and the `new_covered.is_empty()` guard fires and breaks the loop. This is correct, but it means the final iteration always does O(remaining) work to discover what could be known immediately.
A small improvement: after `covered.extend(new_covered)`, remove any index whose participants are now fully covered, or filter `remaining_indices` before the `max_by_key` call to skip already-subsumed proofs.
How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Merge branch 'main' into simplify-block-..."
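The pruning idea from the last comment can be shown as a plain greedy set cover over `HashSet`s. Everything here is a stand-in for illustration: `select_greedy` and the `HashSet<u64>` participant sets replace the real proofs and bitfields; the `retain` call is the suggested pruning step.

```rust
use std::collections::HashSet;

// Greedy set cover with pruning of fully-subsumed candidates.
// Returns indices of selected proofs in selection order.
fn select_greedy(proofs: &[HashSet<u64>]) -> Vec<usize> {
    let mut covered: HashSet<u64> = HashSet::new();
    let mut remaining: HashSet<usize> = (0..proofs.len()).collect();
    let mut selected = Vec::new();
    while !remaining.is_empty() {
        // Pick the proof adding the most uncovered validators.
        let best = remaining
            .iter()
            .map(|&i| (i, proofs[i].difference(&covered).count()))
            .max_by_key(|&(_, gain)| gain)
            .unwrap();
        if best.1 == 0 {
            break;
        }
        covered.extend(proofs[best.0].iter().copied());
        selected.push(best.0);
        // Prune: drop every proof that no longer adds coverage
        // (including the one just selected), so the loop can end
        // without a final all-zero-gain scan.
        remaining.retain(|&i| proofs[i].difference(&covered).count() > 0);
    }
    selected
}
```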
```rust
            if post_state.latest_justified != current_justified {
                current_justified = post_state.latest_justified;
                // Continue: new checkpoint may unlock more attestation data
            } else {
                break;
            }
        }

        // Fixed point reached: no new attestations found
        if new_attestations.is_empty() {
            break;
        }

        // Add new attestations and continue iteration
        included_attestations.extend(new_attestations);
    }

    // Select existing proofs for the attestations to include in the block.
    let (aggregated_attestations, aggregated_signatures) =
        select_aggregated_proofs(&included_attestations, aggregated_payloads)?;

    // Build final block
    let attestations: AggregatedAttestations = aggregated_attestations
        .try_into()
        .expect("attestation count exceeds limit");

    let mut final_block = Block {
        slot,
        proposer_index,
        parent_root,
        state_root: H256::ZERO,
        body: BlockBody { attestations },
    };

    // Recompute post-state with final block to get correct state root
    let mut post_state = head_state.clone();
    process_slots(&mut post_state, slot)?;
    process_block(&mut post_state, &final_block)?;

    final_block.state_root = post_state.tree_hash_root();

    Ok((final_block, post_state, aggregated_signatures))
}

/// Select existing aggregated proofs for attestations to include in a block.
///
/// Fresh gossip aggregation happens at interval 2 (`aggregate_committee_signatures`).
/// This function only selects from existing proofs in the known aggregated payloads buffer
/// (proofs from previously received blocks and promoted gossip aggregations).
///
/// Returns a list of (attestation, proof) pairs ready for block inclusion.
fn select_aggregated_proofs(
    attestations: &[Attestation],
    aggregated_payloads: &HashMap<SignatureKey, Vec<AggregatedSignatureProof>>,
) -> Result<(Vec<AggregatedAttestation>, Vec<AggregatedSignatureProof>), StoreError> {
    let mut results = vec![];

    for aggregated in aggregate_attestations_by_data(attestations) {
        let data = &aggregated.data;
        let message = data.tree_hash_root();

        let mut remaining: HashSet<u64> = validator_indices(&aggregated.aggregation_bits).collect();

        // Select existing proofs that cover the most remaining validators
        while !remaining.is_empty() {
            let Some(&target_id) = remaining.iter().next() else {
                break;
            };

            let Some(candidates) = aggregated_payloads
                .get(&(target_id, message))
                .filter(|v| !v.is_empty())
            else {
                break;
            };

            let (proof, covered) = candidates
                .iter()
                .map(|p| {
                    let covered: Vec<_> = validator_indices(&p.participants)
                        .filter(|vid| remaining.contains(vid))
                        .collect();
                    (p, covered)
                })
                .max_by_key(|(_, covered)| covered.len())
                .expect("candidates is not empty");

            // No proof covers any remaining validator
            if covered.is_empty() {
                break;
            }

            let aggregate = AggregatedAttestation {
                aggregation_bits: proof.participants.clone(),
                data: data.clone(),
            };
            results.push((aggregate, proof.clone()));

            metrics::inc_pq_sig_aggregated_signatures();
            metrics::inc_pq_sig_attestations_in_aggregated_signatures(covered.len() as u64);

            for vid in covered {
                remaining.remove(&vid);
            }
        }
    }

    Ok(results.into_iter().unzip())
    Ok((final_block, aggregated_signatures))
```
```rust
// Sort by target.slot for deterministic processing order
let mut sorted_entries: Vec<_> = aggregated_payloads.iter().collect();
sorted_entries.sort_by_key(|(_, (data, _))| data.target.slot);
```
```rust
let mut covered: HashSet<u64> = HashSet::new();
let mut remaining_indices: HashSet<usize> = (0..proofs.len()).collect();

        AggregatedAttestation {
            aggregation_bits: bits,
            data,
        }
    })
    .collect()

while !remaining_indices.is_empty() {
    // Pick proof covering the most uncovered validators
    let best = remaining_indices
        .iter()
        .map(|&idx| {
            let new_coverage: Vec<u64> = proofs[idx]
                .participant_indices()
                .filter(|vid| !covered.contains(vid))
                .collect();
            (idx, new_coverage)
        })
        .max_by_key(|(_, cov)| cov.len());

    let Some((best_idx, new_covered)) = best else {
        break;
    };
    if new_covered.is_empty() {
        break;
    }

    let proof = &proofs[best_idx];
    attestations.push(AggregatedAttestation {
        aggregation_bits: proof.participants.clone(),
        data: att_data.clone(),
    });
    selected_proofs.push(proof.clone());

    metrics::inc_pq_sig_aggregated_signatures();
    metrics::inc_pq_sig_attestations_in_aggregated_signatures(new_covered.len() as u64);

    covered.extend(new_covered);
    remaining_indices.remove(&best_idx);
}
```
Motivation
leanSpec PR #464 ("simplify block production") rewrites `build_block()` to work directly with aggregated payloads keyed by `AttestationData` instead of individual `Attestation` objects. This eliminates an unnecessary round-trip: the old approach extracted individual attestations from aggregated payloads, filtered them, then re-aggregated. The new approach filters and selects proofs directly from the payload map.
Spec PR: leanEthereum/leanSpec#464
Description

Storage layer (`crates/storage/src/store.rs`)
- `PayloadBuffer::grouped_by_data_root()` — new method that groups buffer entries by `data_root` (the tree hash root of `AttestationData`), deduplicating proofs by comparing their participant bitfield bytes. Returns `HashMap<H256, Vec<AggregatedSignatureProof>>`.
- `Store::known_payloads_by_data_root()` — public wrapper exposing the grouped payloads to the blockchain crate.
- `AggregatedSignatureProof` added to the import list.

Blockchain layer (`crates/blockchain/src/store.rs`)

New: `extend_proofs_greedily()`
Greedy set-cover algorithm that selects proofs maximizing new validator coverage for a given attestation data entry. Each selected proof produces one `AggregatedAttestation` in the block body. This replaces both `aggregate_attestations_by_data()` and `select_aggregated_proofs()`.

Rewritten: `build_block()`
Before: took `&[Attestation]` (individual per-validator attestations) and a `HashMap<SignatureKey, Vec<AggregatedSignatureProof>>` keyed by `(validator_id, data_root)`. Used a fixed-point loop to collect individual attestations, then called `aggregate_attestations_by_data()` to group them, then `select_aggregated_proofs()` to find proofs. Returned `(Block, State, Vec<AggregatedSignatureProof>)`.
After: takes `HashMap<H256, (AttestationData, Vec<AggregatedSignatureProof>)>` keyed by `data_root`. Sorts entries by `target.slot` for deterministic order, filters by known block roots and source checkpoint match, then calls `extend_proofs_greedily()` directly. Handles the genesis edge case (slot 0 justified root = `parent_root`). Loops when justification advances to unlock attestation data with new source checkpoints. Returns `(Block, Vec<AggregatedSignatureProof>)`.

Simplified: `produce_block_with_signatures()`
Before: called `iter_known_aggregated_payloads()`, then `extract_latest_attestations()` to reconstruct per-validator `Attestation` objects, then re-keyed the payloads by `SignatureKey` for `build_block()`.
After: calls `known_payloads_by_data_root()` to get deduplicated proofs grouped by `data_root`, resolves `AttestationData` for each via `get_attestation_data_by_root()`, and passes the result directly to `build_block()`.

Routing change: `aggregate_committee_signatures()`
Own aggregation output now goes directly to `known_payloads` (immediately usable for block building and fork choice) instead of `new_payloads`. Gossip-received aggregated attestations still go through `new -> known` promotion at intervals 0/4.

Deleted
- `aggregate_attestations_by_data()` — functionality absorbed into `extend_proofs_greedily()`
- `select_aggregated_proofs()` — functionality absorbed into `extend_proofs_greedily()`

Backward Compatibility
- `produce_block_with_signatures()` keeps the same signature and return type.
- `on_gossip_attestation`, `on_gossip_aggregated_attestation`, `on_block_core` unchanged.
- `extract_latest_attestations` and related methods unchanged.

How to Test
For full validation:
- `make run-devnet` with `--is-aggregator`: verify blocks contain attestations, and that `justified_slot` and `finalized_slot` advance.
- `claude/skills/test-pr-devnet/scripts/test-branch.sh`