
Phase 1: Make streaming pipeline chunk-emitting #583

Draft
aram356 wants to merge 21 commits into specs/streaming-response-optimization from feature/streaming-pipeline-phase1

Conversation


aram356 commented Mar 26, 2026

Summary

Make the internal streaming pipeline truly chunk-emitting so processors emit output incrementally instead of buffering entire documents. This is the foundation for Phase 2 (streaming responses to the client via StreamingBody) but ships independently with immediate memory and correctness benefits.

Closes #568, closes #569, closes #570, closes #571, closes #572.
Part of epic #563.

What changed

Encoder finalization fix (process_through_compression):

  • Changed signature from W (by value) to &mut W so callers retain ownership
  • Replaced flush() + drop(encoder) with explicit encoder.finish() / encoder.into_inner()
  • Finalization errors now propagate instead of being silently swallowed
  • Fixes pre-existing correctness issue for deflate/brotli, prevents regression when gzip moves to this path
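
The ownership and error-propagation change can be sketched with a mock encoder (the real types here would be flate2's GzEncoder and brotli's CompressorWriter, which are not shown in this PR; `MockEncoder` and its `<trailer>` bytes are purely illustrative):

```rust
use std::io::{self, Write};

// Hypothetical stand-in for a compression encoder such as flate2's
// GzEncoder. Real codecs write a real trailer; "<trailer>" is a mock.
struct MockEncoder<W: Write> {
    inner: W,
}

impl<W: Write> MockEncoder<W> {
    fn new(inner: W) -> Self {
        Self { inner }
    }

    fn write_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        self.inner.write_all(chunk)
    }

    // Explicit finalization: writes the trailer and returns the inner
    // writer, so any I/O error surfaces to the caller. Dropping the
    // encoder instead would swallow that error silently.
    fn finish(mut self) -> io::Result<W> {
        self.inner.write_all(b"<trailer>")?;
        Ok(self.inner)
    }
}

// Taking `&mut W` (not `W` by value) lets the caller keep ownership
// of the output writer after this pipeline stage completes.
fn process_through_compression<W: Write>(output: &mut W, data: &[u8]) -> io::Result<()> {
    let mut encoder = MockEncoder::new(&mut *output);
    encoder.write_chunk(data)?;
    // Propagate finalization errors instead of relying on Drop.
    encoder.finish()?;
    Ok(())
}
```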

All compression paths now chunk-based:

  • process_gzip_to_gzip: replaced read_to_end() with process_through_compression pattern
  • decompress_and_process (used by *_to_none paths): replaced read_to_end() with chunk read loop
  • All 6 compression paths now follow the same structure: read chunk → process → write → repeat

Unified process_chunks method:

  • Extracted shared chunk read/process/write loop from process_uncompressed, process_through_compression, and decompress_and_process into a single process_chunks method
  • Reduces 3 near-identical implementations to 1
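
A minimal sketch of the shared loop, assuming the processor can be expressed as a closure over `(chunk, is_last)` (the real `process_chunks` signature is not shown in this PR):

```rust
use std::io::{Read, Write};

// Shared read chunk -> process -> write loop. A zero-byte read
// signals end of input, which is forwarded as `is_last = true`.
fn process_chunks<R, W, F>(reader: &mut R, writer: &mut W, mut process: F) -> std::io::Result<()>
where
    R: Read,
    W: Write,
    F: FnMut(&[u8], bool) -> Vec<u8>,
{
    let mut buf = [0u8; 8192];
    loop {
        let n = reader.read(&mut buf)?;
        let is_last = n == 0;
        let out = process(&buf[..n], is_last);
        writer.write_all(&out)?;
        if is_last {
            break;
        }
    }
    // Per the PR's documented flush semantics, callers must still run
    // encoder-specific finalization (e.g. finish()) after this returns.
    writer.flush()
}
```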

HtmlRewriterAdapter incremental streaming:

  • lol_html rewriter created eagerly in constructor with Rc<RefCell<Vec<u8>>> shared output sink
  • process_chunk() now emits output on every call, not just is_last
  • Adapter is single-use (reset() is a no-op — Settings consumed by constructor)
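
The shared-sink pattern can be sketched without lol_html itself (names here are hypothetical; in the real adapter the boxed closure is the output sink handed to `lol_html::HtmlRewriter`, which holds the other `Rc` clone):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// The rewriter's output callback appends into a shared buffer; after
// each write the adapter drains that buffer, so output is emitted on
// every call instead of only on the final chunk.
struct SinkAdapter {
    sink: Rc<RefCell<Vec<u8>>>,
    // Stand-in for the eagerly-constructed rewriter whose output
    // closure captures the other Rc clone.
    emit: Box<dyn FnMut(&[u8])>,
}

impl SinkAdapter {
    fn new() -> Self {
        let sink = Rc::new(RefCell::new(Vec::new()));
        let sink_clone = Rc::clone(&sink);
        let emit = Box::new(move |bytes: &[u8]| {
            sink_clone.borrow_mut().extend_from_slice(bytes);
        });
        Self { sink, emit }
    }

    // Emits whatever the rewriter produced so far on every call.
    fn process_chunk(&mut self, chunk: &[u8]) -> Vec<u8> {
        (self.emit)(chunk); // real code: rewriter.write(chunk)
        std::mem::take(&mut *self.sink.borrow_mut())
    }
}
```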

HtmlWithPostProcessing adaptation:

  • Added output accumulation buffer for the is_last chunk when post-processors are registered
  • Intermediate chunks pass through directly; final chunk triggers post-processing on accumulated output
  • When no post-processors are registered (the streaming path), it's a pure passthrough
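
The two paths can be sketched as follows (the `PostProcessor` trait shape is assumed for illustration; the real interface is not shown in this PR):

```rust
// Hypothetical stand-in for the PR's post-processor interface.
trait PostProcessor {
    fn run(&self, html: &mut Vec<u8>);
}

struct HtmlWithPostProcessing {
    post_processors: Vec<Box<dyn PostProcessor>>,
    accumulated: Vec<u8>,
}

impl HtmlWithPostProcessing {
    fn process_chunk(&mut self, chunk: &[u8], is_last: bool) -> Vec<u8> {
        if self.post_processors.is_empty() {
            // Streaming path: pure passthrough.
            return chunk.to_vec();
        }
        // Post-processor path: accumulate until the final chunk, at
        // the cost of an extra copy of the document.
        self.accumulated.extend_from_slice(chunk);
        if !is_last {
            return Vec::new();
        }
        let mut full = std::mem::take(&mut self.accumulated);
        for p in &self.post_processors {
            p.run(&mut full);
        }
        full
    }
}
```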

Files changed

File                    Lines      What
streaming_processor.rs  +380 -329  All pipeline changes above
html_processor.rs       +29 -1     Post-processor accumulation for is_last

Verification

  • cargo test --workspace — 748 passed, 0 failed
  • cargo clippy --workspace --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean
  • npx vitest run — 282 passed
  • cargo build --release --target wasm32-wasip1 — success

Test plan

  • Existing streaming_processor tests updated for incremental output
  • New deflate round-trip test
  • New gzip round-trip test (chunk-based)
  • New gzip-to-none test
  • HtmlRewriterAdapter per-chunk emission test
  • Full pipeline integration test with HTML rewriting
  • WASM build succeeds

aram356 added 17 commits March 26, 2026 09:08
…ation test

- Add debug-level logging to process_chunks showing total bytes
  read and written per pipeline invocation
- Add brotli-to-brotli round-trip test to cover the into_inner()
  finalization path
- Add test proving HtmlWithPostProcessing accumulates output when
  post-processors are registered while streaming path passes through
- Group std imports together (cell, io, rc) before external crates
- Document supported compression combinations on PipelineConfig
- Remove dead-weight byte counters from process_chunks hot loop
- Fix stale comment referencing removed process_through_compression
- Fix brotli finalization: use drop(encoder) instead of into_inner()
  to make the intent clear (CompressorWriter writes trailer on drop)
- Document reset() as no-op on HtmlRewriterAdapter (single-use)
- Add brotli round-trip test covering into_inner finalization path
- Add gzip HTML rewriter pipeline test (compressed round-trip with
  lol_html, not just StreamingReplacer)
- Add HtmlWithPostProcessing accumulation vs streaming behavior test
- Add Eq derive to Compression enum (all unit variants, trivially correct)
- Brotli finalization: use into_inner() instead of drop() to skip
  redundant flush and make finalization explicit
- Document process_chunks flush semantics: callers must still call
  encoder-specific finalization after this method returns
- Warn when HtmlRewriterAdapter receives data after finalization
  (rewriter already consumed, data would be silently lost)
- Make HtmlWithPostProcessing::reset() a true no-op matching its doc
  (clearing auxiliary state without resetting rewriter is inconsistent)
- Document extra copying overhead on post-processor path vs streaming
- Assert output content in reset-then-finalize test (was discarded)
- Relax per-chunk emission test to not depend on lol_html internal
  buffering behavior — assert concatenated correctness + at least one
  intermediate chunk emitted
- Add test that feeds multiple chunks through HtmlWithPostProcessing
  with an active post-processor (should_process returns true, mutates
  HTML). Verifies the post-processor receives the complete accumulated
  document and its mutations appear in the output.
- Make flush semantics per-codec explicit in process_chunks doc:
  flate2 needs finish() after flush, brotli is finalized by flush
aram356 self-assigned this Mar 26, 2026
aram356 added 4 commits March 26, 2026 11:46
lol_html fragments text nodes across chunk boundaries when fed
incrementally. This breaks script rewriters (NextJS __NEXT_DATA__,
GTM) that expect complete text content — a split domain like
"google" + "tagmanager.com" would silently miss the rewrite.

Add dual-mode HtmlRewriterAdapter:
- new(): streaming mode, emits output per chunk (no script rewriters)
- new_buffered(): accumulates input, feeds lol_html in one write()
  call on is_last (script rewriters registered)

create_html_processor selects the mode based on whether
script_rewriters is non-empty. This preserves the old behavior
(single-pass processing) when rewriters need it, while enabling
streaming when they don't.

Add regression test proving lol_html does fragment text across
chunk boundaries, confirming the issue is real.
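
The failure mode described above can be reduced to a std-only mock (no lol_html; `proxy.example/gtm` and the fragment split are illustrative): a per-fragment replacer only fires when the whole needle sits inside one text fragment, which is exactly what a script rewriter sees when lol_html fragments a text node.

```rust
// Naive per-fragment rewriter, mirroring a script rewriter that only
// sees one lol_html text chunk at a time.
fn rewrite_fragment(fragment: &str) -> String {
    fragment.replace("googletagmanager.com", "proxy.example/gtm")
}

// Streaming mode: each fragment is rewritten independently, so a
// needle split across the boundary is silently missed.
fn run_streaming(fragments: &[&str]) -> String {
    fragments.iter().map(|f| rewrite_fragment(f)).collect()
}

// Buffered mode: concatenate first, rewrite once over complete text.
fn run_buffered(fragments: &[&str]) -> String {
    rewrite_fragment(&fragments.concat())
}
```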
lol_html fragments text nodes across input chunk boundaries. Script
rewriters (NextJS __NEXT_DATA__, GTM) expect complete text content
and would silently miss rewrites on split fragments.

Add dual-mode HtmlRewriterAdapter:
- new(): streaming, emits output per chunk (no script rewriters)
- new_buffered(): accumulates input, single write() on is_last

create_html_processor selects mode based on script_rewriters. This
preserves correctness while enabling streaming for configs without
script rewriters. Phase 3 will make rewriters fragment-safe.

Add regression test proving lol_html does fragment text nodes.
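
The mode selection described in this commit can be sketched as follows (the enum shape and constructor names are assumptions based on the commit message, not the actual code):

```rust
// Hypothetical shapes for the two adapter modes.
enum HtmlRewriterAdapter {
    // new(): streaming, emits output per chunk.
    Streaming,
    // new_buffered(): accumulates input, single write() on is_last.
    Buffered { input: Vec<u8> },
}

fn create_html_processor(script_rewriters: &[&str]) -> HtmlRewriterAdapter {
    if script_rewriters.is_empty() {
        // No script rewriters: safe to stream per chunk.
        HtmlRewriterAdapter::Streaming
    } else {
        // Script rewriters need complete text nodes: buffer input and
        // feed lol_html in one write() on the final chunk.
        HtmlRewriterAdapter::Buffered { input: Vec::new() }
    }
}
```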
Add section to spec explaining the lol_html text fragmentation issue,
the dual-mode HtmlRewriterAdapter workaround (Phase 1), and the
planned fix to make script rewriters fragment-safe (Phase 3, #584).
- Add post-finalization warning to buffered path (was only in streaming)
- Add buffered_adapter_prevents_text_fragmentation test proving
  new_buffered() delivers complete text to lol_html handlers
- Fix spec: html_processor.rs is changed (selects adapter mode), and
  script_rewriters do require buffered mode (not "unaffected")