[Enhancement](memory) Add ConcurrentLong2ObjectHashMap and ConcurrentLong2LongHashMap #61332

Open
dataroaring wants to merge 1 commit into master from feature/concurrent-fastutil-maps

Conversation

@dataroaring
Contributor

Summary

Add two thread-safe primitive-key concurrent hash maps built on fastutil, designed as drop-in replacements for ConcurrentHashMap<Long, V> and ConcurrentHashMap<Long, Long> in memory-sensitive FE paths.

  • ConcurrentLong2ObjectHashMap<V> — replaces ConcurrentHashMap<Long, V>
  • ConcurrentLong2LongHashMap — replaces ConcurrentHashMap<Long, Long>

Why

ConcurrentHashMap<Long, V> costs ~64 bytes per entry: a boxed Long key, a Node wrapper object, and a slot in the bucket table (Java 8+ ConcurrentHashMap no longer uses segments). These fastutil-based maps reduce that to ~16 bytes per entry, roughly a 4x memory reduction.

In Doris FE, several critical data structures use ConcurrentHashMap<Long, V> at tablet/partition scale (millions of entries), making this a significant memory optimization opportunity.

Design

  • Segment-based locking (default 16 segments) for concurrent throughput, similar to Java 7's ConcurrentHashMap design
  • Full Map interface compatibility for drop-in replacement
  • Atomic operations: putIfAbsent, computeIfAbsent, replace, remove(key, value)
  • Thread-safe iteration via snapshot-based entrySet()/keySet()/values()
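
The design points above can be illustrated with a minimal sketch. It is not the PR's code: the class name `SegmentedLongMapSketch`, the hash-spreading constant, and the use of `java.util.HashMap` as a stand-in for a fastutil primitive-key map are all assumptions made so that the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of segment-based locking; not the PR's implementation.
final class SegmentedLongMapSketch<V> {
    private static final class Segment<V> {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        // Assumption: HashMap stands in for a fastutil primitive-key map.
        final Map<Long, V> map = new HashMap<>();
    }

    private final Segment<V>[] segments;
    private final int mask;

    @SuppressWarnings("unchecked")
    SegmentedLongMapSketch(int segmentCount) {
        if (segmentCount <= 0 || Integer.bitCount(segmentCount) != 1) {
            throw new IllegalArgumentException("segmentCount must be a power of two");
        }
        segments = (Segment<V>[]) new Segment[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new Segment<>();
        }
        mask = segmentCount - 1;
    }

    // Spread the key bits so sequential ids do not all land in one segment.
    private Segment<V> segmentFor(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return segments[(int) (h >>> 32) & mask];
    }

    V get(long key) {
        Segment<V> seg = segmentFor(key);
        seg.lock.readLock().lock();
        try {
            return seg.map.get(key);
        } finally {
            seg.lock.readLock().unlock();
        }
    }

    V put(long key, V value) {
        Segment<V> seg = segmentFor(key);
        seg.lock.writeLock().lock();
        try {
            return seg.map.put(key, value);
        } finally {
            seg.lock.writeLock().unlock();
        }
    }

    // Check-then-act runs under one write lock, so it is atomic per key.
    V putIfAbsent(long key, V value) {
        Segment<V> seg = segmentFor(key);
        seg.lock.writeLock().lock();
        try {
            if (seg.map.containsKey(key)) {
                return seg.map.get(key);
            }
            seg.map.put(key, value);
            return null;
        } finally {
            seg.lock.writeLock().unlock();
        }
    }

    // Snapshot iteration: copy keys segment by segment under read locks.
    // Later writes to the map do not affect the returned list.
    List<Long> keySnapshot() {
        List<Long> keys = new ArrayList<>();
        for (Segment<V> seg : segments) {
            seg.lock.readLock().lock();
            try {
                keys.addAll(seg.map.keySet());
            } finally {
                seg.lock.readLock().unlock();
            }
        }
        return keys;
    }
}
```

In this sketch the snapshot is consistent within each segment but not across segments, because the segments are copied one at a time. Whether the PR's views offer a stronger guarantee is not stated in the description.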

Memory comparison

| Collection | Per-entry overhead | 1M entries |
|---|---|---|
| ConcurrentHashMap<Long, V> | ~64 bytes | ~61 MB |
| ConcurrentLong2ObjectHashMap<V> | ~16 bytes | ~15 MB |
| ConcurrentHashMap<Long, Long> | ~80 bytes | ~76 MB |
| ConcurrentLong2LongHashMap | ~16 bytes | ~15 MB |
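
The "1M entries" column appears to be the per-entry figure multiplied by 1,000,000 and expressed in units of 1,048,576 bytes. That reading is an assumption, but it reproduces the numbers in the table, as this small sketch shows (the class and method names are illustrative only).

```java
// Hypothetical helper that reproduces the table's totals from the per-entry estimates.
final class EntryMemoryEstimate {
    // Total size in units of 1024 * 1024 bytes, truncated toward zero.
    static long mebibytes(long bytesPerEntry, long entries) {
        return bytesPerEntry * entries / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long entries = 1_000_000L;
        System.out.println(mebibytes(64, entries)); // 61
        System.out.println(mebibytes(16, entries)); // 15
        System.out.println(mebibytes(80, entries)); // 76
    }
}
```

The per-entry figures themselves are the PR author's estimates. Actual footprints depend on the JVM, object header size, compressed references, and load factor.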

Test plan

  • ConcurrentLong2ObjectHashMapTest — 432 lines covering put/get/remove, putIfAbsent, computeIfAbsent, replace, concurrent writes from multiple threads, iteration consistency, empty map edge cases
  • ConcurrentLong2LongHashMapTest — 455 lines covering CRUD, default value semantics, concurrent operations, atomic operations, iteration, edge cases

🤖 Generated with Claude Code

…Long2LongHashMap

Add thread-safe primitive-key concurrent hash maps built on fastutil,
designed to replace ConcurrentHashMap<Long, V> and ConcurrentHashMap<Long, Long>
in memory-sensitive FE paths.

These maps eliminate Long autoboxing overhead and reduce per-entry memory
from ~64 bytes (ConcurrentHashMap) to ~16 bytes, a 4x improvement.

Key design:
- Segment-based locking (default 16 segments) for concurrent throughput
- Full Map interface compatibility for drop-in replacement
- Atomic putIfAbsent, computeIfAbsent, replace, remove operations
- Comprehensive unit tests covering CRUD, concurrency, iteration,
  edge cases, and default value semantics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 14, 2026 12:11
@Thearas
Contributor

Thearas commented Mar 14, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copilot AI left a comment

Pull request overview

Adds two new segmented-lock concurrent hash maps for FE that use fastutil primitive-key/value maps to reduce memory overhead compared to ConcurrentHashMap<Long, ...> while preserving familiar APIs and providing snapshot-based iteration.

Changes:

  • Introduce ConcurrentLong2ObjectHashMap<V>: concurrent long→object map with per-segment RW locks and atomic compute/merge-style operations.
  • Introduce ConcurrentLong2LongHashMap: concurrent long→long map with per-segment RW locks plus an atomic addTo counter helper.
  • Add comprehensive JUnit tests for correctness, concurrency behavior, iteration snapshots, and Gson round-trip/format compatibility.
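
An atomic `addTo` counter helper could look like the following sketch. It assumes the PR's method mirrors fastutil's `Long2LongOpenHashMap.addTo`, which returns the previous value (or the default return value when the key was absent). The class name, the single lock standing in for one segment, and the `HashMap` backing store are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of an atomic addTo; a single lock stands in for one segment.
final class AddToSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Assumption: HashMap stands in for a fastutil Long2LongOpenHashMap.
    private final Map<Long, Long> map = new HashMap<>();
    private final long defaultReturnValue = 0L;

    // Read-modify-write happens under one write lock, so no increment is lost.
    long addTo(long key, long delta) {
        lock.writeLock().lock();
        try {
            Long old = map.get(key);
            long previous = (old == null) ? defaultReturnValue : old;
            map.put(key, previous + delta);
            return previous;
        } finally {
            lock.writeLock().unlock();
        }
    }

    long get(long key) {
        lock.readLock().lock();
        try {
            Long v = map.get(key);
            return (v == null) ? defaultReturnValue : v;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs several threads that increment the same key and returns the final count.
    static long runDemo(int threads, int incrementsPerThread) {
        AddToSketch counter = new AddToSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.addTo(42L, 1L);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get(42L);
    }
}
```

With a separate `get` followed by `put`, two threads could read the same old value and one increment would be lost. Holding the write lock across both steps prevents that.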

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/common/ConcurrentLong2ObjectHashMap.java | New segmented concurrent long→object map implementation. |
| fe/fe-core/src/main/java/org/apache/doris/common/ConcurrentLong2LongHashMap.java | New segmented concurrent long→long map implementation with addTo. |
| fe/fe-core/src/test/java/org/apache/doris/common/ConcurrentLong2ObjectHashMapTest.java | New unit tests for object map behavior, concurrency, and Gson. |
| fe/fe-core/src/test/java/org/apache/doris/common/ConcurrentLong2LongHashMapTest.java | New unit tests for long map behavior, concurrency, and Gson. |

Comment on lines +263 to +269
        if (seg.map.containsKey(key)) {
            return seg.map.get(key);
        }
        long newValue = mappingFunction.applyAsLong(key);
        seg.map.put(key, newValue);
        return newValue;
    } finally {
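
The excerpt above shows only the body of the check-then-act sequence. A self-contained sketch of the same pattern follows, assuming the elided lines acquire and release a write lock. The class name, the single lock, and the `HashMap` stand-in are hypothetical. `containsKey` is used rather than comparing against a default value so that a stored 0 is not mistaken for "absent".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongUnaryOperator;

// Hypothetical sketch of computeIfAbsent for a long-to-long map under a write lock.
final class ComputeIfAbsentSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Assumption: HashMap stands in for a fastutil primitive map.
    private final Map<Long, Long> map = new HashMap<>();

    long computeIfAbsent(long key, LongUnaryOperator mappingFunction) {
        lock.writeLock().lock();
        try {
            if (map.containsKey(key)) {
                return map.get(key);
            }
            // The function runs while the lock is held, so it is called at most once per key.
            long newValue = mappingFunction.applyAsLong(key);
            map.put(key, newValue);
            return newValue;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Several threads race on the same key; returns how often the function ran.
    static int concurrentInvocations(int threads) {
        ComputeIfAbsentSketch sketch = new ComputeIfAbsentSketch();
        AtomicInteger calls = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> sketch.computeIfAbsent(1L, k -> {
                calls.incrementAndGet();
                return k * 10;
            }));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return calls.get();
    }
}
```

One trade-off of this pattern is that a slow mapping function blocks every other writer and reader of the same segment for its whole duration.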
Comment on lines +451 to +453
// Boxed get via Map<Long,Long> interface returns null for missing keys
Long boxedResult = map.getOrDefault(999L, map.defaultReturnValue());
Assertions.assertEquals(0L, boxedResult);
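
The test excerpt above calls `getOrDefault` with the default return value, so it asserts 0 even though its comment talks about a boxed `get` returning null. The three lookup styles can be told apart with a small sketch. Every name here is hypothetical and `HashMap` stands in for the fastutil map. It only illustrates the general distinction between a primitive lookup, a boxed `Map.get`, and `getOrDefault`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch contrasting primitive, boxed, and getOrDefault lookups.
final class DefaultValueSketch {
    private final Map<Long, Long> map = new HashMap<>();
    private final long defaultReturnValue;

    DefaultValueSketch(long defaultReturnValue) {
        this.defaultReturnValue = defaultReturnValue;
    }

    void put(long key, long value) {
        map.put(key, value);
    }

    // Primitive-style lookup: a missing key yields the default return value.
    long getLong(long key) {
        Long v = map.get(key);
        return (v == null) ? defaultReturnValue : v;
    }

    // Boxed Map-style lookup: a missing key yields null.
    Long getBoxed(Long key) {
        return map.get(key);
    }

    // Boxed lookup with an explicit fallback: never null for a non-null fallback.
    Long getOrDefault(Long key, Long fallback) {
        return map.getOrDefault(key, fallback);
    }

    boolean containsKey(long key) {
        return map.containsKey(key);
    }
}
```

Because a primitive lookup cannot return null, `containsKey` is the only reliable way to distinguish a missing key from a key that maps to the default return value.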
Comment on lines +50 to +51
* <p><b>Important:</b> All compound operations from both {@link Long2LongMap} and {@link Map}
* interfaces are overridden to ensure atomicity within a segment's write lock.
Comment on lines +26 to +29
import it.unimi.dsi.fastutil.longs.LongBinaryOperator;
import it.unimi.dsi.fastutil.longs.LongOpenHashSet;
import it.unimi.dsi.fastutil.longs.LongSet;
import it.unimi.dsi.fastutil.objects.ObjectArrayList;
Comment on lines +255 to +257
V newValue = mappingFunction.apply(key);
seg.map.put(key, newValue);
return newValue;
Comment on lines +272 to +274
V newValue = mappingFunction.get(key);
seg.map.put(key, newValue);
return newValue;
Comment on lines +50 to +51
* <p><b>Important:</b> All compound operations (computeIfAbsent, computeIfPresent, compute, merge)
* from both {@link Long2ObjectMap} and {@link Map} interfaces are overridden to ensure atomicity
Comment on lines +223 to +227
    void testNullValues() {
        ConcurrentLong2ObjectHashMap<String> map = new ConcurrentLong2ObjectHashMap<>();
        map.put(1L, null);
        Assertions.assertTrue(map.containsKey(1L));
        Assertions.assertNull(map.get(1L));
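
This test excerpt implies that the new map accepts null values. `java.util.concurrent.ConcurrentHashMap` does not: it throws `NullPointerException` on a null key or value. The sketch below uses only JDK classes to show that difference, and to show how `HashMap.computeIfAbsent` treats a key mapped to null as absent. How the PR's map behaves in that last case is not stated in the excerpt, so the sketch makes no claim about it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// JDK-only illustration of null-value semantics relevant to a "drop-in" replacement.
final class NullValueSemantics {
    // HashMap stores a null value: the key is present but get() returns null.
    static boolean hashMapKeepsNullValue() {
        Map<Long, String> m = new HashMap<>();
        m.put(1L, null);
        return m.containsKey(1L) && m.get(1L) == null;
    }

    // ConcurrentHashMap rejects null values with a NullPointerException.
    static boolean concurrentHashMapRejectsNull() {
        Map<Long, String> m = new ConcurrentHashMap<>();
        try {
            m.put(1L, null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // HashMap.computeIfAbsent treats a key mapped to null as absent and fills it.
    static String computeIfAbsentOverNull() {
        Map<Long, String> m = new HashMap<>();
        m.put(1L, null);
        return m.computeIfAbsent(1L, k -> "filled");
    }
}
```

Code that relied on `ConcurrentHashMap` failing fast on null, or on `get(key) == null` meaning "no mapping", may behave differently once nulls can be stored.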
@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Mar 14, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Labels

approved (indicates a PR has been approved by one committer), reviewed

4 participants