Performance improvements to DefaultThreadContextMap #2319
jengebr added a commit to jengebr/jengebr_perf that referenced this issue on Feb 23, 2024.
Pull request: #2321

Closing based on discussion on pull request. No changes made.
`DefaultThreadContextMap` maintains non-blocking thread safety by guaranteeing immutability and performing copy-and-modify operations when adding or removing entries. The referenced pull request improves performance while maintaining functionality by introducing a custom data structure (`UnmodifiableArrayBackedMap`). According to the attached JMH benchmark, it outperforms the current implementation in every way, except that search operations (`get`/`containsKey`) become linear-time rather than constant. This appears to be a good tradeoff, especially in the common case where the map is small (< 15 entries). I welcome discussion on this point.

Details:
`DefaultThreadContextMap` currently implements copy-and-modify by allocating a new `HashMap` (with a padded array), iterating through the existing `HashMap` (reading null buckets and chasing nested `Node`s), then inserting each element into the new `HashMap` (`Node` allocation, hash/equals calculations). The new implementation reduces copying to a single array allocation and a single call to `System.arraycopy()`.

The JMH benchmarks attached to this issue show that copying and adding one entry to the new map is 1.5x faster when the existing map is empty, and the delta increases steadily to 5x faster when 75 items are already present. Similarly, adding 5 entries at a time is 2x faster when the map is empty, and the delta climbs to 5x faster when the map contains 75 items.
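The source of `UnmodifiableArrayBackedMap` is not included in this issue, but the copy-and-modify idea it describes can be sketched roughly as below. All names here (`ArrayBackedMapSketch`, `copyAndPut`, `keyAt`, `valueAt`) are hypothetical, not the actual Log4j API:

```java
// Minimal sketch (hypothetical names): an immutable map backed by a flat
// Object[] of alternating key/value pairs. copyAndPut() performs the entire
// copy-and-modify in one array allocation plus one System.arraycopy() call,
// with no per-entry Node allocation or rehashing.
final class ArrayBackedMapSketch {
    private final Object[] pairs; // [k0, v0, k1, v1, ...]

    ArrayBackedMapSketch() {
        this.pairs = new Object[0];
    }

    private ArrayBackedMapSketch(Object[] pairs) {
        this.pairs = pairs;
    }

    int size() {
        return pairs.length / 2;
    }

    Object keyAt(int i) {
        return pairs[2 * i];
    }

    Object valueAt(int i) {
        return pairs[2 * i + 1];
    }

    // Returns a new map containing this map's entries plus (key, value).
    // For brevity, this sketch assumes the key is not already present.
    ArrayBackedMapSketch copyAndPut(Object key, Object value) {
        Object[] copy = new Object[pairs.length + 2];        // single allocation
        System.arraycopy(pairs, 0, copy, 0, pairs.length);   // single bulk copy
        copy[pairs.length] = key;
        copy[pairs.length + 1] = value;
        return new ArrayBackedMapSketch(copy);
    }
}
```

Because the original array is never mutated, existing readers remain safe without locks; each write simply publishes a fresh array.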
Separate profiling suggests a roughly 20% reduction in memory allocation rate (MB/s), mostly from eliminating the `Node` instances.

Other JMH highlights:

- `entrySet()` improves by 8x to 20x.
- `get()` times are faster with 0 or 1 items in the map, but slower once 5+ items are present.

JMH benchmark class:
MapSpeedBenchmark.java.txt
JMH benchmark results:
MapSpeedBenchmark_results.txt
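The linear-time `get`/`containsKey` tradeoff mentioned above comes from scanning the flat key/value array on every lookup. A minimal sketch of that behavior, using hypothetical names rather than the actual Log4j code:

```java
import java.util.Objects;

// Minimal sketch (hypothetical names): lookups over a flat array of
// alternating keys and values are O(n) scans, versus HashMap's O(1) hashing.
// This is cheap for small maps (< 15 entries is the common case cited in
// the issue) but falls behind HashMap as the map grows.
final class LinearGetSketch {
    private final Object[] pairs; // [k0, v0, k1, v1, ...]

    LinearGetSketch(Object... pairs) {
        this.pairs = pairs;
    }

    Object get(Object key) {
        for (int i = 0; i < pairs.length; i += 2) {
            if (Objects.equals(pairs[i], key)) {
                return pairs[i + 1];
            }
        }
        return null;
    }

    boolean containsKey(Object key) {
        for (int i = 0; i < pairs.length; i += 2) {
            if (Objects.equals(pairs[i], key)) {
                return true;
            }
        }
        return false;
    }
}
```

This matches the benchmark pattern reported above: lookups win with 0 or 1 entries (no hashing overhead) but lose once the scan spans 5+ entries.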