
Performance improvements to DefaultThreadContextMap #2319

Closed
jengebr opened this issue Feb 23, 2024 · 2 comments

jengebr commented Feb 23, 2024

DefaultThreadContextMap maintains non-blocking thread safety through immutability: every add or remove performs a copy-and-modify of the underlying map. The referenced pull request improves performance while preserving functionality by introducing a custom data structure (UnmodifiableArrayBackedMap). According to the attached JMH benchmark, it outperforms the current implementation in every respect except one: search operations (get/containsKey) become linear-time rather than constant-time. This appears to be a good tradeoff, especially in the common case where the map is small (< 15 entries). I welcome discussion on this point.

Details: DefaultThreadContextMap currently implements copy-and-modify by allocating a new HashMap (with a padded backing array), iterating through the existing HashMap (reading null buckets and chasing nested Node chains), then inserting each element into the new HashMap (Node allocation plus hash/equals calculations). The new implementation reduces copying to a single array allocation and a single call to System.arraycopy().
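To make the mechanism concrete, here is a minimal sketch of the approach, not the actual UnmodifiableArrayBackedMap from the pull request: keys and values are interleaved in one flat Object[], so copy-and-modify is a single allocation plus System.arraycopy, while get() degrades to a linear scan (class and method names below are illustrative assumptions).

```java
import java.util.Objects;

// Sketch only: an immutable map backed by a single interleaved array
// (key0, value0, key1, value1, ...). Not the actual implementation
// proposed in the pull request.
final class ArrayBackedMapSketch {
    private final Object[] kv; // interleaved key/value pairs

    ArrayBackedMapSketch() {
        this.kv = new Object[0];
    }

    private ArrayBackedMapSketch(Object[] kv) {
        this.kv = kv;
    }

    // Copy-and-modify: one array allocation and one System.arraycopy,
    // instead of rebuilding a HashMap entry by entry.
    ArrayBackedMapSketch copyAndPut(String key, String value) {
        int i = indexOf(key);
        if (i >= 0) { // key present: clone and overwrite the value slot
            Object[] copy = kv.clone();
            copy[i + 1] = value;
            return new ArrayBackedMapSketch(copy);
        }
        Object[] copy = new Object[kv.length + 2];
        System.arraycopy(kv, 0, copy, 0, kv.length);
        copy[kv.length] = key;
        copy[kv.length + 1] = value;
        return new ArrayBackedMapSketch(copy);
    }

    // Linear-time search: cheap for small maps, slower than HashMap's
    // constant-time lookup once many entries are present (the tradeoff
    // discussed above).
    String get(String key) {
        int i = indexOf(key);
        return i >= 0 ? (String) kv[i + 1] : null;
    }

    private int indexOf(String key) {
        for (int i = 0; i < kv.length; i += 2) {
            if (Objects.equals(kv[i], key)) {
                return i;
            }
        }
        return -1;
    }

    int size() {
        return kv.length / 2;
    }
}
```

Because every mutation returns a fresh instance, readers on other threads can keep using their old reference without locks, which is the same guarantee the HashMap-based copy provides.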

The JMH benchmarks attached to this issue show that copying and adding one entry to the new map is 1.5x faster when the existing map is empty, and the delta increases steadily up to 5x faster when 75 items are already present. Similarly, adding 5 entries at a time is 2x faster when the map is empty, and the delta climbs to 5x faster when the map contains 75 items.

Separate profiling suggests a reduction in memory allocation MB/s around 20%, mostly by eliminating the Node instances.

Other JMH highlights:

  1. Iteration across the entrySet() improves by 8x to 20x.
  2. get() times are faster with 0 or 1 items in the map, but slower once 5 or more items are present.
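The iteration speedup is plausible because walking a flat interleaved array touches contiguous memory with no bucket array or Node chasing. A hedged sketch of what entrySet()-style iteration over such an array might look like (the helper name and layout are assumptions, not the pull request's code):

```java
import java.util.AbstractMap;
import java.util.Map;

// Sketch: iterating an interleaved Object[] (key0, value0, key1, value1, ...)
// visits entries in a single linear pass over contiguous memory, with no
// null buckets to skip and no linked Node chains to follow.
final class FlatIterationSketch {
    static String joinEntries(Object[] kv) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < kv.length; i += 2) {
            // Entries can be materialized lazily as lightweight views,
            // rather than stored as per-entry Node objects.
            Map.Entry<Object, Object> e =
                    new AbstractMap.SimpleImmutableEntry<>(kv[i], kv[i + 1]);
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }
}
```

This also suggests where the reported ~20% allocation reduction comes from: the flat layout needs no per-entry Node instances at all.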

JMH benchmark class:
MapSpeedBenchmark.java.txt

JMH benchmark results:
MapSpeedBenchmark_results.txt


jengebr commented Feb 23, 2024

Pull request: #2321


jengebr commented Feb 26, 2024

Closing based on discussion on pull request. No changes made.

jengebr closed this as completed Feb 26, 2024