Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support weight-based eviction #24

Merged
merged 45 commits into from
Dec 31, 2021
Merged

Support weight-based eviction #24

merged 45 commits into from
Dec 31, 2021

Conversation

tatsuya6502
Copy link
Member

@tatsuya6502 tatsuya6502 commented Aug 9, 2021

  • Add support for weight based (relative size based) eviction.
  • Add support for unbound cache.

Edit on Dec 31, 2021

Finished implementing above features on the all caches:

  • future::Cache
  • sync::Cache
  • sync::SegmentedCache
  • unsync::Cache

Changes

API:

  • Cache and SegmentedCache
    • Changed the return type of max_capacity() from usize to Option<usize>.
    • Added builder() method.
  • CacheBuilder
    • Made the max capacity optional.
    • Added weigher method to set the weigher closure.

Internal Changes (Highlighting major changes):

  • sync::BaseCache
    • Updated do_insert_with_hash method to calculate the policy weight of the entry using the weigher closure.
  • sync::InnerCache
    • Changed to track the total weighed size.
    • Completely rewritten admit and find_cache_victim methods to do the weight-based admission.
    • Added evict_lru_entries method to evict LRU entries when the weighed size exceeds the max capacity.
      • This is needed to handle a case when an existing admitted entry was updated with a larger policy weight.
  • sync::ValueEntry
    • Updated to hold the policy weight of the entry via EntryInfo enum.
    • Added EntryInfo enum to reduce the number of Arc usages:
      • This may have positive impacts on memory consumption and performance.
      • But also this may have negative impacts on performance as it does dynamic dispatches.

Notes

  • We still need to evaluate the aggregated victim strategy to see if it will improve cache hit rate in various real-world workloads (e.g. by using Caffeine's simulator). See this comment for details.

- Add `total_weight` field and optional `max_weight` and `weighter` closure fields
  to `InnerCache`.
- Rename `max_capacity` to `max_entries` and make it optional.
- Update `sync` and child methods of `InnerCache` to check and update the `total_weight`
  value during processing inserts, updates, invalidations and evictions.
It is currently failing because bitflags v1.3.1 is incompatible with the MSRV.
This issue will be addressed by another pull request.
future::Cache::get_or_try_insert_with_hash_and_fun
- Implement aggregated victims strategy for cost-based eviction.
- Rename max_entry back to max_capacity, and remove an optional max_weight
  field from InnerCache.
- Rename an internal structure TotalWeight to TotalCost.
- Lightly refactored.
- Rename many internal items, for example:
    - TotalCost struct -> WeightedSize struct
    - cost() method -> weigh() method
    - cost variable -> policy_weight variable
    - NOTE: Cost usually means the miss penalty in cache.
- Add EntrySizeAndFrequency struct for admit() method.
In order to reduce chances to pollute the cache main space, restore the old admission
behavior, which requires the candidate frequency must be higher than the aggregated
frequency of potential victims.
- Rename KeyValueEntry struct to KVEntry.
- Updated WriteOp::Remove enum variant to store KVEntry.
- Fix a bug in admit method to go into an infinite loop when the LRU entry
   has been concurrently invalidated.
- Also add a retry limit to admit method.
- Add `weighter` method to the `CacheBuilder`s.
- Add unit tests for `sync::Cache` with weighter closure registered.
Replace Vec in AdmissionResult with SmallVec.
Copy a unit test size_aware_admission from sync::Cache to future::Cache.
src/common.rs Outdated Show resolved Hide resolved
@tatsuya6502 tatsuya6502 added this to the v0.6.0 milestone Aug 28, 2021
@tatsuya6502 tatsuya6502 self-assigned this Aug 28, 2021
@tatsuya6502 tatsuya6502 added the enhancement New feature or request label Sep 14, 2021
@tatsuya6502 tatsuya6502 modified the milestones: v0.6.0, v0.7.0 Sep 18, 2021
- Implement size-aware cache management to the unsync cache.
- Rename `KVEntry` to `KvEntry`.
Disable Clippy until we figure out what to do on the following lint:
https://rust-lang.github.io/rust-clippy/master/index.html#non_send_fields_in_send_ty

Relates to #54

(cherry picked from commit 646beb0)
This will not compile.

Conflicts:
  .github/workflows/CI.yml
  .vscode/settings.json
  CHANGELOG.md
  README.md
  src/sync/base_cache.rs
  src/unsync/cache.rs
Fix after merging the master branch. Note that added crossbeam-utils crate
to the dependency for `AtomicCell` (using for tracking weighted size).
- Change `weigher`'s return type from `u64` to `u32`.
- Refactor data types `ValueEntry`, `KeyDate` and `KeyHashDate` to
   reduce the number of `Arc` pointers used by adding `EntryInfo` struct.
Update the `weigher` examples to avoid `u32` overflows.
Implement weighted size handling in `sync::BaseCache` for when updating
an existing cache entry.
Change `WriteOp` back to having `Upsert`.
Add `evict_lru_entries` to `sync::BaseCache`, which will evict LRU entries
when cache's capacity is exceeded by updating existing entries with bigger
weighted sizes.
Add `evict_lru_entries` to `unsync::Cache`, which will evict LRU entries
when cache's capacity is exceeded by updating existing entries with bigger
weighted sizes.
Rename `weighted_size` in `ValueEntry` to `policy_weight`.
Add `sync::ValueEntryBuilder` for `sync::BaseCache`.
Change `sync::ValueEntryBuilder` and `sync::entry_info::EntryInfo` from dynamic
dispatch (by trait objects) to enum dispatch. A simple benchmark showed enum
dispatch is slightly (a few %) faster than dynamic dispatch in our use case.
Finish implementing size-aware cache management on `unsync::Cache`.
Update `sync::SegmentedCache` to support weigher function.
Adding finishing touch.
@tatsuya6502 tatsuya6502 marked this pull request as ready for review December 31, 2021 06:28
@tatsuya6502
Copy link
Member Author

Finally, I have got some time to work on this PR. Now it is ready to merge.

Copy link
Member Author

@tatsuya6502 tatsuya6502 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging.

Although all unit tests are passing and I have reviewed the diff several times, I am a bit worried for regressions because this PR is huge (+2,009 -539). I will not release this immediately, but will spend next few days to run additional tests to shake out some bugs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants