Implement record based Crucible reference counting #6805

Open
wants to merge 10 commits into
base: main

Commits on Oct 9, 2024

  1. Implement record based Crucible reference counting

    Crucible volumes are created by layering read-write regions over a
    hierarchy of read-only resources. Originally only a region snapshot
    could be used as a read-only resource for a volume. With the
    introduction of read-only regions (created during the region snapshot
    replacement process) this is no longer true!
    
    Read-only resources can be used by many volumes, so they need a
    reference count that allows them to be deleted once nothing references
    them anymore. The region_snapshot table uses a `volume_references`
    column, which counts how many uses there are. The region table does not
    have this column. Moreover, a simple integer works for reference
    counting but does not tell you _which_ volume each use is from. This can
    be determined (see omdb's validate volume references command), but that
    means reconstructing information that was tossed out, even though Nexus
    knows which volumes use which resources! Instead, record which read-only
    resources a volume uses in a new table.
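
To illustrate the difference between an integer counter and per-volume usage records, here is a minimal in-memory sketch. It is not the code from this pull request: the names `UsageRecords`, `ReadOnlyResource`, `record_usage`, and `remove_volume` are hypothetical, and plain integers stand in for the real identifiers.

```rust
use std::collections::HashSet;

/// Hypothetical identifier type; a real system would likely use UUIDs.
type VolumeId = u32;

/// The two kinds of read-only resource named in the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum ReadOnlyResource {
    RegionSnapshot(u32),
    ReadOnlyRegion(u32),
}

/// One record per (volume, resource) pair, instead of one integer per resource.
#[derive(Default)]
struct UsageRecords {
    records: HashSet<(VolumeId, ReadOnlyResource)>,
}

impl UsageRecords {
    /// Record that `volume` uses `resource`.
    fn record_usage(&mut self, volume: VolumeId, resource: ReadOnlyResource) {
        self.records.insert((volume, resource));
    }

    /// The reference count is derived by counting records.
    fn reference_count(&self, resource: ReadOnlyResource) -> usize {
        self.records.iter().filter(|(_, r)| *r == resource).count()
    }

    /// Unlike an integer, records can say which volumes use a resource.
    fn users_of(&self, resource: ReadOnlyResource) -> Vec<VolumeId> {
        let mut users: Vec<VolumeId> = self
            .records
            .iter()
            .filter(|(_, r)| *r == resource)
            .map(|(v, _)| *v)
            .collect();
        users.sort();
        users
    }

    /// Drop every record for `volume` and return the resources that no
    /// volume references anymore (candidates for deletion).
    fn remove_volume(&mut self, volume: VolumeId) -> Vec<ReadOnlyResource> {
        let mut released: Vec<ReadOnlyResource> = self
            .records
            .iter()
            .filter(|(v, _)| *v == volume)
            .map(|(_, r)| *r)
            .collect();
        self.records.retain(|(v, _)| *v != volume);
        released.retain(|r| self.reference_count(*r) == 0);
        released.sort();
        released
    }
}
```

In this sketch, removing a volume only releases a resource when no other volume still has a record for it, and `users_of` answers the question that a bare counter cannot.
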
    
    As part of the schema change to add the new `volume_resource_usage`
    table, a migration is included that will create the appropriate records
    for all region snapshots.
    
    In testing, a few bugs were found, the worst being that read-only
    regions did not have their read_only column set to true. This is only a
    problem if read-only regions have been created, and they're currently
    only created during region snapshot replacement. To detect whether any
    of these regions were created, find all regions that were allocated for
    a snapshot volume:
    
        SELECT id FROM region
        WHERE volume_id IN (SELECT volume_id FROM snapshot);
    
    A similar bug was found in the simulated Crucible agent.
    
    This commit also reverts oxidecomputer#6728, enabling region snapshot
    replacement again: it was disabled due to a lack of read-only region
    reference counting, which this commit adds.
    jmpesp committed Oct 9, 2024
    ae4282a
  2. 66c6797
  3. 410c79d

Commits on Oct 16, 2024

  1. Explicitly separate region deletion code

    The garbage collection of read/write regions must be separate from
    read-only regions:
    
    - read/write regions are garbage collected by either being deleted
      during a volume soft delete, or by appearing later during the "find
      deleted volume regions" section of the volume delete saga
    
    - read-only regions are garbage collected only in the volume soft delete
      code, when there are no more references to them
    
    `find_deleted_volume_regions` was changed to only operate on read/write
    regions, and no longer returns the optional RegionSnapshot object: that
    check was moved from the volume delete saga into the function, as it
    didn't make sense that it was separated.
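
The split between the two garbage collection paths described above can be sketched as two independent filters. This is a simplified illustration, not the actual `find_deleted_volume_regions` query: the `Region` struct and both function names are hypothetical, and the region snapshot check is assumed here to mean "skip read/write regions that still have a region snapshot".

```rust
/// Simplified, hypothetical view of a region row.
#[derive(Clone, Debug)]
struct Region {
    id: u32,
    read_only: bool,
    /// True once the volume the region was allocated for is soft deleted.
    volume_deleted: bool,
    /// Assumption: a read/write region with a remaining region snapshot
    /// must be kept, so the check lives inside the query function.
    has_region_snapshot: bool,
    /// Number of usage records pointing at this region; only meaningful
    /// for read-only regions.
    references: usize,
}

/// Read/write regions: collected when their volume is deleted.
fn read_write_regions_to_delete(regions: &[Region]) -> Vec<u32> {
    regions
        .iter()
        .filter(|r| !r.read_only && r.volume_deleted && !r.has_region_snapshot)
        .map(|r| r.id)
        .collect()
}

/// Read-only regions: collected only when nothing references them anymore.
fn read_only_regions_to_delete(regions: &[Region]) -> Vec<u32> {
    regions
        .iter()
        .filter(|r| r.read_only && r.references == 0)
        .map(|r| r.id)
        .collect()
}
```

Keeping the two filters separate means a read-only region is never picked up by the read/write path, whatever the state of the volume it was allocated for.
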
    
    This commit also adds checks to validate that invariants related to
    volumes are not violated during tests. One invalid test was deleted
    (regions will never be deleted when they're in use!)
    
    In order to properly test the separate region deletion routines, the
    first part of the fixes for dealing with deleted volumes during region
    snapshot replacement was brought in from that branch: these are the
    changes to region_snapshot_replacement_step.rs and
    region_snapshot_replacement_start.rs.
    jmpesp committed Oct 16, 2024
    2ff34d5
  2. be explicit in comment!

    jmpesp committed Oct 16, 2024
    221751a
  3. reduce denting

    jmpesp committed Oct 16, 2024
    00d9f38
  4. TODO be smart enough

    jmpesp committed Oct 16, 2024
    412d665
  5. 0a45763
  6. fmt

    jmpesp committed Oct 16, 2024
    89744ea
  7. add CONSTRAINT to volume_resource_usage table to separate two enum uses

    add comment to table as well
    jmpesp committed Oct 16, 2024
    bd2ef84
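
The last commit (bd2ef84) adds a CONSTRAINT to the volume_resource_usage table to separate the two enum uses. The rule such a constraint might enforce can be sketched as a validation function over a flat row. The column names below (`usage_type`, `region_id`, `region_snapshot_id`) are assumptions made for illustration and are not taken from the actual schema.

```rust
/// The two kinds of usage a row can describe.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum UsageType {
    ReadOnlyRegion,
    RegionSnapshot,
}

/// Hypothetical flat row: one nullable column group per usage type.
#[derive(Clone, Debug)]
struct UsageRow {
    usage_type: UsageType,
    region_id: Option<u32>,
    region_snapshot_id: Option<u32>,
}

/// Mirrors what a CHECK constraint could enforce: the columns for the
/// row's own usage type are set, and the other type's columns are NULL.
fn satisfies_constraint(row: &UsageRow) -> bool {
    match row.usage_type {
        UsageType::ReadOnlyRegion => {
            row.region_id.is_some() && row.region_snapshot_id.is_none()
        }
        UsageType::RegionSnapshot => {
            row.region_id.is_none() && row.region_snapshot_id.is_some()
        }
    }
}
```

With a rule like this, a row that claims to be a read-only region usage but also carries region snapshot columns is rejected instead of being silently stored.
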