-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPEDGE-919: MicroShift: Replacing upstream TopoLVM with a minified version of LVMS #1601
OCPEDGE-919: MicroShift: Replacing upstream TopoLVM with a minified version of LVMS #1601
Conversation
@jakobmoellerdev: This pull request references OCPEDGE-919 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.16." or "openshift-4.16.", but it targets "openshift-4.17" instead. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@jakobmoellerdev: This pull request references OCPEDGE-919 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.16." or "openshift-4.16.", but it targets "openshift-4.17" instead. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
48eea89
to
f298d84
Compare
/jira-refresh |
85238f5
to
3201d20
Compare
@jakobmoellerdev: This pull request references OCPEDGE-919 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
c3df851
to
72ca88c
Compare
LGTM |
ca18c5b
to
58a0bc0
Compare
@jakobmoellerdev: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
* The LVMCluster is configured with `.spec.configurationPolicy=ExternalConfiguration` | ||
which will trigger a deployment of the driver which can use the existing | ||
deviceClasses and storageClasses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an alternative approach, have we considered a migration where /etc/microshift/lvmd.yaml
is read and transposed into a LVMCluster
object? The lvmd.yaml file is a user facing API config, so I imagine it would be eventually deprecated and phased out in favor of a user defined LVMCluster
objects. By coding this migration into MicroShift, we would be setting ourselves up for success when we want to transition users away form TopoLVM paradigms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this has been discussed in the last architecture call and is also (at least to the best of my knowledge) answered in https:/openshift/enhancements/pull/1601/files#diff-e0dc1c4db255868fe6b65d8b8187629884953cb3900650f9b9a755fc24fcd33cR151
To your question
/etc/microshift/lvmd.yaml is read and transposed into a LVMCluster object?
The reason that is impractical is because lvmd.yaml is the single source of truth for TopoLVM and the CSI Driver, not LVMCluster, meaning that transposing would just mean we would write them right back to lvmd.yaml on the next reconciliation. the lvmd.yaml is the user facing API of the underlying CSI Driver, LVMCluster is the user facing API of LVMS. They do overlap, but the feature set of LVMS is significantly reduced by the fact that lvcreate-options and custom volume groups are not allowed.
We (@suleymanakbas91 @brandisher @DanielFroehlich and myself) believe that doing this would mean explosion in support cases we would be needing to tackle since potentially all external volume groups and logical volumes suddenly become supported scenarios for LVMCluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess to summarize: The only way we would be able to make this work is by
- Creating an API Extension for LVMCluster that is covering lvcreate-options in its entirety, as well as volume group creation settings in its entirety.
- Creating an API Extension for LVMCluster that allows for static provisioning, similar to what MicroShifts install time volume groups currently do
- Adding Supported Scenarios / Tests for all known dynamic LVM configurations in use (LVM Software Raid being the most problematic one because it conflicts with LVM Thin Pool configurations) in both QE setup as well as E2E
- Creating a migration tool just for MicroShift which can transition deviceClasses discovered outside of LVMCluster to be put into LVMCluster.
This was mainly rejected due to 1/2/3, since we believe that creating a "wrapper" API around the original low-level API is not useful at best and hard to maintain at worst if we expose / transpose all settings 1:1 just into a CRD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TopoLVM and the CSI Driver, not LVMCluster, meaning that transposing would just mean we would write them right back to lvmd.yaml on the next reconciliation.
Ah, that's the flaw in my logic :D Thanks for the well thought answer!
* If LVMS does not detect a lvmd.yaml file (at runtime), it will error out, | ||
as an external configuration is now mandatory for correct configuration. | ||
Note that if the lvmd.yaml file is not present when starting MicroShift, we should | ||
[keep the current behavior of not enabling LVMS and skip deploying it from the assets.](https:/openshift/microshift/blob/main/pkg/components/storage.go#L83) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is an existing lvmd.yaml a current requirement for LVMS? I had thought LVMS wrote the lvmd.yaml prior to deploying the topolvm containers. If that's not the case, then...
Continuing the above comment, this is essentially coding topolvm validations into LVMS, when it should be the opposite direction. LVMS should infer a missing lvmd.yaml to mean "don't deploy lvms" until we officially deprecate the lvmd.yaml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will reference here my comment above and mention that lvmd.yaml is the single source of truth for TopoLVM. So when LVMS does not detect LVMClusters it does not attempt to setup a vgmanager which would setup the lvmd.yaml. However this requires running the lvm-operator in advance which is just the way that LVMS is working within the OLM world (operator deploying other operator based on CRD).
LVMS should infer a missing lvmd.yaml to mean "don't deploy lvms" until we officially deprecate the lvmd.yaml.
In principal this does work, however for LVMS to infer a missing lvmd.yaml we would need a runtime component in MicroShift before we actually want to run LVMS. (a component that discovers lvmd.yaml and determines if to deploy). This is currently done at install time of MicroShift and imo a much better solution (as it does not require a runtime component) so I did not attempt to change this flow.
For lvmd.yaml to be deprecated it must no longer be used by the CSI Driver. We would need to create a RFE to change the fundamental configuration of TopoLVM which will most likely get rejected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I had only half explained my thought process - I was thinking that a backend lvmd.yaml would be where we'd want to go, either as a config map in the cluster, or under the datadir that users shouldn't mess with /var/lib/microshift/...
.
That said, I see your point and agree.
If we would switch the stance to migrate existing MicroShift Installations fully to LVMCluster, we would need to | ||
maintain this migration while ensuring that existing RAID configurations are still supported. This is not feasible | ||
because a custom created volume group without constraints is introducing a lot of complexity to a potential test matrix | ||
that would explode the scope of LVMCluster testing and supported scenarios for OpenShift, which we expressively not want. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's happening in reality? Do we have any MicroShift customers who are utilizing the raid settings in the lvmd.yaml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I would have to defer to you and @DanielFroehlich here. I do not know the MicroShift customers.
That being said, the RFE is written in a way that handles potentially supported cases. It starts with looking at https://access.redhat.com/documentation/en-us/red_hat_build_of_microshift/4.15/html/storage/microshift-storage-plugin-overview#microshift-lvmd-config-example-basic_microshift-storage-plugin-overview where Thick Provisioning would be a case that wasn't even supported until 4.16, and it continues to LVM Software RAID, LVM stripe configurations or pretty much any other custom advanced volume group settings.
Thanks for the well written responses. This LGTM |
LGTM |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jerpeter1 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jerpeter1 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This enhancement describes the proposed strategy to move from repackaged
upstream TopoLVM to a minified version of LVMS in MicroShift starting with
4.164.17.