Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for running controller outside of cluster #102

Merged
merged 1 commit into from
Jan 27, 2020

Conversation

MartinWeindel
Copy link
Contributor

What this PR does / why we need it:
If the control plane runs in another cluster, it is needed to specify kubeconfig path for controller and syncer.

Release note:

Support for running the CSI controller outside the controlled cluster.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 2, 2019
@k8s-ci-robot
Copy link
Contributor

Welcome @MartinWeindel!

It looks like this is your first PR to kubernetes-sigs/vsphere-csi-driver 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/vsphere-csi-driver has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @MartinWeindel. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 2, 2019
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 2, 2019
@MartinWeindel
Copy link
Contributor Author

/assign @codenrhoden

@divyenpatel
Copy link
Member

@MartinWeindel what is the use case of running controller outside of the cluster? Isn't this a security hole?

@MartinWeindel
Copy link
Contributor Author

The use case is providing Kubernetes clusters on vSphere by Gardener.
In Gardener, it is standard procedure to run the control plane for an end-user "shoot" cluster on a so called "seed" cluster, which controls up to hundreds of shoot clusters.
This means the end-user "shoot" cluster is masterless and consists only of worker nodes.
Of course the communication to the K8s API server running on the control plane must be secured with TLS and signed certificates.
Take a look on the Gardener architecture diagram if you are interested in more details.

klog.Errorf("BuildConfigFromFlags failed %q", err)
return nil, err
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about defaulting to an in cluster config if BuildConfigFromFlags fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be very confusing. You specify a kubeconfig, but if something fails the controller suddenly tries to control the cluster it is running on.
Better to fail early and report it. If you want to use in-cluster config, just do not provide a kubeconfig.
BTW, your colleges of the vSphere cloud manager are using exactly the same logic, see https:/kubernetes/cloud-provider-vsphere/blob/master/pkg/common/kubernetes/kubernetes.go#L38-L44

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPI have different use cases from CSI. However, on second thought, i agree with the benefits of failing early.

nodes.nodeRegister(obj)
}

func (nodes *Nodes) nodeUpdate(oldObj interface{}, newObj interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you done any testing to validate if nodeUpdate works as expected? I think here that you'd need to unregister the old node object and register the new one if that passes (or vice versa). Otherwise we'd have a long list of nodes registered representing the same node object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A node update typically updates its status fields. Especially it is forbidden to change the providerID of the spec or its name. Therefore the node registration happens with the same values and the node manager is hopefully idempotent. I'm not an expert here, but removing the node registration on an update of its status looks to me like a very dangerous operation.

Copy link

@mandelsoft mandelsoft Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is a timing behaviour. If the CSI controller is already running if new nodes join the cluster
the node object is created first without attributes required by the CSI controller. They are later added by the cloud controller.
But this is then no create event anymore, but an update event. Therefore it is not possible to ignore update events in the CSI controller.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mandelsoft exactly! Not just in the case of new nodes, but also if CSI is initialized before CPI adds ProviderID to all nodes, we have a situation where not all nodes are registered with the ProviderID set. This is the use case that i see for nodeUpdate.
@MartinWeindel it seems that nodeRegister will modify the existing node entry and not add a new one. But we still need some testing to verify

@MartinWeindel
Copy link
Contributor Author

friendly ping
Have you found time for "some testing to verify"? Or is there still anything to be resolved before this PR can be accepted?
Thanks!

@RaunakShah
Copy link
Contributor

@divyenpatel can you review this PR?

@divyenpatel
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 24, 2020
@divyenpatel
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: divyenpatel, MartinWeindel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 27, 2020
@divyenpatel
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 27, 2020
@k8s-ci-robot k8s-ci-robot merged commit 8487000 into kubernetes-sigs:master Jan 27, 2020
jsafrane pushed a commit to jsafrane/vsphere-csi-driver that referenced this pull request Jan 3, 2024
…ency-openshift-4.16-vmware-vsphere-syncer

OCPBUGS-24961: Updating vmware-vsphere-syncer-container image to be consistent with ART
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants