Add "DaemonSet create-first rolling update" design proposal. #977

Closed
wants to merge 4 commits

Conversation


@diegs commented Aug 24, 2017:

This document captures the results of the discussion here:

kubernetes/kubernetes#48841

Relevant feature:

kubernetes/enhancements#373

cc @kow3ns @erictune @roberthbailey @janetkuo @luxas @lpabon @aaronlevy @kubernetes/sig-cluster-lifecycle-feature-requests @kubernetes/sig-apps-feature-requests

@k8s-ci-robot added labels: cncf-cla: yes, sig/cluster-lifecycle, kind/feature, sig/apps (Aug 24, 2017).
@k8s-ci-robot (Contributor):

@diegs: Reiterating the mentions to trigger a notification:
@kubernetes/sig-cluster-lifecycle-feature-requests, @kubernetes/sig-apps-feature-requests.

In response to this:

This document captures the results of the discussion here:

kubernetes/kubernetes#48841

Relevant feature:

kubernetes/enhancements#373

cc @kow3ns @erictune @roberthbailey @janetkuo @luxas @lpabon @aaronlevy @kubernetes/sig-cluster-lifecycle-feature-requests @kubernetes/sig-apps-feature-requests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

then a rolling update of the controller-manager would delete all the old pods
before the new pods were scheduled, leading to a control plane outage.

This feature is a blocker for supporting self-hosted Kubernetes in `kubeadm`.
Member:

is it a blocker with a HA master node setup with maxUnavailable set to 1?

Member:

This doesn't affect HA deployments, given that all components talk to a load-balanced endpoint of some kind.

Member:

@diegs We could point out that this is for the one-master case only, but that it is a generally useful feature, not a kubeadm specific thing.

Author:

Done


### Considerations / questions

1. How are `hostPort`s handled?
Member:

aren't hostPorts likely to be common in self-hosted scenarios (one of the driving use cases for this proposal)?

Member:

This proposal is simply incompatible with hostPorts, barring some really clever work that isn't on any roadmap.

Member:

@liggitt In kubeadm's case, no. We're using DaemonSets with hostNetworking (due to another scheduling bug).

Member:

> We're using DaemonSets with hostNetworking

if you are using host networking, won't your new containers still be unable to bind to the same ports (since they'd still be held by the old container), and never go healthy, and fail upgrade?

Member:

SO_REUSEPORT?

What I alluded to with "really clever work" would be to emulate that in the port-forwarding path, but that sounds disastrous.
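
For readers skimming this thread, a minimal sketch of what the SO_REUSEPORT suggestion would mean in practice for a host-networking daemon, assuming the application itself opts in; the address and function name are illustrative, and this is application-level work, not something the DaemonSet controller would do:

```go
// Minimal sketch (not from the proposal): both the outgoing and the incoming
// pod set SO_REUSEPORT before binding, so they can share the port during the
// handoff.
package example

import (
	"context"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

// listenReusePort binds addr (e.g. ":9100") with SO_REUSEPORT enabled so a
// second instance on the same node can bind it too while this one drains.
func listenReusePort(ctx context.Context, addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			if err := c.Control(func(fd uintptr) {
				sockErr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
			}); err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(ctx, "tcp", addr)
}
```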

Author:

I would argue that special cases for hostPorts, hostNetworking, and other scarce resources like GPUs should not be a goal here. In particular, the behavior you describe is also true of Deployments.

For example (and I tested this), create a deployment where replicas = worker nodes, and use a hostport. Set maxUnavailable to 0, and then update your deployment. The new pods will be stuck in pending since the scheduler can't place the new pods anywhere.

This behavior is also true for DaemonSets as you point out (updated in the doc). And yes, I can see how DaemonSets may more often use hostPorts / hostNetworking in practice, but I think the scheduler's job is to place pods subject to the intersection of their resource constraints, not do resource planning for you.

We should absolutely call out this risk in the documentation. But I don't think that it should be this code's job to try to foresee potential resource conflicts.
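
A sketch of the experiment described above, using today's apps/v1 Go types for illustration (the 2017 discussion used extensions/v1beta1; the names, image, and port values are made up): with replicas equal to the worker-node count, a hostPort, and maxUnavailable=0, the scheduler has nowhere to place the surged pods, so they stay Pending.

```go
// Illustrative reproduction of the "stuck Pending" Deployment described
// above; not part of the proposal.
package example

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func stuckDeployment(workerNodes int32) *appsv1.Deployment {
	zero := intstr.FromInt(0)
	one := intstr.FromInt(1)
	labels := map[string]string{"app": "hostport-demo"}
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "hostport-demo"},
		Spec: appsv1.DeploymentSpec{
			Replicas: &workerNodes, // one replica per worker node
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Strategy: appsv1.DeploymentStrategy{
				Type: appsv1.RollingUpdateDeploymentStrategyType,
				RollingUpdate: &appsv1.RollingUpdateDeployment{
					MaxUnavailable: &zero, // never delete first...
					MaxSurge:       &one,  // ...so the update must surge
				},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "web",
						Image: "nginx:1.25",
						Ports: []corev1.ContainerPort{{
							ContainerPort: 80,
							HostPort:      8080, // the scarce per-node resource
						}},
					}},
				},
			},
		},
	}
}
```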

Member:

I basically agree. I don't think it is tractable to try to fix this special case.

Comment:

This should be documented and highlighted. The documentation can point the user to the "delete before add" model.

Also, is it possible to test for a pod having a hostPath and refuse the "add before delete" option? (Update: it is noted in the document on line 136.)

Author:

I will update the documentation once we come to a decision about what (if any) constraints will be checked.

I think the two proposed approaches still in contention are:

  1. Add very visible documentation saying that certain classes of resource types (such as host networking, host ports, etc) may lead to unschedulable situations when using this strategy.
  2. Add documentation as in 1, but also do a hard validation check to disallow using this strategy in conjunction with hostPorts specifically.

I am becoming convinced that option 2 is somewhat reasonable since the DaemonSet controller has an explicit scheduling check for hostPort contention. But I would not be in favor of adding checks for any other resource types, at least with the initial implementation.
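
A rough sketch of what the hard check in option 2 could look like; the function name and error text are hypothetical, and the real validation would live alongside the existing DaemonSet validation:

```go
// Hypothetical validation for option 2: reject any DaemonSet that combines
// the proposed surging strategy with a hostPort in its pod template.
package example

import (
	"fmt"

	appsv1beta2 "k8s.io/api/apps/v1beta2"
)

func validateNoHostPortsForSurge(ds *appsv1beta2.DaemonSet) error {
	for _, c := range ds.Spec.Template.Spec.Containers {
		for _, p := range c.Ports {
			if p.HostPort != 0 {
				return fmt.Errorf("container %q uses hostPort %d, which is incompatible with the SurgingRollingUpdate strategy", c.Name, p.HostPort)
			}
		}
	}
	return nil
}
```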

@thockin assigned liggitt and janetkuo and unassigned thockin (Aug 24, 2017).

1. How are `hostPort`s handled?

They are not handled as part of this proposal. We can either:
Contributor:

Allowing a non-working configuration is a bad idea. I think option 3 should be implemented.

Author:

As I mentioned above, you can reach this state with Deployments too. I'll defer on picking an answer to this question until we resolve that discussion.


### Alternatives considered

The `maxSurge` parameter could have been added to the existing `RollingUpdate`
strategy (as is the case for Deployments). However, this would break backwards
Member:

break validation backwards compatibility

Please clarify that while all this would have been technically possible due to the shift to v1beta2, it was discarded just to be on the safe side.

Author:

Done




They are not handled as part of this proposal. We can either:

1. Attempt to determine a mechanism to handle this (unclear)
Member:

point out that this may be a future improvement, but will not be taken into consideration right now.
Possibly for GA?

2. Note in the documentation that this may cause port conflicts that prevent
new pods from scheduling successfully (and therefore updates from completing
successfully)
3. Add a validation check that rejects the combination of `hostPort` and the
Member:

👍 please point out as the accepted proposal

Member:

host ports are only one possible scarce resource (albeit the most likely). validating just that still allows for impossible-to-satisfy updates involving scarce resources, which is worse for daemonsets than for deployments because there is no possibility of finding another node with the available resources.

Contributor:

As far as I remember, the DaemonSet controller checks only for hostPort conflicts. It will not allow the new pod to run on the node, and that will cause a deadlock.


TODO(diegs): investigate this.

3. How are other shared resources (e.g. local devices, GPUs, specific core
Member:

How do deployments handle this?

@liggitt (Member), Aug 24, 2017:

they rely on the scheduler to locate nodes with available resources

Author:

Agreed, and if you think of DaemonSets as a special case of a Deployment with numReplicas = numNodes and antiaffinity then the scheduling will degenerate in the same way if you are using scarce resources.

It would require a lot of special-case engineering to work around all possible issues, and I'm sure there will always be more.

pods first, cleaning up the old pods once the new ones are running. As with
Deployments, users will use a `maxSurge` parameter to control how many nodes
are allowed to have >1 pod running during a rolling update.

Member:

Please add the modifications that you intend to make to the API. Generally, we add the types and type modifications that we intend to make as part of the design. The design docs for StatefulSet and DaemonSet RollingUpdate are examples.

Author:

Done
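
For readers of this thread, a sketch of roughly what the added types could look like; the updated proposal text is authoritative and the exact names may differ:

```go
// Sketch of the proposed API addition; field and constant names are assumed
// from the discussion, not copied from the final proposal.
package example

import "k8s.io/apimachinery/pkg/util/intstr"

type DaemonSetUpdateStrategyType string

const (
	OnDeleteDaemonSetStrategyType             DaemonSetUpdateStrategyType = "OnDelete"
	RollingUpdateDaemonSetStrategyType        DaemonSetUpdateStrategyType = "RollingUpdate"
	SurgingRollingUpdateDaemonSetStrategyType DaemonSetUpdateStrategyType = "SurgingRollingUpdate" // proposed
)

// SurgingRollingUpdateDaemonSet configures the proposed "create first, then
// delete" update.
type SurgingRollingUpdateDaemonSet struct {
	// MaxSurge bounds how many nodes may run both an old and a new pod at
	// the same time during the update.
	MaxSurge *intstr.IntOrString `json:"maxSurge,omitempty"`
}
```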

`maxUnavailable` to be > 0, but this is not required with
`SurgingRollingUpdate`.

There are plans to create a `v1beta2` iteration of the extensions API, so it
Member:

The v1beta2 API is enabled by default on master. We want to promote that surface to GA in 1.9 provided we get good feedback on the v1beta2 surface in 1.8.

Author:

Noted

may be possible to add the `maxSurge` strategy to `RollingUpdate` and deprecate
`SurgingRollingUpdate` then.

For more background see https://github.com/kubernetes/kubernetes/issues/48841.
@kow3ns (Member), Aug 24, 2017:

Why is your proposal a better alternative to creating a new DaemonSet and only deleting the current DaemonSet after the first has saturated the desired number of Nodes? Can you discuss this more here?

Author:

Can you elaborate on this strategy? Do you mean having the user create a new DaemonSet with a different name, and then deleting this one?

If so I can think of many reasons (retaining history, minimizing clutter, actually being able to introspect the status of the update, etc.) but I want to make sure I understand which you mean.

Member:

Agree. Multiple DaemonSets should be listed as an alternative here, too.

### Considerations / questions

1. How are `hostPort`s handled?

Member:

I don't think it's acceptable to completely punt on hostPort conflicts. They are the most common way of communicating with applications in a DaemonSet. Using validation to block the creation of DaemonSets with this updateStrategy (as you suggest below) is one option, but it really limits the usefulness of the updateStrategy.

Author:

See above.

Member:

Agree, hostPort conflicts need to be handled for SurgingRollingUpdate to be useful.

Author:

@janetkuo handled in what sense? Can you comment on the discussion immediately above this one? We could block it in validation (since I guess the code already has a special case to block it from a scheduling perspective), but besides that I'm not sure what else we can do.

3. Add a validation check that rejects the combination of `hostPort` and the
`SurgingRollingUpdate` strategy.

2. How will the scheduler handle hostPort collisions?
Member:

Unless application designers implement clever coordination, it will attempt to schedule both Pods, and one of them will continuously fail.

@kow3ns (Member), Aug 24, 2017:

Is that true for the default scheduler as well (assuming we move to it one day)?

@lukaszo (Contributor), Aug 24, 2017:

I think that with the default scheduler the pod will be created and will stay in the Pending state.

Author:

That is the behavior I observed.

`RollingUpdate` strategy only deletes pods up to the `maxUnavailable` threshold
and then waits for the main reconciler to create new pods to replace them).

One minor complication is that the controller-manager's reconciliation loop
@kow3ns (Member), Aug 24, 2017:

This isn't actually true. One Pod per node is best effort. In contrast to this proposal, which would relax this constraint, the community has requested exactly one Pod per node to provide better guarantees with respect to the overlap of lifecycles of the Pods in their DaemonSets. Right now it is possible for DaemonSet to schedule a new Pod while another Pod is still terminating. We consider this to be a bug (kubernetes/kubernetes#50477).

Author:

Clarified that the invariant is not 100% enforced yet but it is a goal.

This strategy is incompatible with the invariant, so either we need to relax it during updates as mentioned, or not implement this.

Member:

That bug is unexpected and was recently introduced by kubernetes/kubernetes#43077 (1.8). This behavior (don't create new pods until the old one is terminated) has been supported by DaemonSet controller for a long time.

3. How are other shared resources (e.g. local devices, GPUs, specific core
types) handled?

This is a generalization of the `hostPort` question. I think that in this case
Member:

I think that consuming GPU and specific core types can be considered out of scope. It is not a common use case for DaemonSets. For hostPath volumes, you could perhaps advise using filesystem locks to prevent data corruption when applications use this update strategy. If you combine filesystem locks, a burden on application implementors (at least initially), with a robust solution to hostPort conflicts, you have something that most applications can consume.


This feature is a blocker for supporting self-hosted Kubernetes in `kubeadm`.

### Implementation plan
Member:

There are a few things which aren't clear to me.

  1. When can I select this strategy? Does it have to be set on creation? Can I change the updateStrategy from RollingUpdate to SurgeRollingUpdate (currently updateStrategy is mutable)?
  2. How will the controller handle the case where the updateStrategy is mutated to SurgeRollingUpdate when RollingUpdate was previously selected and an update is in progress.
  3. How will the controller handle the case where the updateStrategy is mutated to RollingUpdate and SurgeRollingUpdate was previously selected and an update is in progress.
  4. Is this on or off by default? Do you intend to use feature gate flags to make this an alpha feature, or will all clusters immediately get this feature?

Author:

Added answers in the "Considerations / questions" section.

Author:

Regarding alpha/beta/feature gate, I defer to @luxas and others.

`maxUnavailable` to be > 0, but this is not required with
`SurgingRollingUpdate`.

There are plans to create a `v1beta2` iteration of the extensions API, so it
Member:

> There are plans to create a v1beta2 iteration of the extensions API

do you mean v1beta2 of the apps API group? The goal is to eliminate the extensions API group, not perpetuate it

Author:

Fixed

Member:

Yes it should be apps/v1beta2. Since apps/v1beta2 is introduced in 1.8, this validation change can be made in apps/v1beta2 only if it's implemented in 1.8.

@k8s-ci-robot added the size/L label and removed the size/M label (Aug 25, 2017).
@diegs (Author) left a review comment:

Addressed comments and updated doc.

I used a new commit so we wouldn't lose the comment threads. I can squash before I submit.


//---
// TODO: Update this to follow our convention for oneOf, whatever we decide it
// to be. Same as Deployment `strategy.rollingUpdate`.
// See https://github.com/kubernetes/kubernetes/issues/35345
Comment:

fix indentation?

Author:

Done

strategy.

This document presents a design for a new "add first, then delete" rolling
update strategy, called `SurgingRollingUpdate`. This strategy will create new
Member:

If this new DaemonSet update strategy is similar to Deployment's RollingUpdate (i.e. MaxSurge & MaxUnavailability), it can support both "delete first" and "add first" (depending on whether MaxUnavailability is 0 or not).

Author:

This is noted in Alternatives Considered below. It was decided in kubernetes/kubernetes#48841 to add a new strategy so validation would remain backwards-compatible.

Comment:

I think it's a great idea to converge on strategies across all controllers. We could add MaxSurge and MaxUnavailability and keep the defaults as MaxSurge=0 and MaxUnavailability=1 to keep it backward compatible. @diegs how does this break validation?
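
A sketch of the alternative discussed here, i.e. folding a surge knob into the existing strategy rather than adding a new one; field names are illustrative, and the backwards-compatibility concern is that validation currently requires maxUnavailable > 0:

```go
// Sketch of the "converge with Deployments" alternative; not what the
// proposal chose.
package example

import "k8s.io/apimachinery/pkg/util/intstr"

type RollingUpdateDaemonSet struct {
	// Existing field; validation currently requires it to be greater than 0.
	MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`

	// New field. Defaulting it to 0 keeps today's behavior, but allowing
	// surge-only updates means relaxing the MaxUnavailable > 0 rule, which
	// is the compatibility concern raised in this thread.
	MaxSurge *intstr.IntOrString `json:"maxSurge,omitempty"`
}
```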

2. Determine how many previous-generation pods can be retired once their
replacement pods are scheduled and running.
3. Instruct the controller-manager to add and delete pods based on the results
of (1) and (2).
Member:

SurgingRollingUpdate will create a new DaemonSet pod before the old one gets terminated on the same node, correct?

For example, let's say you have 3 nodes, maxSurge=1, maxUnavailable=0. Node 1 has 2 pods, and each of the other 2 nodes has 1 pod (4 pods in total). Then you can only delete pods on node 1, but not on the other two.

If this is the requirement, this is not just about pods anymore, you need to mention nodes here.

Author:

Correct. Clarified.




strategy (as is the case for Deployments). However, this would break backwards
compatibility, since the `maxUnavailable` parameter currently requires
`maxUnavailable` to be > 0, but this is not required with
`SurgingRollingUpdate`.
Member:

Also, most DaemonSet applications can't tolerate two pods running at the same time. That's why RollingUpdate is implemented that way (delete first, then create).


This document presents a design for a new "add first, then delete" rolling
update strategy, called `SurgingRollingUpdate`. This strategy will create new
pods first, cleaning up the old pods once the new ones are running. As with
Deployments, users will use a `maxSurge` parameter to control how many nodes
Comment:

What happens if maxSurge is zero? Would that interrupt the algorithm in this paper? Also, what is the default value for maxSurge?

Author:

The default is 1, and it must be greater than 0. Updated the doc to say as much.



2. Note in the documentation that this may cause port conflicts that prevent
new pods from scheduling successfully (and therefore updates from
completing successfully)
3. Add a validation check that rejects the combination of `hostPort` and the
Comment:

👍

@diegs (Author) left a review comment:

@janetkuo @liggitt any chance we can close this out so we can move back over to the PR?

I think the main open questions are:

  • Whether we add special validation for hostPorts
  • Whether this is feature gated somehow, or classified as alpha / beta

Thanks,
Diego




@smarterclayton (Contributor) commented Oct 12, 2017:

To expand a bit: every pod has a phase change on deletion:

  1. running
  2. graceful termination
  3. terminated
  4. removed from etcd

ReplicaSets create pods at step 2 always, because they are replicas (we provide "at-least-1" semantics). StatefulSets create pods at step 4 because they are strongly tied to pod identity (we provide "at-most-1" semantics). DaemonSets are in the middle: they are not replicas, but they are not tied to pod identity either. Instead, they are tied to node identity. The real decision that a user has to make is whether they can tolerate at-most-1 or at-least-1. The issue triggered by kubernetes/kubernetes#50477 was a user who cared about at-most-1. This issue was triggered by a user who needs at-least-1. Both are valid use cases.

The behavior in a distributed system that is required to guarantee at-most-1 is pretty aggressive - I don't think we currently guarantee that with daemonsets (for instance, just run two controller managers at the same time and they will race to create multiple pods on the node). So the user in 50477 is almost certainly in for a rude awakening at some point when they depend on at most one and we don't provide it. In a sense, that's a violation of #124 and we should fix that as soon as we can.

I think we probably need to either commit to giving the users a choice, or to EXPLICITLY always be at-least-1 so that users aren't surprised when we pretend to be at-most-1 but then fail to do so. Rule of thumb is that pretending to be at-most-1 is the worst thing you can do - there is no "try".

The upgrade strategy is subordinate to this choice - a rolling update strategy has to respect at most 1 or at least 1, but is probably free to do the right thing as long as the guarantee isn't violated. Any workload that supports at-least-1 probably wants the surge behavior on rolling update, no matter what.

@jbeda (Contributor) commented Oct 12, 2017:

I'm not totally read in to this stuff but the at-most-1 vs. at-least-1 seems to be a useful way to break this down.

For at-least-1 we'd want to have some patterns well documented on how to do a graceful handoff from the old thing to the new thing. That could be just a graceful shutdown or it could also be a migration of data to the new thing to keep state around and alive. This is a bit of a "live migration" type of situation.

@diegs (Author) commented Oct 12, 2017:

Yes, I was under the assumption that no part of k8s guaranteed at-most-1, and was surprised by kubernetes/kubernetes#50477. I completely agree that it will probably be a difficult constraint to truly guarantee in perpetuity.

I also agree that it's orthogonal to the strategy; maxUnavailable > 0 with maxSurge = 0 will terminate pods before scheduling new ones, and maxUnavailable = 0 with maxSurge > 0 will schedule new pods before terminating old ones; whether there are actually 0, 1, or 2 pods running in either scenario is not guaranteed.

Going back to this proposal, the maxSurge parameter in the strategy as I implemented it was not affected by kubernetes/kubernetes#50477. But it goes back to what I said about preferring explicitly implementing the semantics we want instead of relying on implementation details that may change. In the surging strategy we will always attempt to schedule a new pod before setting the old one to terminating, so that gives a higher chance of success than waiting for the old pod to be in a terminating state before scheduling the new one, as is the behavior with the existing strategy.

EDIT to clarify my last point above:

With the existing strategy the scheduler first sets a pod to terminating. Then it schedules a new pod on that node. Even with a long timeout, something could happen between those two steps in which the old pod terminates. If that pod was the only controller manager then there is nothing left to schedule the new pod.

In the new strategy the new pod would be scheduled before setting the old one to terminating. Even if the new pod doesn't come up before the old one completely terminates, it should eventually (it comes down to the node at that point). It's not a guarantee but it removes a major point of failure.
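
A sketch of that ordering for a single node, written with current client-go calls for concreteness; this is not the controller implementation, and the readiness check and timeouts are simplified:

```go
// Illustrative per-node ordering for the surging strategy: create the
// replacement, wait for it, and only then delete the old pod.
package example

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func surgeOneNode(ctx context.Context, c kubernetes.Interface, oldPod, replacement *corev1.Pod) error {
	// 1. Create the replacement while the old pod is still running.
	created, err := c.CoreV1().Pods(replacement.Namespace).Create(ctx, replacement, metav1.CreateOptions{})
	if err != nil {
		return err
	}

	// 2. Wait until the replacement is Running before touching the old pod.
	for {
		p, err := c.CoreV1().Pods(created.Namespace).Get(ctx, created.Name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if p.Status.Phase == corev1.PodRunning {
			break
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("waiting for %s: %w", created.Name, ctx.Err())
		case <-time.After(2 * time.Second):
		}
	}

	// 3. Only now delete the old pod. Even if it disappears early, the
	// replacement has already been created and bound to the node.
	return c.CoreV1().Pods(oldPod.Namespace).Delete(ctx, oldPod.Name, metav1.DeleteOptions{})
}
```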

@smarterclayton (Contributor):

If we decide that daemonsets should be at-least-1, then there is no real need for the existing update strategy to be doing what it is doing today. In which case surge should be part of the existing strategy like ReplicaSets and we don't need or want a new strategy, we just want rolling update to do the right thing.

If we are going to support at-most-1 we need to decide on that before we go to v1.

I would argue anyone can implement at-most-one on a node by using a shared filesystem lock. While that requires a shared hostPath volume, it's a common daemonset use case. Therefore, that's an argument that we should be at-least-one because we support a workaround.

What's clear is that this discussion blocks taking daemonset to GA. We need to follow up and ensure we reach closure on something this fundamental before going GA.

@diegs (Author) commented Oct 12, 2017:

@smarterclayton agree that this could be implemented by adding maxSurge to the existing strategy; it was not to retain backwards-compatibility for validation.

The underlying motivation here was to achieve parity with the Deployment rolling update strategy, which has both params.

@smarterclayton (Contributor):

Two clarifications:

  1. Stateful sets guarantee at most one pod per identity and do so safely. We guarantee at most one pod with a given name is running on the cluster at any one time.
  2. Safety rules and ensuring user data is safe trumps backwards compatibility. If we determine our current semantics are confusing users that they get at most 1, it's within our remit to break users in order to make them safe.

The best thing now is for interested parties to make a case for either at-most-1 as a feature or as a supported workaround as Joe noted. That will clarify next steps - reverting / altering 50477, an API change to DS to let a choice be possible, or delaying DS GA until we can make at-most-1 correct.

@janetkuo (Member) commented Oct 12, 2017:

FWIW kubernetes/kubernetes#50477 is a regression (caused by kubernetes/kubernetes#43077). DaemonSet controller had at-most-1 logic implemented long before kubernetes/kubernetes#50477. DaemonSet controller will not try to create DaemonSet pod on a node until it finds all the previous ones are gone.

StatefulSet uses index as part of pod name to guarantee at-most-1 pod per index. A possible way to guarantee at-most-1 pod per node for DaemonSet is to use node name as part of DaemonSet pod's name, but the pod name might become too long.
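
A hypothetical sketch of that naming idea (not proposed API): derive a deterministic, node-unique pod name instead of using generateName, hashing the node name so the result stays within name-length limits.

```go
// Hypothetical deterministic DaemonSet pod name: exactly one possible name
// per (DaemonSet, node) pair, so the API server itself rejects a duplicate.
package example

import (
	"fmt"
	"hash/fnv"
)

func daemonPodName(dsName, nodeName string) string {
	h := fnv.New32a()
	h.Write([]byte(nodeName))
	// e.g. "kube-proxy-1a2b3c4d": bounded length, but the node name is no
	// longer visible in the pod name, which is part of the trade-off.
	return fmt.Sprintf("%s-%08x", dsName, h.Sum32())
}
```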

@kow3ns (Member) commented Oct 13, 2017:

I don't think this decision should gate v1. DaemonSet controller was always designed to attempt at most one semantics, but I believe we could loosen or tighten this constraint as a feature, post GA, without breaking backwards compatibility.

For at most one:

  1. I'm not entirely convinced that trying to tighten the constraint (e.g. by incorporating the Node name in the generated Pod name) would be a breaking change if we turned this on in the general case.

    1. The general expectation is that the controller is already trying to achieve at most one.
    2. Name generation for DaemonSet Pods is random, so if we use the Node name to make the Pods Node unique it doesn't violate users' expectations (for DaemonSets the name of a Pod is an implementation detail, not a conformance requirement).
  2. If we think it is a breaking change, then we could make it opt-in via the addition of a flag on the DaemonSet Spec (e.g. spec.nameGeneration=Random|Node) where the default uses random name generation.

  3. As pointed out above, applications can use advisory file system locks for tighter guarantees of uniqueness to achieve stricter semantics where necessary, so I don't think we have to provide stronger guarantees as a prerequisite for promotion.

For at least once:

  1. We can modify the update strategy to use maxSurge, or add a new UpdateStrategy. In either case the new behavior can be implemented without breaking backward compatibility.

Also, we should not revert #50477. This reverted a change that introduced a bug where DaemonSet controller would launch Pods while a current Pod was still terminating. Reverting this will not implement maxSurge behavior.

@smarterclayton (Contributor) commented Oct 13, 2017:

> DaemonSet controller was always designed to attempt at most one semantics, but I believe we could loosen or tighten this constraint as a feature, post GA, without breaking backwards compatibility.

Looking at the code we'd need to tighten at least a few things to guarantee that (as you mentioned with node name). But I am a -1 on declaring something GA that has even the potential to violate a user assumption or violate pod safety guarantees, and that's where we are today. My working assumption was that daemonset is at-least-one because we are using generateName, and you can't use generateName and get at-most-one. The statement that it's at-most-one is surprising to me.

> We can modify the update strategy to use maxSurge, or add a new UpdateStrategy

at-most-1 or at-least-1 is not about update strategy. It's a spec property independent of update. In the absence of an update during a drain or eviction the daemonset should follow the same property. It's fundamental to daemonset, the same way pod management policy applies regardless of update strategy.

> Also, We should not revert #50477

Agree, but not-quite at-most-1 is also not the right thing to ship. By revert I meant, "don't almost give at-most-1 as the user who requested #50477 raised".

> The general expectation is that the controller is already trying to achieve at most one.

I think cluster-lifecycle's position is that it's actually not what they need, and other issues (cursory search includes kubernetes/kubernetes#51974 and kubernetes/kubernetes#48307) makes it less than clear. Looking at my own use cases, I have a small number that want at-most-one and a much larger number that want at-least-one for rolling update.

> As pointed out above, applications can use advisory file system locks for tighter guarantees of uniqueness to achieve stricter semantics where necessary, so I don't think we have to provide stronger guarantees as a prerequisite for promotion.

I don't think this is the right mindset for providing an API to users based on our previous history here on statefulsets and deployments. Providing an ALMOST correct guarantee creates a scenario where users expect it to work except when it doesn't. That means it fails at the worst time and either eats data or blows up a system. Either we implement at-most-1, or we implement at-least-1 and tell people how to get at-most-1. We've never shied away from doing the right thing for workload APIs before, not sure why we should do it here.

I'd probably be ok with a v1 that did at-least-1 that could be turned to almost-at-most-1 with caveats and warnings, where in the future we'd implement correct at-most-1. I can't reconcile shipping almost-at-most-1 only given #124 and history with statefulset.

@bgrant0607 (Member):

Quick comment: Any application that really needs at most one on a node (e.g., because multiple instances would otherwise trash persistent data on local disk) absolutely has to use a file lock or other OS locking mechanism. We should document that. K8s cannot and should not provide application-level exclusion/handshaking.
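
A minimal sketch of the kind of node-level lock being recommended, assuming the daemon mounts a shared hostPath directory; the path and error handling are illustrative:

```go
// The daemon takes an exclusive flock on a file under a shared hostPath
// volume before touching node state; a second pod on the same node fails
// fast instead of racing the first.
package example

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

func acquireNodeLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0644)
	if err != nil {
		return nil, err
	}
	// LOCK_NB makes the second instance fail immediately; drop it to block
	// until the old pod releases the lock instead.
	if err := unix.Flock(int(f.Fd()), unix.LOCK_EX|unix.LOCK_NB); err != nil {
		f.Close()
		return nil, fmt.Errorf("another instance holds %s: %w", path, err)
	}
	return f, nil // the lock is released when f is closed or the process exits
}
```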

@kow3ns (Member) commented Oct 13, 2017:

> I'd probably be ok with a v1 that did at-least-1 that could be turned to almost-at-most-1 with caveats and warnings, where in the future we'd implement correct at-most-1. I can't reconcile shipping almost-at-most-1 only given #124 and history with statefulset.

  1. What almost at most one means right now is that DaemonSet respects the termination grace period of the Pods that it created. Currently the DaemonSet controller will not launch a new Pod until any other Pod, that it has created on the same node, is completely deleted. A failure to correctly implement this was the root cause of #50477.

  2. As @smarterclayton points out above, this behavior is important for DaemonSets that use scarce resources as a new Pod may interfere with the termination lifecycle of the current Pod, and this can occur during unplanned termination (e.g. eviction due to taints). I don't think that it ever makes sense to modify this behavior for DaemonSet Pods during unplanned disruptions.

  3. For this proposal, and for #51974 (#48307 is about the interaction between Deployments and PDBs and I don't see why it's relevant here), the best way to deal with port conflicts would be to implement something like Node Local Services first. If we don't do this, I don't see how any modification to the existing strategy, or any new strategy that attempts to launch more than one Pod per Node, will provide value in the general case (i.e. It might work for self hosting, but it will not solve #51974).

  4. It sounds like we have broad agreement that advisory filesystem locking, or some other locking method, must be used to achieve strong guarantees about at most one semantics. If we want to give users the option of having DaemonSet controller generate a Node unique name to preclude the controller from creating two Pods on the same Node, we can do so using an addition to the spec as discussed above, but if at most one is critical, implementing that assurance via local, application level coordination will always provide a more robust solution.

  5. The DaemonSet feature was opened in Sept 2014, though it has evolved, its general design constraints haven't changed significantly since then. Given the number of addons that are currently using DaemonSet, and that we are even considering what needs to be done for self hosting, I think we have a strong signal, from our own usage, that the current implementation is mature enough for promotion. We can do (3), (4), or (3) and (4) after GA promotion without breaking backward compatibility.

@smarterclayton (Contributor) commented Oct 14, 2017:

> What almost at most one means right now is that DaemonSet respects the termination grace period of the Pods that it created

This is not at-most-one (to be clear for other participants in this thread) because any interruption or race in the controller can and will spawn two containers. This is "best effort at most one".

> this behavior is important for DaemonSets that use scarce resources as a new Pod may interfere with the termination lifecycle of the current Pod, and this can occur during unplanned termination (e.g. eviction due to taints)

I don't follow. Nodes don't release resources until a pod is stopped, so a new pod launched by the daemonset to replace an evicted pod will be held up by resource release in the kubelet (or should, since this is a condition only the node can validate).

#48307 specifically comments from a user:

> Another thing is a planned maintenance on a node: it should be fairly simple to make sure extra pods are started elsewhere before we shut down this node for maintenance

There is a reasonable expectation among a wide range of kubernetes users I have interacted with that kubernetes should work to avoid application downtime by being aggressive about scale up. Users have the same expectation with daemon sets.

> Port conflicts ... If we don't do this, I don't see how any modification to the existing strategy, or any new strategy that attempts to launch more than one Pod per Node, will provide value in the general case

Port conflicts are irrelevant since they can be avoided by the user omitting the port definition on the pod. A user today can remove the port conflict and do port handoff themselves with SO_REUSEPORT, but the current behavior of the daemonset prevents that. I don't know of any other conflict that impacts normal users, and omitting the pod's port definition is not burdensome to users if they can achieve availability. So the blocker is still the daemonset controller imposing at-most-one-ish.

> At most one through locking ... more robust

The point though is that giving the appearance of a robust solution deters users from being actually robust which leads to application failure. We have documented and agreed on not violating user expectation previously.

> Given the number of addons that are currently using DaemonSet, and that we are even considering what needs to be done for self hosting...

Since the self-hosting folks are saying there's a gap, this seems a weak argument. V1 is EXACTLY when we should be confident it works. A perceived pressure to declare workload APIs v1 means very little to me if the semantics will lead to failure. I don't care whether workload v1 hits in 1.9 or 1.10, I care very much whether my production users are justified in their trust that we do the right thing for them.

@kow3ns (Member) commented Oct 14, 2017:

> This is not at-most-one (to be clear for other participants in this thread) because any interruption or race in the controller can and will spawn two containers. This is "best effort at most one".

Correct; to be clear, as stated above, it's just about respecting termination grace.

> I don't follow. Nodes don't release resources until a pod is stopped, so a new pod launched by the daemonset to replace an evicted pod will be held up by resource release in the kubelet (or should, since this is a condition only the node can validate).

If we create another Pod on the same node while the current Pod is still modifying some host resource, say the file system, as part of its termination, and the new Pod attempts to utilize that resource, it is surprising to users, as in #50477.

> Port conflicts are irrelevant since they can be avoided by the user omitting the port definition on the pod. A user today can remove the port conflict and do port handoff themselves with SO_REUSEPORT,

The issue is that when users run a DaemonSet that provides a shared resource on the node that is utilized by many Pods (e.g. log shipping, distributed file system, metrics collection), the currently recommended way of communicating with the Pod is via a well-known host port. Many applications (e.g. HDFS) don't allow the user to configure SO_REUSEPORT. This means that updating the Pods in the DaemonSet is inherently disruptive to all of its client Pods. Local services would provide a layer of abstraction that allows a surge update strategy to be applied in such a way that it is not disruptive to the node-local client Pods.

> The point though is that giving the appearance of a robust solution deters users from being actually robust which leads to application failure. We have documented and agreed on not violating user expectation previously.

Agree that we need to clearly document the behavior and limitations, and if that documentation is presently lacking, no matter what else we do, we need to correct it for users of the current implementation, but I don't see how promoting what is currently implemented, documented, and in use on production clusters can violate the users expectations.

> V1 is EXACTLY when we should be confident it works.

I strongly agree with this. Which is why I think we should not modify the existing semantics that have been in place since the original design, or ship something drastically different than the DaemonSet that we use for our addons and that users have come to rely on in their production systems.

> Since the self-hosting folks are saying there's a gap, this seems a weak argument.

After talking to the self hosting folks, they indicated they can achieve their objective by creating a new DaemonSet prior to deleting the previous one. This is the way that many users achieve blue-green for DaemonSets and StatefulSets today. Given that those users have a path forward, and that we can add this feature later to further improve their experience, I think we should prioritize providing our users, and our own cluster addons, stability over new features.

@smarterclayton (Contributor):

> If we create another Pod on the same node while the current Pod is modifying some host resource, say the file system, as part of its termination, while the new Pod is attempting to utilize that resource it is surprising to users as in #50477.

I agree that user surprise should be avoided, and I might even agree that the correct default for DS is at-most-one. On the other hand, we often slap people in Kubernetes hard the first time they use a particular controller specifically so that they know what they can rely on (the hard stop on graceful termination, the crash loop, the liveness check, replica scale out) up front without finding out later. And still, the number of users that assumed that replica set with scale = 1 means "at most one" is terrifying, but all of them were at least guided to the correct knowledge when their databases didn't scale up because the old pod was tearing down, or they saw two pods taking traffic.

If we had a really clear path to at-most-one correctly in the near term that wouldn't break backward compatibility and we were positive wouldn't require API changes to achieve, I'm less worried. I'd like to spend our time trying to figure out how that can be provided.

> Local services would provide a layer of abstraction that allows a surge update strategy to be applied in such a way that it is not disruptive to the node local client Pods.

Yeah, I'm just mostly thinking of things that use host port networking, or that are themselves the ones providing services (i.e. kube-proxy, dns, sdn) and can't rely on that.

> I don't see how promoting what is currently implemented, documented, and in use on production clusters can violate the users expectations.

If the pod watch is delayed and a resync is fired, it looks like the daemonset controller can create two pods on a node (from a cursory look and previous controller experience). Is that incorrect? Delayed pod watches have occurred in the last several etcd versions due to various bugs, and we have stomped several bugs in caches where they can also occur (beyond just normal distributed systems). If I have a pod that I have written that assumes that only one pod runs on a node at a time, nothing in Kube today pushes me to use filesystem locking because I'm trusting kube.

> indicated they can achieve their objective by creating a new DaemonSet prior to deleting the previous one

I'm fine with that. I'm still concerned about the previous point.

@smarterclayton (Contributor) commented Oct 17, 2017:

It looks like all that protects the daemonset from creating multiple pods on a node is the expectation system, which is single process based.

  1. new node is added at RV=9
  2. controller creates pod at RV=10
  3. controller crashes
  4. new controller pod cache syncs to pods at RV=9 (this happens all the time for multiple reasons in the api server)
  5. controller assumes pod caches are up to date, has no expectations, creates a second pod at RV=11

@mattfarina (Contributor):

@smarterclayton we discussed this in SIG Apps. The advice at this point was to take it to SIG Architecture in office hours this week. Would be nice to get some unblocking decisions there.

@smarterclayton (Contributor):

Spawned the last part (daemonset not being resistant to controller restart and slow caches) as kubernetes/kubernetes#54096 and summarized the discussion here.

Otherwise, this is on hold for 1.9 given sig-cluster-lifecycle having an alternative (correct?).

@diegs (Author) commented Oct 17, 2017:

@smarterclayton yes, we discussed in sig-cluster-lifecycle today and the plan is to move forward with an alternative.

Personally, I'm in agreement that coming to a conclusion on the supported semantics of daemonsets should precede this discussion; once that is done then it would be nice to see if there is room for a maxSurge parameter in the rolling update strategy. For example, if maxUnavailable > 0 && maxSurge == 0 then you get at-most-one semantics, if maxUnavailable == 0 && maxSurge > 0 you get at-least-one semantics.

Once that is settled other issues brought up here (scarce resources such as hostPorts, etc) can be addressed.
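
A tiny helper, not from the proposal, spelling out the mapping described above between the two parameters and the per-node guarantee:

```go
// Per-node guarantee implied by a (maxUnavailable, maxSurge) pair, as
// described in the comment above.
package example

func nodeSemantics(maxUnavailable, maxSurge int) string {
	switch {
	case maxUnavailable > 0 && maxSurge == 0:
		return "at-most-one: the old pod is removed before its replacement is created"
	case maxUnavailable == 0 && maxSurge > 0:
		return "at-least-one: the replacement is created before the old pod is removed"
	default:
		return "mixed: both overlap and gaps are possible during the update"
	}
}
```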


### Considerations / questions

1. How are `hostPort`s handled?

Comment:

Since such a use case brought me here, I'm wondering if my use case and similar ones could be solved by using a special form of NodePort that doesn't redirect to just any pod but only to pods on the same node. This would avoid having to use hostPort in many cases where all you want is one pod per node that is reachable on a well-known port.

I know this is beyond the scope of this proposal, but knowing what a solution to this hostPort scenario would likely look like in the future might influence the design here.

@zoidyzoidzoid, Nov 12, 2017:

In case you didn't know about these:

In 1.5 and newer you can use the service.beta.kubernetes.io/external-traffic=OnlyLocal annotation, and it seems like in newer versions you can do kubectl patch svc nodeport -p '{"spec":{"externalTrafficPolicy":"Local"}}' to only allow local traffic.

@k8s-github-robot:

This PR hasn't been active in 30 days. It will be closed in 59 days (Feb 10, 2018).

cc @diegs @janetkuo @liggitt @smarterclayton

You can add 'keep-open' label to prevent this from happening, or add a comment to keep it open another 90 days

@diegs (Author) commented Jan 8, 2018:

Closing this because I'm no longer working on it and I'm not aware of what the expected future will be. We can re-open or re-visit at a later time if needed.

@diegs closed this on Jan 8, 2018.
@krmayankk:

Why is this being abandoned? It seems like adding maxSurge would be ideal and at parity with Deployments.

@curlup commented Jul 17, 2018:

I would like to know what should be done to put this suggestion back on the table. Can we reopen it or something?

As a concrete user example, consider a monitoring agent on a node, checking the health of all the hardware and/or services. A DaemonSet seems like the right choice for that. But having two agents on a node for the short period of an update, both checking and reporting, is much better than having no agent reporting for that period.

For HA of such an agent, it might even be desirable to have two agents on a node constantly, to mitigate agent unavailability or crashes.

@captn3m0:

We have another use case: running an ingress controller as a DaemonSet on a separate set of nodes. Version updates should not result in a scenario where a node does not have a single pod running, but currently they do.
