Can't run Task including multiple steps with different resources requests/limits values #1933

Closed
mgreau opened this issue Jan 24, 2020 · 4 comments · Fixed by #1937
@mgreau
Contributor

mgreau commented Jan 24, 2020

Expected Behavior

A Task can have multiple steps with different resource requests/limits values, including the case where an earlier step (n-1) has higher values than the last step.

Actual Behavior

This works with Tekton Pipelines 0.8.0 and 0.9.2, but since 0.10.0 it is not possible to run a Task in which step 2 defines higher values than the last step.
An error message like the one below is thrown (in this case the spec.containers[2].resources.limit value is 512Mi but the spec.containers[3].resources.limit value is 128Mi):

Message
Invalid TaskSpec: Pod "resource-request-bug-mi-pod-db877" is invalid: spec.containers[2].resources.requests: Invalid value: "256Mi": must be less than or equal to memory limit

Steps to Reproduce the Problem

The following TaskRun is used to reproduce the problem:

apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: resource-request-issue
spec:
  taskSpec:
    steps:
    - name: minimal-resources-values
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -euxo pipefail
        echo "Hello from Bash using memory request 64Mi and limit to 128Mi!"
      resources:
        requests:
          memory: "64Mi"
          cpu: "100m"
        limits:
          memory: "128Mi"
          cpu: "200m"
    - name: maximal-resources-values
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -euxo pipefail
        echo "Hello from Bash using memory request 256Mi and limit to 512Mi!"
      resources:
        requests:
          memory: "256Mi"
          cpu: "100m"
        limits:
          memory: "512Mi"
          cpu: "200m"
    - name: re-minimal-resources-values
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -euxo pipefail
        echo "Hello from Bash using memory request 64Mi and limit to 128Mi!"
      resources:
        requests:
          memory: "64Mi"
          cpu: "100m"
        limits:
          memory: "128Mi"
          cpu: "200m"
  1. Install Tekton Pipelines 0.10.0
kubectl apply -f https://github.com/tektoncd/pipeline/releases/download/v0.10.0/release.yaml
  1. Apply the TaskRun
kubectl apply -f https://gist.githubusercontent.com/mgreau/2e34b5e535134ee97e3a0aa3b1e3b248/raw/2a43ad6d023aa838a8e869b73f049cec0ef413e1/tekton-resource-request-issue.yaml
taskrun.tekton.dev/resource-request-issue created
  1. Check the output
tkn taskrun describe resource-request-issue
Name:        resource-request-issue
Namespace:   default

Status
STARTED          DURATION    STATUS
35 seconds ago   ---         Failed(CouldntGetTask)

Message
Invalid TaskSpec: Pod "resource-request-issue-pod-lmh64" is invalid: spec.containers[2].resources.requests: Invalid value: "256Mi": must be less than or equal to memory limit

Input Resources
No resources

Output Resources
No resources

Params
No params

Steps
No steps
  1. Delete Tekton Pipelines 0.10.0 and the TaskRun
$ kubectl delete namespace tekton-pipelines
namespace "tekton-pipelines" deleted

$ tkn taskrun delete resource-request-issue
Are you sure you want to delete taskrun "resource-request-issue" (y/n): y
TaskRun deleted: resource-request-issue
  1. Install Tekton Pipelines 0.9.2
kubectl apply -f https://github.com/tektoncd/pipeline/releases/download/v0.9.2/release.yaml
  1. Apply the TaskRun
kubectl apply -f https://gist.githubusercontent.com/mgreau/2e34b5e535134ee97e3a0aa3b1e3b248/raw/2a43ad6d023aa838a8e869b73f049cec0ef413e1/tekton-resource-request-issue.yaml
taskrun.tekton.dev/resource-request-issue created
  1. Check the output and logs
tkn taskrun describe resource-request-issue
Name:        resource-request-issue
Namespace:   default

Status
STARTED          DURATION     STATUS
12 seconds ago   12 seconds   Succeeded

Input Resources
No resources

Output Resources
No resources

Params
No params

Steps
NAME                          STATUS
minimal-resources-values      Completed
maximal-resources-values      Completed
re-minimal-resources-values   Completed

tkn taskrun logs resource-request-issue
[minimal-resources-values] + echo 'Hello from Bash using memory request 64Mi and limit to 128Mi!'
[minimal-resources-values] Hello from Bash using memory request 64Mi and limit to 128Mi!

[maximal-resources-values] Hello from Bash using memory request 256Mi and limit to 512Mi!
[maximal-resources-values] + echo 'Hello from Bash using memory request 256Mi and limit to 512Mi!'

[re-minimal-resources-values] Hello from Bash using memory request 64Mi and limit to 128Mi!
[re-minimal-resources-values] + echo 'Hello from Bash using memory request 64Mi and limit to 128Mi!'

Additional Info

It might be related to the change in #1655, but I'm not sure.

@danielhelfand
Member

danielhelfand commented Jan 24, 2020

This issue originates in resource_request.go, where container resource requests are handled for a TaskRun.

Basically, the last container (step) of a TaskRun is assigned the maximum ResourceCPU, ResourceMemory, and ResourceEphemeralStorage requests found across all the containers passed in. So, in your scenario, the final container's request is set to the following:

ResourceMemory: 256Mi
ResourceCPU: 100m
ResourceEphemeralStorage: 0 (all values start at 0 and only change if something higher is found)

Since step 3 receives a request that is higher than its limit (memory: "128Mi"), the TaskRun fails.

Because the maximum request is placed in the last container without checking that container's limit, the request can exceed the limit. It seems the implementation doesn't honor the limits of the last step.
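
To make the failure mode concrete, here is a minimal Go sketch of the behavior described in this comment. It is not the code from resource_request.go: the `Step` type, the helper names, and the use of plain integers (memory in Mi) are all hypothetical simplifications, and zeroing the requests of the earlier steps is an assumption made for illustration.

```go
package main

import "fmt"

// Step is a hypothetical, simplified stand-in for a step's memory settings (in Mi).
type Step struct {
	Name      string
	RequestMi int64
	LimitMi   int64
}

// issueSteps returns the three steps from the TaskRun in this issue.
func issueSteps() []Step {
	return []Step{
		{Name: "minimal-resources-values", RequestMi: 64, LimitMi: 128},
		{Name: "maximal-resources-values", RequestMi: 256, LimitMi: 512},
		{Name: "re-minimal-resources-values", RequestMi: 64, LimitMi: 128},
	}
}

// moveMaxRequestToLast mimics the described behavior: the largest request
// across all steps is assigned to the last step. Zeroing the other requests
// is an assumption of this sketch.
func moveMaxRequestToLast(steps []Step) []Step {
	out := make([]Step, len(steps))
	copy(out, steps)
	if len(out) == 0 {
		return out
	}
	var maxReq int64
	for _, s := range out {
		if s.RequestMi > maxReq {
			maxReq = s.RequestMi
		}
	}
	for i := range out {
		out[i].RequestMi = 0
	}
	out[len(out)-1].RequestMi = maxReq
	return out
}

// firstInvalid returns the index of the first step whose request exceeds
// its limit, or -1 if every step is valid.
func firstInvalid(steps []Step) int {
	for i, s := range steps {
		if s.LimitMi > 0 && s.RequestMi > s.LimitMi {
			return i
		}
	}
	return -1
}

func main() {
	out := moveMaxRequestToLast(issueSteps())
	if i := firstInvalid(out); i >= 0 {
		fmt.Printf("step %d (%s): request %dMi exceeds limit %dMi\n",
			i, out[i].Name, out[i].RequestMi, out[i].LimitMi)
	}
}
```

In this sketch the third step (index 2) ends up with a 256Mi request against a 128Mi limit, which is consistent with the spec.containers[2] error reported above.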

@vdemeester
Member

/kind bug

@tekton-robot tekton-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 24, 2020
@danielhelfand
Member

Prior to #1655, it looks like the last step index wasn't used to hold the max request. Instead, it looks like the code found the index of the highest resource request for memory, CPU, and ephemeral storage. So, rather than putting everything into the last container as a single request, it left each max resource request at the index where it originally was and zeroed out anything that was not the max.

That would make sense, since each max request would then stay under its own step's limit. So my thought here is that the function should find the index and max value of memory, CPU, and ephemeral storage, leave those values where they are, and zero out everything else.
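
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the fix that was merged: the `Step` type, `keepMaxInPlace`, `withinLimits`, and the integer units (CPU in millicores, memory in Mi) are hypothetical, and ties are resolved in favor of the earliest step.

```go
package main

import "fmt"

// Step is a hypothetical, simplified stand-in for a step's resources.
// Values are plain integers: CPU in millicores, memory in Mi.
type Step struct {
	Name     string
	Requests map[string]int64
	Limits   map[string]int64
}

var resourceNames = []string{"cpu", "memory", "ephemeral-storage"}

// issueSteps returns the three steps from the TaskRun in this issue.
func issueSteps() []Step {
	small := func(name string) Step {
		return Step{
			Name:     name,
			Requests: map[string]int64{"cpu": 100, "memory": 64},
			Limits:   map[string]int64{"cpu": 200, "memory": 128},
		}
	}
	big := Step{
		Name:     "maximal-resources-values",
		Requests: map[string]int64{"cpu": 100, "memory": 256},
		Limits:   map[string]int64{"cpu": 200, "memory": 512},
	}
	return []Step{small("minimal-resources-values"), big, small("re-minimal-resources-values")}
}

// keepMaxInPlace finds, for each resource, the step with the highest request,
// leaves that request on that step, and zeroes the request everywhere else.
func keepMaxInPlace(steps []Step) []Step {
	out := make([]Step, len(steps))
	for i, s := range steps {
		out[i] = Step{Name: s.Name, Requests: map[string]int64{}, Limits: s.Limits}
	}
	for _, name := range resourceNames {
		maxIdx := -1
		var maxVal int64
		for i, s := range steps {
			if v := s.Requests[name]; v > maxVal {
				maxIdx, maxVal = i, v
			}
		}
		for i := range out {
			if i == maxIdx {
				out[i].Requests[name] = maxVal
			} else {
				out[i].Requests[name] = 0
			}
		}
	}
	return out
}

// withinLimits reports whether every request is at most its step's limit.
func withinLimits(steps []Step) bool {
	for _, s := range steps {
		for name, req := range s.Requests {
			if limit, ok := s.Limits[name]; ok && req > limit {
				return false
			}
		}
	}
	return true
}

func main() {
	out := keepMaxInPlace(issueSteps())
	for _, s := range out {
		fmt.Printf("%s: cpu=%dm memory=%dMi\n", s.Name, s.Requests["cpu"], s.Requests["memory"])
	}
	fmt.Println("all requests within limits:", withinLimits(out))
}
```

Because each maximum stays on the step that declared it, and that step's own request was already at or below its own limit, no step can end up with a request above its limit in this sketch.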

There is still the issue of #1045, but I can handle that separately, as the container requests should factor in the limit range's minimum for containers.

@danielhelfand
Member

/assign
