eks: unable to tag ASG of the Cluster #29280

Closed
andreprawira opened this issue Feb 27, 2024 · 6 comments
Labels
@aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service bug This issue is a bug. effort/medium Medium work item – several days of effort p2

Comments

@andreprawira

Describe the bug

I'm trying to tag the ASG of the cluster as well as the cluster itself; below is my code:

tags = {'CreatedBy': 'bob', 'backup': 'week'}
autospotting_tags = {'spot-enabled': 'true'}
final_tags = {**tags, **autospotting_tags}

self.cluster = eks.Cluster(
    self,
    "eks-cluster",
    version=eks.KubernetesVersion.V1_28,
    kubectl_layer=lambda_layer_kubectl_v28.KubectlV28Layer(
        self, "kubectl-layer"
    ),
    place_cluster_handler_in_vpc=True,
    cluster_name=f"{props.customer}-eks-cluster",
    default_capacity_instance=ec2.InstanceType(props.worker_node_instance_type),
    default_capacity=2,
    alb_controller=eks.AlbControllerOptions(
        version=eks.AlbControllerVersion.V2_6_2
    ),
    vpc=ec2.Vpc.from_lookup(
        self, "vpc-lookup", vpc_name=f"{props.customer}-{region}/vpc"
    ),
    vpc_subnets=[ec2.SubnetSelection(subnet_group_name="application")],
    tags=final_tags
)

self.cluster.default_nodegroup.tags = final_tags

When I deploy the stack, the cluster gets the correct final_tags, but the ASG of the cluster does not. Is there anything I'm missing? Thanks

Expected Behavior

The ASG of the cluster has the final_tags.

Current Behavior

The ASG does not have any tags.

Reproduction Steps

Please see the code above.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.117.0 (build 59d9b23)

Framework Version

No response

Node.js Version

v18.18.0

OS

Windows

Language

Python

Language Version

Python 3.11.5

Other information

No response

@andreprawira andreprawira added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Feb 27, 2024
@github-actions github-actions bot added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label Feb 27, 2024
@pahud
Contributor

pahud commented Feb 27, 2024

This is a limitation of CloudFormation and requires a custom resource to achieve. I am not sure if we should include that in aws-eks, but we should definitely have a sample for it.

related to #20133

@pahud pahud added p2 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Feb 27, 2024
@pahud pahud changed the title aws_eks: unable to tag ASG of the Cluster eks: unable to tag ASG of the Cluster Feb 27, 2024
@andreprawira
Author

andreprawira commented Feb 27, 2024

@pahud do you have the sample you mentioned?

Also, I tried this:

asg = self.cluster.add_auto_scaling_group_capacity(
    "spot-asg",
    instance_type=ec2.InstanceType(props.worker_node_instance_type),
)
Tags.of(asg).add("spot-enabled", "true")

and when I deploy the stack it creates 2 ASGs: one for the default node group (which already existed before this additional code) and a new one that got the correct tag. I am trying to tag the default node group, or override the cluster's default node group to use my customized ASG (with the tag), so that there is only 1 ASG. Is that possible?
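
A minimal sketch of what I mean (not tested; it reuses final_tags and props.worker_node_instance_type from my snippet above, and relies on default_capacity=0 skipping the default node group):

# Sketch only: skip the managed default node group (default_capacity=0) and
# add a self-managed ASG instead, so the stack creates just one ASG and
# Tags.of() can tag it directly.
cluster = eks.Cluster(
    self,
    "eks-cluster",
    version=eks.KubernetesVersion.V1_28,
    kubectl_layer=lambda_layer_kubectl_v28.KubectlV28Layer(self, "kubectl-layer"),
    default_capacity=0,  # no default node group, no hidden ASG
    # ... other props as in my original snippet ...
)

asg = cluster.add_auto_scaling_group_capacity(
    "spot-asg",
    instance_type=ec2.InstanceType(props.worker_node_instance_type),
    min_capacity=2,
)

# This ASG is a CloudFormation resource in the stack, so tags apply normally.
for key, value in final_tags.items():
    Tags.of(asg).add(key, value)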

@pahud
Contributor

pahud commented Feb 27, 2024

The reason is that the actual ASG created by the EKS managed node group is not managed by CloudFormation or CDK, so there is nothing CDK can do to propagate tags to the ASG behind the node group.

Consider this workaround (it needs some modification for your case):

// Imports (assuming aws-cdk-lib v2):
import * as path from 'path';
import { Construct } from 'constructs';
import { CfnOutput, CustomResource, Duration, Stack } from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';
import { CfnNodegroup } from 'aws-cdk-lib/aws-eks';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as cr from 'aws-cdk-lib/custom-resources';

export interface NodegroupASGModifierProps {
  readonly cluster: eks.ICluster;
  readonly nodegroup: eks.Nodegroup;
  readonly maxInstanceLifetime: Duration;
}

export class NodegroupASGModifier extends Construct {
  constructor(scope: Construct, id: string, props: NodegroupASGModifierProps) {
    super(scope, id);

    const onEventHandler = new lambda.Function(this, 'onEventHandler', {
      handler: 'index.on_event',
      runtime: lambda.Runtime.PYTHON_3_9,
      code: lambda.Code.fromAsset(path.join(__dirname, '../lambda')),
    });

    const nodegroupName = (props.nodegroup.node.defaultChild as CfnNodegroup).getAtt('NodegroupName').toString();
    onEventHandler.addToRolePolicy(new iam.PolicyStatement({
      actions: ['eks:DescribeNodegroup'],
      resources: [
        Stack.of(this).formatArn({
          resource: 'nodegroup',
          service: 'eks',
          resourceName: `${props.cluster.clusterName}/${nodegroupName}/*`,
        }),
      ],
    }));
    onEventHandler.addToRolePolicy(new iam.PolicyStatement({
      actions: ['autoscaling:UpdateAutoScalingGroup'],
      resources: ['*'],
    }));

    const provider = new cr.Provider(this, 'Provider', {
      onEventHandler,
    });


    const myResource = new CustomResource(this, 'CR', {
      serviceToken: provider.serviceToken,
      resourceType: 'Custom::EKSNodegroupModifier',
      properties: {
        clusterName: props.cluster.clusterName,
        nodegroupName,
        maxInstanceLifetime: props.maxInstanceLifetime.toSeconds(),
      },
    });

    const asgName = myResource.getAtt('asg_name').toString();
    new CfnOutput(this, 'ASGName', { value: asgName });

  };
}

And the Lambda function:

import boto3

def update_max_instance_lifetime(asg_name, lifetime):
    client = boto3.client('autoscaling')
    return client.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        MaxInstanceLifetime=lifetime
    )

def on_event(event, context):
    print(event)
    request_type = event['RequestType']
    if request_type == 'Create': return on_create(event)
    if request_type == 'Update': return on_update(event)
    if request_type == 'Delete': return on_delete(event)
    raise Exception("Invalid request type: %s" % request_type)

def on_create(event):
    client = boto3.client('eks')
    props = event["ResourceProperties"]
    print("create new resource with props %s" % props)
    clusterName = props.get('clusterName')
    nodegroupName = props.get('nodegroupName')
    lifetime = props.get('maxInstanceLifetime')
    # Look up the ASG that the managed node group created behind the scenes.
    response = client.describe_nodegroup(
        clusterName=clusterName,
        nodegroupName=nodegroupName,
    )
    asg_name = response['nodegroup']['resources']['autoScalingGroups'][0]['name']
    update_max_instance_lifetime(asg_name, int(lifetime))
    data = {'asg_name': asg_name}

    return {'Data': data}

def on_update(event):
    return on_create(event)

def on_delete(event):
    physical_id = event["PhysicalResourceId"]
    print("delete resource %s" % physical_id)
Let me know if it works for you.

@andreprawira
Author

@pahud we were finally able to tag the ASG. We did it by overriding the default node group with eks.Nodegroup, then creating a CustomResource and writing a Lambda function that finds the ASG created by eks.Nodegroup and tags it. I should note that the CR Lambda function has to wait until the ASG is created, otherwise it won't be able to find it.
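
For anyone who finds this later, a rough sketch of that kind of Lambda handler (not our exact code; the tags resource property and the retry numbers are placeholders):

import time
import boto3

eks_client = boto3.client('eks')
asg_client = boto3.client('autoscaling')

def on_event(event, context):
    props = event['ResourceProperties']
    if event['RequestType'] in ('Create', 'Update'):
        asg_name = wait_for_asg(props['clusterName'], props['nodegroupName'])
        tag_asg(asg_name, props['tags'])
        return {'Data': {'asg_name': asg_name}}
    # Nothing to clean up on Delete; the node group owns the ASG.
    return {}

def wait_for_asg(cluster_name, nodegroup_name, attempts=30, delay=10):
    # The backing ASG only shows up once EKS has finished creating the node
    # group, so poll describe_nodegroup until the resources section is populated.
    for _ in range(attempts):
        resources = eks_client.describe_nodegroup(
            clusterName=cluster_name,
            nodegroupName=nodegroup_name,
        )['nodegroup'].get('resources') or {}
        groups = resources.get('autoScalingGroups') or []
        if groups:
            return groups[0]['name']
        time.sleep(delay)
    raise RuntimeError('ASG for nodegroup %s not found in time' % nodegroup_name)

def tag_asg(asg_name, tags):
    asg_client.create_or_update_tags(Tags=[
        {
            'ResourceId': asg_name,
            'ResourceType': 'auto-scaling-group',
            'Key': key,
            'Value': value,
            'PropagateAtLaunch': True,
        }
        for key, value in tags.items()
    ])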


@nielstenboom

I ran into the same problem trying to set up GPU-enabled node groups that scale from 0.

I found this little-known project https://github.com/isotoma/eks-nodegroup-asg-tags-cdk (thanks @plumdog!), and it seemed to work out of the box for me after applying it. Leaving the setup below for those running into the same problem.

Example from the README of the aforementioned project:

import { NodegroupAsgTags } from 'eks-nodegroup-asg-tags-cdk';

// ...

const myCluster = ...
const myNodegroupProps = {...};
const myNodegroup = myCluster.addNodegroupCapacity(..., myNodegroupProps);

new NodegroupAsgTags(this, 'MyNodegroupTags', {
    cluster: myCluster,
    nodegroup: myNodegroup,
    nodegroupProps: myNodegroupProps,
    setClusterAutoscalerTagsForNodeLabels: true,
    setClusterAutoscalerTagsForNodeTaints: true,
    tags: {
        'k8s.io/cluster-autoscaler/node-template/autoscaling-options/scaledownunneededtime': '1m0s',
    },
});
