Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: worker security group handling when worker_create_security_group=false #1461

Merged
merged 2 commits into from
Sep 6, 2021

Conversation

sunghospark-calm
Copy link
Contributor

@sunghospark-calm sunghospark-calm commented Jun 25, 2021

PR o'clock

Description

Currently when you enable create_launch_template=true on the nodegroup, it switches from using primary cluster security group to worker security group which can cause various problems.

The first work around attempted was to use cluster_primary_security_group_id as worker_security_group_id, but that caused variable cycle error.

Error: Cycle: module.eks.local.cluster_primary_security_group_id (expand), module.eks.output.cluster_primary_security_group_id (expand), module.eks.var.worker_security_group_id (expand), module.eks.local.worker_security_group_id (expand), module.eks.aws_security_group_rule.cluster_https_worker_ingress, module.eks.aws_eks_cluster.this

The second workaround was to add the primary security group as worker_additional_security_group_ids, however that causes the ALB ingress controller to fail in AWS because the ingress controller cannot work when there are multiple SG attached to the network interface.

The solution was to delete the unused worker security group, however that causes the failure on launch template creation due to the empty security group id.

This PR cleans up the security groups for launch template if theres an empty string in the list.
There was also a cluster_private_access_sg_source rule which was trying to add the worker security group when the worker security group didn't exist, which is fixed in this PR.

Checklist

@sunghospark-calm sunghospark-calm changed the title compact the worker security group to remove empty group id improvement: compact the worker security group to remove empty group id Jun 25, 2021
@sunghospark-calm
Copy link
Contributor Author

Hello @barryib @max-rocket-internet , I'm hoping to get some reviews on this PR. Please let me know if I should be pinging different person or channel. Thank you for maintaining this project.

@sunghospark-calm sunghospark-calm changed the title improvement: compact the worker security group to remove empty group id bug: fix worker security group handling when worker_create_security_group=false Jun 25, 2021
@sunghospark-calm sunghospark-calm changed the title bug: fix worker security group handling when worker_create_security_group=false fix: worker security group handling when worker_create_security_group=false Jun 25, 2021
@yafanasiev
Copy link

Just hit this as well as #1447. Had to fork the base repo, and everything now works beautifully. Hope this gets merged soon.

@aidan-mundy
Copy link
Contributor

aidan-mundy commented Aug 16, 2021

I am having the same issue and this appears to be a direct fix for the problem, would appreciate a speedy merge on this small PR.

EDIT: Just want to confirm that my problems were solved by using the source fork for this PR as my source for the module.

@yafanasiev
Copy link

@antonbabenko @bryantbiggs please take a look at this PR if you can. We tested this in our fork of the module, and fix works just as expected.

@yafanasiev
Copy link

This is the only PR preventing us to go back from fork to official module. I can confirm that the fix is actually working and battle-tested. Please, review this 😢

@antonbabenko
Copy link
Member

Hi @yafanasiev ! I will review it during this week and merge it. Sorry that it takes so long time.

@antonbabenko
Copy link
Member

I couldn't make it to work on an existing cluster created with examples/launch_templates.

Error messages are like this:

╷
│ Error: Error updating Auto Scaling Group: ValidationError: You must use a valid fully-formed launch template. Value () for parameter groupId is invalid. The value cannot be empty
│       status code: 400, request id: 00236532-0ebb-412d-9f64-289d0ff947c2
│ 
│   with module.eks.aws_autoscaling_group.workers_launch_template[2],
│   on ../../workers_launch_template.tf line 3, in resource "aws_autoscaling_group" "workers_launch_template":
│    3: resource "aws_autoscaling_group" "workers_launch_template" {
│ 
╵

Could you please try to create a cluster using examples, then apply this PR (add worker_create_security_group = false and other required inputs), and run terraform plan/apply to confirm that it works as expected.

Please update the example, if necessary.

@sunghospark-calm
Copy link
Contributor Author

Thanks for the review @antonbabenko .
I could confirm that just adding worker_create_security_group = false alone did not work with example/launch_templates
The required blob was

  worker_create_security_group = false
  worker_create_cluster_primary_security_group_rules = true
  worker_additional_security_group_ids = [module.eks.cluster_primary_security_group_id]

I think it makes sense because there gotta be a security group in place before the cluster to work.

I also tested with example/managed_node_groups. For this one I could just add worker_create_security_group = false and the apply ran successfully.

@antonbabenko antonbabenko merged commit 752c183 into terraform-aws-modules:master Sep 6, 2021
@antonbabenko
Copy link
Member

Thanks, @sunghospark-calm !

v17.13.0 has been just released.

@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants