fix: Fix Launch Templates error with aws 2.61.0 #875

Merged: 1 commit into terraform-aws-modules:master on May 9, 2020

dpiddockcmp
Contributor

PR o'clock

Description

Updating launch templates with aws provider 2.61.0 generates an error when creating the ASG or attempting to launch new instances. This is caused by the aws provider adding launch template support for placement.partition_number. I think the root issue is a combination of two things: aws-sdk-go defaults this field to 0 because it's an integer, and the module sets the placement block even when no placement group is in use. Having partition_number set without a placement group name is a validation error. For example, when creating a new ASG:
Error: Error creating AutoScaling Group: ValidationError: You must use a valid fully-formed launch template. You cannot use PartitionNumber with a Placement Group that does not exist. Specify a valid Placement Group and try again.

Resolve the issue by not setting the placement block if no placement group name is passed in.

This will cause an update for all users using launch templates unless they are using placement groups.
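
A minimal sketch of the approach described above, assuming a hypothetical `launch_template_placement_group` variable and placeholder resource values. The module itself reads this setting from each worker group map, and its actual change may differ. The `placement` block is generated only when a group name is passed in:

```hcl
# Hypothetical input; in the module this is a key in each worker group map.
variable "launch_template_placement_group" {
  type    = string
  default = null
}

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = "ami-00000000000000000" # placeholder AMI ID
  instance_type = "m5.large"

  # Render the placement block only when a group name is supplied, so no
  # placement section is sent for templates without a placement group.
  dynamic "placement" {
    for_each = var.launch_template_placement_group != null ? [var.launch_template_placement_group] : []
    content {
      group_name = placement.value
    }
  }
}
```

With the variable left at `null`, `for_each` receives an empty list and no `placement` block is rendered.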

Fixes #874

@riveryc

riveryc commented May 9, 2020

NICE, this fixed my issue!

@dpiddockcmp
Contributor Author

I've done a little more testing and this fixes launch templates for people who do not use placement groups.

However, there's still a bug in the provider or SDK. Placement groups specified directly in the launch template only work if the placement group is of type partition, because partition_number is always set. But we don't expose the ability to set the partition number. I don't want to add that, as it would require bumping the minimum provider version to the broken 2.61.0.

Non-partition placement groups can still be specified on the ASG via the placement_group variable.

module "eks" {
  # ...
  worker_groups_launch_template = [
    {
      name                 = "worker-group-1"
      instance_type        = "m5.large"
      asg_desired_capacity = 1
      public_ip            = true
      placement_group      = aws_placement_group.this.id
    },
    {
      name                            = "worker-group-2"
      instance_type                   = "m5.large"
      asg_desired_capacity            = 1
      public_ip                       = false
      launch_template_placement_group = aws_placement_group.this.id
    },
    {
      name                 = "worker-group-3"
      instance_type        = "t2.small"
      asg_desired_capacity = 1
      public_ip            = false
    },
  ]
}
resource "aws_placement_group" "this" {
  name     = "testgroup"
  strategy = "cluster"
}

worker-group-1 and worker-group-3 are created without issue. worker-group-2 fails with:

Error: Error creating AutoScaling Group: ValidationError: You must use a valid
fully-formed launch template. You cannot use PartitionNumber with a Placement
Group that does not use the 'partition' strategy. Placement Group 'testgroup'
uses 'cluster' strategy. Specify a different Placement Group and try again.

I guess users wanting to set a non-partition placement group on the launch template will have to pin to aws 2.60.0 until it's fixed?

@barryib
Member

barryib commented May 9, 2020

> I guess users wanting to set a non-partition placement group on the launch template will have to pin to aws 2.60.0 until it's fixed?

Can we just pin this in versions.tf? Something like aws = ">= 2.52.0,<= 2.60.0" or aws = "~> 2.60.0" until it's fixed upstream?
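
For illustration, the constraint suggested here might look like this in versions.tf. This is a sketch using the Terraform 0.12-style string constraint, not the module's actual file:

```hcl
terraform {
  required_providers {
    # Accept 2.52.0 through 2.60.0, which excludes the affected 2.61.0 release.
    aws = ">= 2.52.0, <= 2.60.0"
  }
}
```

The alternative `~> 2.60.0` is narrower: it accepts only 2.60.x patch releases.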

@dpiddockcmp
Contributor Author

dpiddockcmp commented May 9, 2020

We could, but I don't feel that it is the correct solution in this situation. The issue is a provider/SDK bug. By pinning to < 2.61 we would stop users from picking up future fixed versions, as well as the other fixes and features in the provider.

The remaining issue only impacts users using placement groups. They can either pin the provider to 2.60.0 or pass in placement_group instead of launch_template_placement_group.

We can have a future release to support the new placement groups with partitions once the provider is fixed. Looks like there is already a PR for the provider issue hashicorp/terraform-provider-aws#13236
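
A rough sketch of what launch template support for partition placement groups could look like once the provider is fixed. The resource names and values are placeholders, and `partition_number` is the launch template placement argument that this thread says was added in aws provider 2.61.0. The module does not expose it at the time of this PR:

```hcl
resource "aws_placement_group" "partitioned" {
  name     = "example-partitioned" # placeholder name
  strategy = "partition"
}

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = "ami-00000000000000000" # placeholder AMI ID
  instance_type = "m5.large"

  # partition_number is only meaningful with a "partition" strategy group.
  placement {
    group_name       = aws_placement_group.partitioned.name
    partition_number = 1
  }
}
```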

@barryib
Member

barryib commented May 9, 2020

Agree in general. But it wouldn't be a permanent pin. I was thinking of pinning the version in the module temporarily, like we already did with the kubernetes provider.

Furthermore, we should consider bumping the aws provider version. Maybe after hashicorp/terraform-provider-aws#13239 is merged?

@barryib
Member

barryib commented May 9, 2020

@dpiddockcmp do you still want to merge this PR?

@kmgreen2

kmgreen2 commented May 9, 2020

Thanks! Worked for me as well!

@dpiddockcmp
Contributor Author

The Kubernetes provider situation was slightly different, as it broke default functionality that was probably used by the majority of users. At the time, it was also the "expected behaviour" according to the provider maintainer.

This PR fixes the issue for probably the majority of launch template users. There's a PR on the provider that fixes the real issue. Hopefully it gets released in the normal cycle on Thursday.

I think we should merge this and get 12.0.0 released.

barryib changed the title from "fix: Launch Templates error with aws 2.61.0" to "fix: Fix Launch Templates error with aws 2.61.0" on May 9, 2020
barryib merged commit bb822a1 into terraform-aws-modules:master on May 9, 2020
@murty0

murty0 commented May 13, 2020

I am experiencing the same issue with aws provider version "~> 2.59" and eks module version "~> 10.0".

We already have a cluster running in production that has no placement group and is working fine. I am in the midst of creating a testing cluster with the above-mentioned provider versions, again with no placement groups defined, but I am getting the following errors:

Error: Error creating AutoScaling Group: ValidationError: You must use a valid fully-formed launch template. You cannot use PartitionNumber with a Placement Group that does not exist. Specify a valid Placement Group and try again.
  status code: 400, request id: 8760f1ce-69ea-4457-83a3-37eee9e11804

  on .terraform/modules/eks_cluster/terraform-aws-eks-10.0.0/workers_launch_template.tf line 3, in resource "aws_autoscaling_group" "workers_launch_template":
   3: resource "aws_autoscaling_group" "workers_launch_template" {

I have already posted this here: hashicorp/terraform-provider-aws#13236

dpiddockcmp deleted the fix/874 branch on May 13, 2020, 17:17
github-actions bot locked as resolved and limited conversation to collaborators on Nov 18, 2022
Successfully merging this pull request may close these issues.

ASG Failed scale up after update aws_launch_template with aws provider 2.61.0 (currently latest version)