Terraform apply not idempotent for security groups #13966

Closed
SanchitBansal opened this issue Apr 26, 2017 · 19 comments

@SanchitBansal

Terraform Version

0.9.3

Affected Resource(s)

  • aws_security_group

Terraform Configuration Files

resource "aws_security_group" "cassandra" {
  name        = "prod"
  description = "Security group cassandra"
  vpc_id      = "${aws_vpc.main.id}"

  // allows traffic from the SG itself for tcp
    ingress {
        from_port = 0
        to_port = 65535
        protocol = "tcp"
        self = true
    }

    // allows traffic from the SG itself for udp
    ingress {
        from_port = 0
        to_port = 65535
        protocol = "udp"
        self = true
    }

    // allow traffic for TCP 9042 (Cassandra clients)
    ingress {
        from_port = 9042
        to_port = 9042
        protocol = "tcp"
        cidr_blocks = ["${data.aws_subnet.public.*.cidr_block}"]
    }

    // allow traffic for TCP 9160 (Cassandra Thrift clients)
    ingress {
        from_port = 9160
        to_port = 9160
        protocol = "tcp"
        cidr_blocks = ["${data.aws_subnet.public.*.cidr_block}"]
    }

    // allow traffic for TCP 7199 (JMX)
    ingress {
        from_port = 7199
        to_port = 7199
        protocol = "tcp"
        cidr_blocks = ["${data.aws_subnet.public.*.cidr_block}"]
    }

  depends_on = ["data.aws_subnet.public"]
  tags {
    Name        = "prod-sg-cassandra"
    Environment = "prod"
    Type        = "cassandra"
  }
}

Debug Output

https://gist.github.com/SanchitBansal/2683c645360b8ee31978cfa75e4d7abe

Panic Output

https://gist.github.com/SanchitBansal/3c034d8380ed4e0f6f7d089cf3164979

Expected Behavior

The first "terraform apply" launched the complete infrastructure, and I expected the second "terraform apply" to just refresh the state. In other words, Terraform should execute smoothly when "terraform apply" is run multiple times.

Actual Behavior

The first run executed successfully, but the second run failed with an error saying the security group diffs did not match.

Steps to Reproduce

  1. terraform apply
  2. terraform apply
@jbardin
Member

jbardin commented Apr 30, 2017

Hi @SanchitBansal,

Sorry you're having a problem here, but I'm not able to reproduce this issue with the config you've provided with Terraform 0.9.3 or the latest build.

It may be related to the definition of data.aws_subnet.public, can you provide a more complete configuration to reproduce this?

Also, though I don't think it affects the issue, you don't need to add depends_on = ["data.aws_subnet.public"] when you are already referencing data.aws_subnet.public in the resource. You should rarely need depends_on at all, and putting it in a data source can affect how that data source works.
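
As a rough illustration of that point, here is the reporter's security group trimmed to a single rule with the depends_on line dropped. This is only a sketch: it assumes aws_vpc.main and data.aws_subnet.public are defined elsewhere, as in the original configuration, and relies on the interpolation inside cidr_blocks to create the dependency.

```hcl
# Sketch only: the original group reduced to one ingress rule.
# No depends_on here; the reference to data.aws_subnet.public in
# cidr_blocks is what tells Terraform about the dependency.
resource "aws_security_group" "cassandra" {
  name        = "prod"
  description = "Security group cassandra"
  vpc_id      = "${aws_vpc.main.id}"

  // allow traffic for TCP 9042 (Cassandra clients)
  ingress {
    from_port   = 9042
    to_port     = 9042
    protocol    = "tcp"
    cidr_blocks = ["${data.aws_subnet.public.*.cidr_block}"]
  }

  tags {
    Name        = "prod-sg-cassandra"
    Environment = "prod"
    Type        = "cassandra"
  }
}
```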

@shamimgeek

@jbardin: I am also facing this issue with
Terraform version: Terraform v0.9.4
resource: aws_security_group_rule

@SanchitBansal
Author

@jbardin Sharing the requested configuration below:

data "aws_subnet" "elb" {
  vpc_id = "${var.vpc_id}"
  filter {
    name = "tag:role"
    values = ["elb"]
  }
  filter {
    name = "tag:az"
    values = ["ap-south-1a"]
  }
  count = "${length(var.availability_zones)}"
  depends_on = ["aws_subnet.public"]
}
resource "aws_subnet" "public" {
  vpc_id            = "${var.vpc_id}"
  cidr_block        = "192.168.0.1/28"
  availability_zone = "ap-south-1a"

  tags {
    Name = "dev-elb-public-1a-1"
    role = "elb"
    az   = "ap-south-1a"
  }
}

@jbardin
Member

jbardin commented May 1, 2017

Thanks @SanchitBansal, I was able to reproduce the error with the help of the added config.

What's causing the error is actually the depends_on value in the data.aws_subnet.public data source. Adding depends_on to a data source prevents it from being loaded early on: Terraform has no way to know why you've added depends_on, so it has to wait until apply. If the data source really does depend on the resource (though I'm not sure why you have data sources for resources that already exist in your config), you could reference an attribute via interpolation, like:

data "aws_subnet" "elb" {
  vpc_id = "${var.vpc_id}"
  filter {
    name = "tag:role"
    values = ["elb"]
  }
  filter {
    name = "tag:az"
    values = ["${aws_subnet.public.availability_zone}"]
  }
}

Your cassandra config above also does not need the depends_on line, since you're already referencing the same data source in the ingress rules.

This is still a bug in terraform, as terraform apply should complete without error, but the fact that the plan can't resolve the data source because of depends_on is expected.

@jbardin jbardin self-assigned this May 1, 2017
@shamimgeek

I am using Terraform v0.9.4 with the configuration below, and I see an idempotency issue with the aws_security_group_rule resources:

provider "aws" {
  access_key = ""
  secret_key = ""
  insecure  = true
  skip_credentials_validation = true
  skip_region_validation = true
  region = "eucalyptus"
  endpoints {
    ec2 = "xxxxxxxxxxxxxxxxxxxxxxx"
    iam = "xxxxxxxxxxxxxxxxxxxxxxx"
    elb = "xxxxxxxxxxxxxxxxxxxx"
  }
}

resource "aws_security_group" "mesos-masters-sakhtar2" {
  name        = "mesos-masters-sakhtar2"
  description = "Security Group for mesos masters of PaaS sakhtar2"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "mesos-slaves-sakhtar2" {
  name        = "mesos-slaves-sakhtar2"
  description = "Security Group for mesos slaves of PaaS sakhtar2"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group_rule" "allow53tcp" {
    type = "ingress"
    from_port = 53
    to_port = 53
    protocol = "tcp"
    security_group_id = "${aws_security_group.mesos-masters-sakhtar2.id}"
    source_security_group_id = "${aws_security_group.mesos-slaves-sakhtar2.name}"

}

resource "aws_security_group_rule" "allow53udp" {
    type = "ingress"
    from_port = 53
    to_port = 53
    protocol = "udp"
    security_group_id = "${aws_security_group.mesos-masters-sakhtar2.id}"
    source_security_group_id = "${aws_security_group.mesos-slaves-sakhtar2.name}"

}

command output:
https://gist.github.com/shamimgeek/2b11da238795f195f7568ab0a8780775

@jbardin
Member

jbardin commented May 1, 2017

Hi @shamimgeek,

This is a different issue from the original attribute mismatch error.
Can you file a new issue with the example provided? I thought there was an open issue already, but I don't see it offhand.

@shamimgeek

shamimgeek commented May 1, 2017

@jbardin: Sure, I have opened a new issue:

#14124

@SanchitBansal
Author

@jbardin I removed the depends_on block and it is working fine for now. In a few cases Terraform was not picking up the references by itself, so I started defining dependencies in all configurations :)
Thanks a lot for your help. I will let you know if the same error comes up again even without depends_on.

@jbardin
Member

jbardin commented May 2, 2017

@SanchitBansal,

Glad it works! I'm actually going to keep this open because it led me to a reproduction case with a "diffs didn't match" error.

@jbardin jbardin reopened this May 2, 2017
@akio-outori

@jbardin I'm having a related issue, and it currently seems to be by design. Every time I run terraform apply, the VPC security groups force a new resource. Here's the config:

resource "aws_security_group" "master" {

  ingress {
    from_port   = "80"
    to_port     = "80"
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = "443"
    to_port     = "443"
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = "0"
    to_port     = "65535"
    protocol    = "tcp"
    cidr_blocks = ["${data.terraform_remote_state.networking.vpc_cidr_block}"]
  }

  ingress {
    from_port   = "22"
    to_port     = "22"
    protocol    = "tcp"
    cidr_blocks = ["${data.terraform_remote_state.axis.public_ip}/32"]
  }

  egress {
    from_port   = "0"
    to_port     = "0"
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  vpc_id = "${data.terraform_remote_state.networking.vpc_id}"

  tags {
    Name        = "${data.terraform_remote_state.networking.environment}-master"
    Description = "ports required for the DF Master instance"
    Environment = "${data.terraform_remote_state.networking.environment}" 
  }
}

resource "aws_security_group" "slave" {

  ingress {
    from_port   = "0"
    to_port     = "65535"
    protocol    = "tcp"
    cidr_blocks = ["${data.terraform_remote_state.networking.vpc_cidr_block}"]
  }

  ingress {
    from_port   = "22"
    to_port     = "22"
    protocol    = "tcp"
    cidr_blocks = ["${data.terraform_remote_state.axis.public_ip}/32"]
  }

  egress {
    from_port   = "0"
    to_port     = "0"
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  vpc_id = "${data.terraform_remote_state.networking.vpc_id}"

  tags {
    Name        = "${data.terraform_remote_state.networking.environment}-slave"
    Description = "ports required for the DF Master instance" 
    Environment = "${data.terraform_remote_state.networking.environment}"
  }
} 

Based on the docs it looks like name, description, and vpc_id all force a new resource for security groups, which leads to my instances being terminated. I removed the name and description directives, but obviously get an error when removing the vpc_id directive.

Can you offer any insight into what the correct way to configure idempotent security groups would be?

@kurron

kurron commented Oct 20, 2017

I believe I am seeing this in Terraform v0.10.7. I have 2 aws_security_group_rule resources hanging off an aws_security_group, each providing a description. I can create from scratch just fine, but if I run plan a second time, Terraform wants to update/swap the description fields for some reason.

resource "aws_security_group" "ec2_access" {
    name_prefix = "ec2-"
    description = "Controls access to the EC2 instances"
    vpc_id      = "${var.vpc_id}"
    tags {
        Name        = "EC2 Access"
        Project     = "${var.project}"
        Purpose     = "Controls access to the EC2 instances"
        Creator     = "${var.creator}"
        Environment = "${var.environment}"
        Freetext    = "${var.freetext}"
    }
    lifecycle {
        create_before_destroy = true
    }
}

resource "aws_security_group_rule" "ec2_ingress_bastion" {
    type                     = "ingress"
    from_port                = 0
    protocol                 = "all"
    security_group_id        = "${aws_security_group.ec2_access.id}"
    source_security_group_id = "${aws_security_group.bastion_access.id}"
    to_port                  = 65535
    description              = "Only allow traffic from the Bastion boxes"
    lifecycle {
        create_before_destroy = true
    }
}

resource "aws_security_group_rule" "ec2_ingress_alb" {
    type                     = "ingress"
    from_port                = 0
    protocol                 = "all"
    security_group_id        = "${aws_security_group.ec2_access.id}"
    source_security_group_id = "${aws_security_group.alb_access.id}"
    to_port                  = 65535
    description              = "Only allow traffic from the load balancers"
    lifecycle {
        create_before_destroy = true
    }
}

plan wants to make this change:

terraform show debug/proposed-changes.plan
  ~ module.security-group.aws_security_group_rule.ec2_ingress_alb
      description: "Only allow traffic from the Bastion boxes" => "Only allow traffic from the load balancers"

If I start fresh, commenting out the description attributes, I can run plan as many times as I want and Terraform rightfully thinks that no changes have to be applied.
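
The workaround described above, sketched for one of the two rules. Nothing here is new apart from the commented-out line; it only hides the spurious diff, and the rule ends up with no description in AWS, so it is a stopgap rather than a fix.

```hcl
# Sketch of the workaround: same rule as above, description commented out.
# aws_security_group.ec2_access and aws_security_group.alb_access are
# assumed to be defined elsewhere in the module, as in the original.
resource "aws_security_group_rule" "ec2_ingress_alb" {
    type                     = "ingress"
    from_port                = 0
    protocol                 = "all"
    security_group_id        = "${aws_security_group.ec2_access.id}"
    source_security_group_id = "${aws_security_group.alb_access.id}"
    to_port                  = 65535
    # description            = "Only allow traffic from the load balancers"
    lifecycle {
        create_before_destroy = true
    }
}
```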

kurron added a commit to kurron/terraform-aws-security-groups that referenced this issue Oct 20, 2017
@jbardin
Member

jbardin commented Oct 20, 2017

@kurron,

That's an interesting error too, which may be a provider issue, but I'll leave this here for now until we can investigate further.

Extra notes: not only is the diff somehow getting the incorrect description field, but running apply again fails with an error that from-port isn't allowed, and destroying fails on the first attempt with "rule does not exist".

@talbright

talbright commented Nov 2, 2017

Terraform v0.10.8

I can confirm similar behavior as described by @kurron

After my first plan and apply, with no changes to my TF files or the state of the resources in AWS:

  • Terraform plan wants to rename some of my aws_security_group_rule.descriptions
  • Terraform plan wants to rename these descriptions incorrectly
  • Terraform plan always uses the same source description for the destination description
  • Plan apply succeeds
  • After plan apply no visible description changes are made

Snippet from plan:

  ~ module.table.aws_security_group_rule.ec2_admin_rdp_theirco_cidr01
      description:                       "MYCo: OFFICE1 IP block" => "TheirCo: 100.xxx.xxx.xxx"

  ~ module.table.aws_security_group_rule.ec2_admin_rdp_theirco_cidr02
      description:                       "MYCo: OFFICE1 IP block" => "TheirCo: 111.xxx.xxx.xxx"

  ~ module.table.aws_security_group_rule.ec2_admin_rdp_theirco_cidr03
      description:                       "MyCo: OFFICE1 IP block" => "TheirCo: 222.xxx.xxx.xxx"

@battenworks

Terraform v0.10.8

I can also confirm this behavior. I have several aws_security_group_rule resources, and Terraform wants to update the description fields for all but one of the resources on each plan/apply.

It seems to be in the logic that creates the .tfstate on apply. While the description fields on each inbound rule are correctly applied in AWS, each resource has the same description value written to the .tfstate, so when we do a plan/apply, Terraform needs to change them. Terraform then incorrectly applies the same description to all the resources in the .tfstate again.

@pf-curtis-mitchell

There is a PR that addresses the state problem in the AWS provider.

@jbardin
Member

jbardin commented Nov 16, 2017

The error I reopened this for has since been fixed, so closing it back out once and for all.

@jbardin jbardin closed this as completed Nov 16, 2017
@rickkbarbosa

Hello all,

Sorry to reopen this case, but I think I can help a little more with reproducing this error.

I have the same problem here. It's related to ingress descriptions: it happened when I tried to append multiple ingress rules with the same description.

For some reason, when I retried terraform apply, the tfstate didn't read the previous change correctly.

Hope it helps,

@Dmitry1987

This happens to me when a resource with the same name already exists but was not created by Terraform. Instead of proceeding, TF fails. I have an "aws_security_group" which may already exist, but if it doesn't, it needs to be created. Is this the correct behavior of TF? Or can I flag it to "skip if exists" somehow?
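
One common way to handle a group that already exists is to bring it under Terraform's management with terraform import, rather than asking Terraform to skip it. A minimal sketch, not something confirmed in this thread: the resource name "existing", the group name, the description, and the group ID are all hypothetical placeholders, and the arguments would need to match the real group or the next plan may propose replacing it.

```hcl
# Hypothetical example: "existing", the name, and the description are placeholders.
# Declare the group so Terraform has a resource address to attach it to.
resource "aws_security_group" "existing" {
  name        = "my-existing-group"
  description = "Must match the description of the real group"
  vpc_id      = "${var.vpc_id}"
}

# Then, from the shell, attach the real group to that address:
#   terraform import aws_security_group.existing sg-xxxxxxxx
# (sg-xxxxxxxx stands for the real group ID.)
```

After the import, terraform plan shows how the declared arguments differ from the real group, which is a reasonable check before running apply.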

@ghost

ghost commented Apr 2, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 2, 2020