Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: Handle the implied change from zero-key to no-key (or vice-versa) as if it were a move statement #29605

Merged
merged 4 commits into from
Sep 20, 2021

Conversation

apparentlymart
Copy link
Contributor

@apparentlymart apparentlymart commented Sep 17, 2021

Going back a long time we've had a special magic behavior which tries to recognize a situation where a module author either added or removed the count argument from a resource that already has instances, and to silently rename the zeroth or no-key instance so that we don't plan to destroy and recreate the associated object.

We previously handled this as a post-processing step with a strange extra graph node -- an instance of NodeCountBoundary.

Now we have a more general idea of "move statements" we can treat this magic renaming rule as just another "move statement", special only in that Terraform generates it automatically rather than it being written out explicitly in the configuration.

This set of changes therefore first introduces the concept of an "implied" move statement, replicating the same decision logic previously used by the NodeCountBoundary.Execute method, and then includes the implied move statements in the pre-planning move execution so that we can safely remove the post-processing graph node.


This also includes a fix to refactoring.ApplyMoves, which was previously not handling the situation of being asked to move two objects to the same address. Previously it would panic, whereas now it'll just ignore the conflicting move statement. (Hopefully in future we'll make it generate a user-visible warning too, but we'd need to change the refactoring.ApplyMoves API to achieve that and we're already planning to do that for other reasons in a separate PR.)

This can happen in the explicit move case if we have a chain of moves and some edge cases have previously caused two addresses in the chain to both be present, or if we have moves both of entire modules and of individual resources that end up overlapping with one another.

refactoring.ImpliedMoveStatements also intentionally creates that situation if we have both a zero-key and a no-key instance of a resource that's still in the configuration, because we want refactoring.ApplyMoves to treat that collision exactly the same as an explicit one, so we have that logic encoded in only one place.

This is similar to the existing SelectsModule method, returning true if
the reciever selects either a particular resource as a whole or any of the
instances of that resource.

We don't need this test in the normal case, but we will need it in a
subsequent commit when we'll be possibly generating _implied_ move
statements between instances of resources, but only if there aren't
explicit move statements mentioning those resources already.
This new function complements the existing function FindMoveStatements
by potentially generating additional "implied" move statements that aren't
written explicit in the configuration but that we'll infer by comparing
the configuration and te previous run state.

The goal here is to infer only enough to replicate the effect of the
"count boundary fixup" graph node (terraform.NodeCountBoundary) that we
currently use to deal with this concern of preserving the zero-instance
when switching between "count" and not "count".

This is just dead code for now. A subsequent commit will introduce this
into the "terraform" package while also removing
terraform.NodeCountBoundary, thus achieving the same effect as before but
in a way that'll get reported in the UI as a move, using the same language
that we'd use for an explicit move statement.
Per our rule that the content of the state can never make a move statement
invalid, our behavior for two objects trying to occupy the same address
will be to just ignore that and let the object already at the address
take priority.

For the moment this is silent from an end-user perspective and appears
only in our internal logs. However, I'm hoping that our future planned
adjustment to the interface of this function will include some way to
allow reporting these collisions in some end-user-visible way, either as
a separate warning per collision or as a single warning that collects
together all of the collisions into a single message somehow.

This situation can arise both because the previous run state already
contained an object at the target address of a move and because more than
one move ends up trying to target the same location. In the latter case,
which one "wins" is decided by our depth-first traversal order, which is
in turn derived from our chaining and nesting rules and is therefore
arbitrary but deterministic.
@apparentlymart apparentlymart force-pushed the f-implicit-moved branch 2 times, most recently from 8d38592 to f5281ab Compare September 17, 2021 23:09
Going back a long time we've had a special magic behavior which tries to
recognize a situation where a module author either added or removed the
"count" argument from a resource that already has instances, and to
silently rename the zeroth or no-key instance so that we don't plan to
destroy and recreate the associated object.

Now we have a more general idea of "move statements", and specifically
the idea of "implied" move statements which replicates the same heuristic
we used to use for this behavior, we can treat this magic renaming rule as
just another "move statement", special only in that Terraform generates it
automatically rather than it being written out explicitly in the
configuration.

In return for wiring that in, we can now remove altogether the
NodeCountBoundary graph node type and its associated graph transformer,
CountBoundaryTransformer. We handle moves as a preprocessing step before
building the plan graph, so we no longer need to include any special nodes
in the graph to deal with that situation.

The test updates here are mainly for the graph builders themselves, to
acknowledge that indeed we're no longer inserting the NodeCountBoundary
vertices. The vertices that NodeCountBoundary previously depended on now
become dependencies of the special "root" vertex, although in many cases
here we don't see that explicitly because of the transitive reduction
algorithm, which notices when there's already an equivalent indirect
dependency chain and removes the redundant edge.

We already have plenty of test coverage for these "count boundary" cases
in the context tests whose names start with TestContext2Plan_count and
TestContext2Apply_resourceCount, all of which continued to pass here
without any modification and so are not visible in the diff. The test
functions particularly relevant to this situation are:
 - TestContext2Plan_countIncreaseFromNotSet
 - TestContext2Plan_countDecreaseToOne
 - TestContext2Plan_countOneIndex
 - TestContext2Apply_countDecreaseToOneCorrupted

The last of those in particular deals with the situation where we have
both a no-key instance _and_ a zero-key instance in the prior state, which
is interesting here because to exercises an intentional interaction
between refactoring.ImpliedMoveStatements and refactoring.ApplyMoves,
where we intentionally generate an implied move statement that produces
a collision and then expect ApplyMoves to deal with it in the same way as
it would deal with all other collisions, and thus ensure we handle both
the explicit and implied collisions in the same way.

This does affect some UI-level tests, because a nice side-effect of this
new treatment of this old feature is that we can now report explicitly
in the UI that we're assigning new addresses to these objects, whereas
before we just said nothing and hoped the user would just guess what had
happened and why they therefore weren't seeing a diff.

The backend/local plan tests actually had a pre-existing bug where they
were using a state with a different instance key than the config called
for but getting away with it because we'd previously silently fix it up.
That's still fixed up, but now done with an explicit mention in the UI
and so I made the state consistent with the configuration here so that the
tests would be able to recognize _real_ differences where present, as
opposed to the errant difference caused by that inconsistency.
@apparentlymart apparentlymart marked this pull request as ready for review September 18, 2021 00:20
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants