core: Fix crash with orphaned module instance #30151

Merged
alisdair merged 2 commits into main from f-non-existing-module-instance-crash on Dec 14, 2021

Conversation

alisdair
Contributor

Fixes #30110.

These commits are also on a shared working branch, which I've rebased and squashed so that we don't have a broken commit on main if this is merged. From @apparentlymart's commit comment:

Previously we were treating it as a programming error to ask for the instances of a resource inside an instance of a module that is declared but whose declaration doesn't include the given instance key.

However, that's actually a valid situation which can arise if, for example, the user has changed the repetition/expansion mode for an existing module call, so that all of the resource instance addresses it previously contained are now "orphaned".

To represent that, we'll instead say that an invalid instance key of a declared module behaves as if it contains no resource instances at all, regardless of the configurations of any resources nested inside. This then gives the result needed to successfully detect all of the former resource instances as "orphaned" and plan to destroy them.

However, this introduces a new case for NodePlannableResourceInstanceOrphan.deleteActionReason to deal with: the resource configuration still exists (because configuration isn't aware of individual module/resource instances) but the module instance does not. This also lets us resolve, at least partially, a previously missing piece of explanation: we can now tell the user explicitly that the resource instances are planned for deletion because their module instance was removed, a reason we represent internally as plans.ResourceInstanceDeleteBecauseNoModule.
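
To make the new behavior concrete, here is a minimal, hypothetical Go sketch of the lookup change (every name below is invented for illustration; the real change lives in Terraform's instance expander):

```go
// Hypothetical sketch, not the actual Terraform source.
package main

import "fmt"

// instanceKey stands in for addrs.InstanceKey: nil (no key), an int
// (count), or a string (for_each).
type instanceKey interface{}

// moduleExpansion records which instance keys a module call currently
// declares, e.g. after switching from count = 2 to a plain single call.
type moduleExpansion struct {
	declaredKeys map[instanceKey]bool
}

// resourceInstanceKeys returns the instance keys of a resource nested
// inside the module instance identified by modKey. Previously an
// undeclared modKey was treated as a programming error (a panic); now it
// behaves as if that module instance contains no resource instances at
// all, so every instance recorded in state is detected as orphaned.
func (m moduleExpansion) resourceInstanceKeys(modKey instanceKey, configuredKeys []instanceKey) []instanceKey {
	if !m.declaredKeys[modKey] {
		// Before the fix: panic("unexpected module instance key")
		return nil
	}
	return configuredKeys
}

func main() {
	// module "foo" used to have count = 2 but is now a single instance.
	exp := moduleExpansion{declaredKeys: map[instanceKey]bool{nil: true}}

	// module.foo[0] is no longer declared: its resources report as empty,
	// so the instances still in state can be planned for destruction.
	fmt.Println(exp.resourceInstanceKeys(0, []instanceKey{0, 1}))   // []
	fmt.Println(exp.resourceInstanceKeys(nil, []instanceKey{0, 1})) // [0 1]
}
```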

apparentlymart and others added 2 commits December 13, 2021 10:03

Co-authored-by: Alisdair McDiarmid <[email protected]>
…dule

This is an explicit technical debt note: our plan renderer isn't able to give a fully specific hint for this particular deletion reason.

This reason code means that at least one of the module instance keys in the resource's module path doesn't match an instance declared in the configuration, but the plan data structure doesn't retain enough information to determine which step in the path is the first to refer to a missing instance, so we always return the whole path.

This would be confusing if, for example, we reported that module.foo[0].module.bar is not in the configuration when the real cause is that module.foo no longer uses "count"; it would be better to say "module.foo[0] is not in the configuration" instead.

Ideally we would also handle all of the different situations that ResourceInstanceDeleteBecauseWrongRepetition's rendering handles, so that we could go further and explain exactly _why_ that module instance is no longer declared.

We can do neither of those things today because only the Terraform Core "expander" component knows that information, and we've discarded it by the time we get to rendering a plan. Fixing this one day would require preserving in the plan a record of which module instances are declared, as a sidecar data structure separate from the resource instances we're taking actions on, and then using it to identify which step in addr.Module first selects an invalid instance.
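
A rough sketch of what that future fix could look like, assuming a hypothetical sidecar record of declared module instance addresses (all names below are invented; nothing like this exists in the plan format today):

```go
// Hypothetical sketch of the renderer improvement described above.
package main

import "fmt"

// moduleStep is one step in a module path, e.g. foo[0] in
// module.foo[0].module.bar.
type moduleStep struct {
	name string
	key  string // "" means no instance key
}

func (s moduleStep) String() string {
	if s.key == "" {
		return "module." + s.name
	}
	return fmt.Sprintf("module.%s[%s]", s.name, s.key)
}

// firstMissingPrefix walks the path and returns the shortest prefix whose
// final step is not declared: the most precise address to show the user.
func firstMissingPrefix(path []moduleStep, declared map[string]bool) string {
	prefix := ""
	for _, step := range path {
		if prefix != "" {
			prefix += "."
		}
		prefix += step.String()
		if !declared[prefix] {
			return prefix
		}
	}
	return prefix // whole path declared; shouldn't happen for this reason code
}

func main() {
	// module.foo no longer uses count, so module.foo[0] is undeclared and
	// so is everything beneath it.
	declared := map[string]bool{
		"module.foo": true, // the single instance that now exists
	}
	path := []moduleStep{{name: "foo", key: "0"}, {name: "bar"}}

	// Prints "module.foo[0]" rather than the whole
	// "module.foo[0].module.bar", which is what the renderer would
	// ideally report.
	fmt.Println(firstMissingPrefix(path, declared))
}
```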
@alisdair alisdair added the core and 1.1-backport labels Dec 13, 2021
@alisdair alisdair requested a review from a team December 13, 2021 15:11
@alisdair alisdair self-assigned this Dec 13, 2021
Contributor Author

@alisdair alisdair left a comment

I can't technically approve this because I opened the PR, but I've reviewed it carefully and this makes sense to me as the change to fix the issue.

@alisdair alisdair merged commit aa59eb4 into main Dec 14, 2021
@alisdair alisdair deleted the f-non-existing-module-instance-crash branch December 14, 2021 14:10
@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 14, 2022

Successfully merging this pull request may close these issues.

runtime error: invalid memory address or nil pointer -- v1.1.0-rc1
3 participants