Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move tests_common from "dev" to top-level. #42985

Merged
merged 3 commits into from
Oct 15, 2024
Merged

Move tests_common from "dev" to top-level. #42985

merged 3 commits into from
Oct 15, 2024

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Oct 13, 2024

Follow-up after #42505 fixing teething problem with tests_common.

Originally in #42505 common test code was moved to "dev" folder, but the "dev" folder is really dedicated to "build" scripts and the problem with moving "tests_common" to the folder was that the whole "dev" folder is replaced (for non-committer PRs) with the content from the target branch.

This is done for security reasons, because we can accidentally use any of the scripts from dev in the CI build scripts and we might not notice, which will open us to a security issue where a file in "dev" coming from PR could be accidentally executed during the "pull_request_target" workflow - which would expose our secrets and GitHub Package write permissions to a contributor coming from a fork.

This change moves the files, fixes pre-commit specification and docs, also fixes a number of "doc" issues detected by "ruff" in the tests_common folder as they were detected after the move. The tests_common folder is added to folders mounted when breeze is executed with local folders mounted (in order to avoid accidental mounting of randomly generated files to inside the breeze container).

All imports for the common tests were updated to reflect this move.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Oct 13, 2024

This fixes the problem attempted to be fixed in #42971 by @gopidesupavan who experienced this issue in #42081

@potiuk
Copy link
Member Author

potiuk commented Oct 13, 2024

Unfortunately it's huge and we can't do much about it :(

Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some sanity question - a LOT of copy & paste... smells some decorators or generaotrs could remove a lot of duplication?

Crossing fingers you get it green.

For easier review... might be better to make some smaller incremental moves w/o any content changes, then git diff is faster to scroll through :-D

@potiuk
Copy link
Member Author

potiuk commented Oct 14, 2024

For easier review... might be better to make some smaller incremental moves w/o any content changes, then git diff is faster to scroll through :-D

Good point. And really reasonable.

I tried to make it smaller, but all the necessary changes result from the move (i.e. in pretty much all files import will have to change. And then our pre-commit/formatting also intervenes. The comment from @ashb will likely limit the number of content changes. So in general we won't avoid content changes.

But - yes - I could split it to several commits though if you think it help to review it - I will attempt to do it, that will be rather easy. It's also a good exercise with some git commands to produce such commits :D so let me exercise my fingers.

I could also separate docstring fixes - dev has different rules than the "actual" code - including tests, but I figured that it's docstring only and there are not many of those - mostly empty lines and dots and changing to imperative statements in the first line. But I will see if that is easy - if not I would leave them together with the import changes.

@jscheffl
Copy link
Contributor

I could also separate docstring fixes - dev has different rules than the "actual" code - including tests, but I figured that it's docstring only and there are not many of those - mostly empty lines and dots and changing to imperative statements in the first line. But I will see if that is easy - if not I would leave them together with the import changes.

That was not meant to be a blocker for the current PR but to have more friends for the next one :-D

@potiuk
Copy link
Member Author

potiuk commented Oct 14, 2024

That was not meant to be a blocker for the current PR but to have more friends for the next one :-D

I split it REGARDLESS -> There are 3 commits now - move, docs, make it works. You can re-review now, I 🤞 it's going to pass tests now.

@potiuk
Copy link
Member Author

potiuk commented Oct 14, 2024

It's actually pretty easy to split such PR if you know a few git tricks

git reset HEAD^ --hard
mv dev/tests_common . 
git add .
git commit --no-verify -m "'Move tests common without changes"
git cherry-pick <previous branch tip from git reflog>
git add tests_common
git commit --no-verify -m "Fix docstrings in tests_common"
git add . 
git commit # copy-paste previous commit message

@potiuk
Copy link
Member Author

potiuk commented Oct 14, 2024

BTW. When you look at it now - it does not necessarily make it easier to review in GH UI, but git diff HEAD^ is actually pretty reviewable - even if long. (and the first two commits are easily reviewable).

@potiuk
Copy link
Member Author

potiuk commented Oct 14, 2024

OK. Looks like we are getting there. There were few other places where there get_test_run had one more empty line.

BTW. That's the first time I started to like sphinx - this time it actually prevented us from accidental error that would have potentially caused silently slower-running system tests.

(well @jscheffl to notice it with review but sphinx also guarding it).

airflow/models/dag.py Outdated Show resolved Hide resolved
Follow-up after #42505 fixing teething problem with tests_common.

Originally in #42505 common test code was moved to "dev" folder, but
the "dev" folder is really dedicated to "build" scripts and the
problem with moving "tests_common" to the folder was that the
whole "dev" folder is replaced (for non-committer PRs) with
the content from the target branch.

This is done for security reasons, because we can accidentally
use any of the scripts from dev in the CI build scripts and we
might not notice, which will open us to a security issue where
a file in "dev" coming from PR could be accidentally executed
during the "pull_request_target" workflow - which would expose
our secrets and GitHub Package write permissions to a
contributor coming from a fork.

This change moves the files, fixes pre-commit specification and
docs, also fixes a number of "doc" issues detected by "ruff" in
the tests_common folder as they were detected after the move.
The tests_common folder is added to folders mounted when
breeze is executed with local folders mounted (in order to
avoid accidental mounting of randomly generated files to
inside the breeze container).

All imports for the common tests were updated to reflect this
move.
@potiuk potiuk merged commit cd1e9f5 into main Oct 15, 2024
79 checks passed
@potiuk
Copy link
Member Author

potiuk commented Oct 15, 2024

Green, reviewed, merged.

@uranusjr uranusjr deleted the move-tests-common branch October 15, 2024 03:29
@gopidesupavan
Copy link
Collaborator

cool, Thank you @potiuk 😄

kaxil added a commit to astronomer/airflow that referenced this pull request Oct 16, 2024
This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https:/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https:/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
kaxil added a commit that referenced this pull request Oct 16, 2024
This was missed in #42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https:/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https:/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
R7L208 pushed a commit to R7L208/airflow that referenced this pull request Oct 17, 2024
* Move tests common without changes

* Fix docstrings in tests_common

* Move tests_common from "dev" to top-level.

Follow-up after apache#42505 fixing teething problem with tests_common.

Originally in apache#42505 common test code was moved to "dev" folder, but
the "dev" folder is really dedicated to "build" scripts and the
problem with moving "tests_common" to the folder was that the
whole "dev" folder is replaced (for non-committer PRs) with
the content from the target branch.

This is done for security reasons, because we can accidentally
use any of the scripts from dev in the CI build scripts and we
might not notice, which will open us to a security issue where
a file in "dev" coming from PR could be accidentally executed
during the "pull_request_target" workflow - which would expose
our secrets and GitHub Package write permissions to a
contributor coming from a fork.

This change moves the files, fixes pre-commit specification and
docs, also fixes a number of "doc" issues detected by "ruff" in
the tests_common folder as they were detected after the move.
The tests_common folder is added to folders mounted when
breeze is executed with local folders mounted (in order to
avoid accidental mounting of randomly generated files to
inside the breeze container).

All imports for the common tests were updated to reflect this
move.
R7L208 pushed a commit to R7L208/airflow that referenced this pull request Oct 17, 2024
This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https:/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https:/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants