Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: autogenerate addcolumn migrations #6053

Merged
merged 17 commits into from
Jun 28, 2024

Conversation

kylemumma
Copy link
Member

@kylemumma kylemumma commented Jun 24, 2024

the entry point is main.py:generate which gets called by a very thin layer over the cli

note this PR will work as it is but u may need to delete the generated migration before re-running, it depends on #6047 before its ready for real-user use.

major additions from the last pr

  1. diff.py:_migration_ops_to_migration

In the last PR we had this:
(oldstorage, newstorage) --> MigrationOperations (ex. AddColumn)
now we have:
(oldstorage, newstorage) --> MigrationOperations --> migration.py (as a string)
this added transformation step is done by the new migration_ops_to_migration function

  • a few new/redefined repr functions on various objects (ex. AddColumn, utils.shemas, etc)

The new function migration_ops_to_migration is a transformation between python objects and python code. In order to implement it I had to make use of repr functions which required adding and redefining some of them.

  1. main.py:write_migration

This new function takes the generated migration.py-as-a-str and writes it to a python file at the correct location in the local repository.

  1. optional --name flag
    this was a very small change to the cli that didn't change too much, but users can now pass in an optional --name flag which affects the name of the python file that gets generated.

testing

  • test_diff --> test_generate_python_migration
    in the last pr test_diff was used to test (oldstorage, newstorage) --> MigrationOperations
    it was renamed to test_generate_python_migration and now tests
    (oldstorage, newstorage) --> migration.py (as a string)

Note, I made the decision not to create end-to-end tests like:
snuba migration generate path/to/storage --> correctly written .py file in the right place in the filesystem
I made this decision because I didnt want to create fake dirs and git repos and storages and use subprocess to do cli etc.

Instead I tested as unit tests which i think is ok because the connections between the units are very loose and imo safe and i verified by hand that the pipeline as a whole works.
The units I tested:

how to step through

you can use the unit tests, or you can do a workflow like:

  1. add a column to the storage definition snuba/snuba_migrations/events/storages/errors.yaml
  2. run snuba migrations generate snuba/snuba_migrations/events/storages/errors.yaml
  3. use pdb to step through that command running

@kylemumma kylemumma force-pushed the krm/autogenerate-add-migration-3 branch from 472faaa to b2c0184 Compare June 25, 2024 15:20
"""


def _is_valid_add_column(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only change here is that input is now a yaml.safe_load dict instead of a yaml str, logically no real changes

@kylemumma kylemumma marked this pull request as ready for review June 25, 2024 16:01
@kylemumma kylemumma requested a review from a team as a code owner June 25, 2024 16:01
Copy link
Member

@evanh evanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me!

definition for the migration as a str. The migration must be non-blocking.
"""

forwards_str = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

below code can be extracted into a separate helper function and forward_str and backwards_str is constructed using the helper function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the benefit of doing helper functions? (this one and the other one)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below code is converting an sequence of opertations into a string and it was repeated for both add and drop columns. This could be potentially same for modify columns. (this may be NIT)

For the second comment, Breaking out string generation into helper functions makes the main code more readable by reducing clutter and focusing on core logic.
consider refactoring into a parameterized helper function that generates variations of a static string using provided parameters. Isolated helper functions are easier to test, ensuring that the string outputs are correct and reducing the likelihood of bugs.

"[" + ", ".join([f"operations.{repr(op)}" for op in backwards_ops]) + "]"
)

return f"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we extract the below code into a separate helper functions which take forwards_str and backwards_str and return the generated migration class.

@kylemumma kylemumma force-pushed the krm/autogenerate-add-migration-3 branch from b2c0184 to 9a50f40 Compare June 28, 2024 16:26
@kylemumma kylemumma merged commit 1ccd0e3 into master Jun 28, 2024
28 checks passed
@kylemumma kylemumma deleted the krm/autogenerate-add-migration-3 branch June 28, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants