Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to pydantic 2 #122

Closed
coretl opened this issue Jul 10, 2024 · 9 comments · Fixed by #127
Closed

Move to pydantic 2 #122

coretl opened this issue Jul 10, 2024 · 9 comments · Fixed by #127
Assignees

Comments

@coretl
Copy link
Contributor

coretl commented Jul 10, 2024

No description provided.

@callumforrester
Copy link
Contributor

From discussion with @coretl:

The main remaining complication is that ScanSpecs and Regions are pydantic dataclasses instead of base models, we did that because we wanted positional arguments. Unfortunately it looks like pydantic v2 dataclasses have more issues than v1 dataclasses.

We're thinking about converting the scanspecs and regions to base models and putting in some sort of shimming to allow positional args.

Options are:

  1. Create helper functions for creating scanspecs, so Line("x", 0, 10, 5) becomes line("x", 0, 10, 5). This adds boilerplate but is the most flexibile, since we could add nice utility functions to create complex, commonly compositions, e.g. Line("x", 0, 10, 5) * Line(y, 0, 10, 5) -> grid_scan("x", 0, 10, 5, "y", 0, 10, 5).
  2. Create a common base class with an overriding init method that takes positional args. This is the simplest approach and adds a minimal amount of magic
  3. Some combination of the above

@callumforrester
Copy link
Contributor

@coretl
Copy link
Contributor Author

coretl commented Jul 19, 2024

In order of preference:

  1. Make the pydantic.dataclass implementation that was already there work with v2
  2. Provide some magic in the Spec baseclass that does this
  3. Ditch positional args, and provide helpers like grid_scan that re-introduce them

3 was mostly rejected because of the sheer volume of tests and docs that use positional args

2 could be accomplished messily at runtime only (type checker will still moan that it doesn't support positional args):

from pydantic import BaseModel, Field

class Spec(BaseModel):
    @classmethod
    def __pydantic_init_subclass__(cls, **kwargs):
        original_init = cls.__init__
        def patched_init(self, *args, **kwargs):
            for k, v in zip(cls.model_fields, args):
                kwargs[k] = v
            original_init(self, **kwargs)
        cls.__init__ = patched_init
    

class Line(Spec):
    start: float = Field(description="Midpoint of the first point of the line")
    stop: float = Field(description="Midpoint of the last point of the line")
    num: int = Field(min=1, description="Number of frames to produce")


obj = Line(3, 4, 5)

1 would be the neatest solution if someone could make it work...

@dperl-dls
Copy link

dperl-dls commented Jul 22, 2024

One of the main issues is that pydantic.dataclasses.dataclass creates a cls.__pydantic_validator__ according to a dataclass schema in pydantic._internal._dataclasses.py:181, which replaces whatever happen(s/ed) in scanspec.core._discriminated_union_of_subclasses, and also replaces the __init__ with validation using this validator. This then passes up a TypeError somewhere in the Rust validation part, because (at least as far as is reported) its own internal _pydantic_core.ArgsKwargs can't be converted to a PyTuple - if one inspects a simple dataclass this clearly isn't true, but it's not very easy to inspect what validator is doing in pydantic_core.validators.dataclass.rs (probably in line 632).

I think fixing this by post-dataclass-initialisation hacking over the validator is possible, but would be really messy and fragile, and I think option 3 is actually what would end up with the most maintainable, correctly typed result in the end

@callumforrester
Copy link
Contributor

I agree that messing with pydantic under the bonnet is increasingly fragile. Happy with 2. or 3.

@dperl-dls do you have a particular objection to 2.?

@dperl-dls
Copy link

I have no specific objection to it as such, I just think it might not be that easy.

Anyway, I might have to give up on this, I've tried a few different ways for a few days and I can't really get it to work. Starting from @evalott100 's WIP I have a branch here:
https:/bluesky/scanspec/tree/122_move_to_pydanticv2_option_3

The main problem I'm running into is that I'm finding it impossible to rebuild the model on finalising without pydantic complaining that I've provided multiple options for each tag

BTW when researching I found this discussion: pydantic/pydantic#8789 which would have been useful, so linking here for reference

@coretl
Copy link
Contributor Author

coretl commented Jul 29, 2024

Just to say I spent a couple of hours on this today and made a bit of progress. The dataclass approach seems to work with some hacky workarounds (I can prod the __pydantic_fields__ attribute of the dataclass and then deserialize works, but I need to call serialize_as_any to get serialize to work). I experimented with rebuild_dataclass but no joy yet. @gilesknap mentioned something about deferred union definitions possibly being a thing, so will have more of a look tomorrow.

@coretl
Copy link
Contributor Author

coretl commented Jul 31, 2024

Ok, I now have a working dataclass implementation here:
https:/coretl/pydantic-tagged-dc

@ZohebShaikh will take this ticket on to apply the strategy in this repo to scanspec

There is one remaining bug, if all Regions are created before Mask (which refers to Region) then it won't get the right Region tagged union. This should make some tests fail. I will try and fix that in this standalone repo and then that should fix the remaining tests

@coretl coretl assigned ZohebShaikh and unassigned evalott100 and dperl-dls Jul 31, 2024
@coretl
Copy link
Contributor Author

coretl commented Jul 31, 2024

I've fixed that issue, and I've also simplified and commented while I'm at it:
coretl/pydantic-tagged-dc#3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants