
Unexpected type binding when wrapping functions #12919

Closed
mailund opened this issue Jun 1, 2022 · 3 comments · Fixed by #15287
Labels: bug (mypy got something wrong)

Comments


mailund commented Jun 1, 2022

Bug Report

I admit that I do not know whether this is a bug or just my incomplete understanding of how the type system works, but I have run into some unexpected behaviour, and behaviour where mypy and pyre disagree, so it might be a bug.

I am writing code that lifts functions T -> R to Optional[T] -> Optional[R] in various ways, and when I write wrapper functions, type variables get bound in ways I didn't expect. It is particularly bad when I add protocols for the various types on top of it, but I can trigger the unexpected behaviour without them. Basically, if I write a function that turns a Callable[[T], R] into a Callable[[Optional[T]], Optional[R]] and then apply it to another generic function of type Callable[[T], R], the type variables are no longer generic and I cannot apply the wrapped function. This happens both when I implement the wrapper as a function and when I implement it as a Generic class.

To Reproduce

If I write a wrapper with a function, it might look like this:

from typing import (
    TypeVar, 
    Protocol, Generic, 
    Optional as Opt
)


_T = TypeVar('_T')

class _F(Protocol[_T]):
    def __call__(self, __x: _T) -> _T:
        ...

# Lifting with a function...
def lift(f: _F[_T]) -> _F[Opt[_T]]:
    # This function would do more, but I just return f for the
    # type checking example
    return f # type: ignore

I get the desired behaviour if I wrap a function with a concrete type

def f(x: int) -> int:
    return x

reveal_type(lift(f))  # _F[Opt[int]] -- as expected

but if I use a generic type, the wrapped function is not generic:

def g(x: _T) -> _T:
    return x

reveal_type(lift(g)) # _F[Union[_T`-1,None]] -- _T is bound (incorrectly?)
                     # pyre thinks this is _F[None] which is also odd...
lift(g)(1)  # so won't accept int

Expected Behavior

I was expecting the wrapped function to have the type Callable[[Opt[_T]], Opt[_T]] and in particular I would expect to be able to call it with an int argument.

To Reproduce

If I use a Generic class for the lifted function, it can look like this:

# Moving generic type to Generic...
class lifted(Generic[_T]):
    """Lift type _T to Opt[_T]."""

    def __init__(self, f: _F[_T]) -> None:
        self._f = f

    def __call__(self, x: Opt[_T], /) -> Opt[_T]:
        """Call lifted."""
        return self._f(x) # type: ignore

reveal_type(lifted[_T])  # _F[T?] -> lifted[_T?]

and it will work for a function with a concrete type, like f: int -> int above

reveal_type(lifted(f))          # lifted[int] -- perfect
reveal_type(lifted(f).__call__) # Opt[int] -> Opt[int] -- perfect
lifted(f)(1)                    # We can call with int
lifted(f)(None)                 # We can call with None

but things get crazy when I use a generic function like g: _T -> _T:

reveal_type(lifted(g))            # lifted[_T`1] -- _T is bound!
                                  # pyre thinks (correctly I think) it is lifted[_T]
reveal_type(lifted(g).__call__)   # Opt[_T`1] -> Opt[_T`1]
                                  # pyre says Any -> Any which I suppose is fine
lifted(g)(1)                      # incompatible type, int isn't an Opt[_T]
                                  # pyre seems to accept this one
lifted(g)(None)                   # but at least None is in Opt[_T]...

reveal_type(lifted[_T](g))          # lifted[_T?] -- Ok
reveal_type(lifted[_T](g).__call__) # def (None) ???
                                    # pyre: Opt[_T] -> Opt[_T] as expected

lifted[_T](g)(1)             # incompatible type "int"; expected "None"
lifted[_T](g)(None)          # but at least None works...
# Pyre doesn't like the [_T] here but accepts the calls without it...

Expected Behavior

Again, I would expect the Generic lifted class to accept types that match the wrapped function's type. This is more important when I need to add protocols to the types, but I would also expect it to work here.

Your Environment

I tested this in the mypy playground and the pyre playground.

mailund added the bug (mypy got something wrong) label on Jun 1, 2022
@erictraut

Type variables are "solved" in the context of a call expression that targets the constructor of a generic class or a generic function. Only the type variables in the scope of that class or function are solved. Once a type variable goes out of scope, it can no longer be solved.

The expression lifted(g) causes mypy to solve the TypeVar _T@lifted (i.e. _T in the scope of class lifted). The solution in this case is _T@g, which means that lifted(g) evaluates to type lifted[_T@g]. However, at this point, _T@g is no longer in scope. It's an unsolved TypeVar from the scope of function g. It is not associated with the lifted class or its __call__ method, which means it cannot be solved in subsequent call expressions like lifted(g)(1).

FWIW, pyright largely agrees with mypy in this case.

So I don't think this is a bug in mypy. It's expected behavior.


mailund commented Jun 1, 2022

OK, I think I understand what you are saying, @erictraut, but let me just check: if I use a generic function or class with a type variable, I bind it when I apply it, and it no longer functions as a type variable. If I want the wrapped function to work, the wrapper has to know the concrete type.

So, for example, with a curry function (as a Generic)

from typing import (
    TypeVar, ParamSpec,
    Callable as Fn,
    Optional as Opt,
    Generic,
)

_T = TypeVar('_T')

class curry(Generic[_T]):
    def __init__(self, f: Fn[[_T, _T], _T]) -> None:
        self._f = f
    def __call__(self, x: _T) -> Fn[[_T], _T]:
        return lambda y: self._f(x, y)

I can't do something like

@curry
def f(x: _T, __y: _T) -> _T:
    return x

because _T isn't free after wrapping, and I cannot do this either

@curry[_T]
def f(x: _T, __y: _T) -> _T:
    return x

because I can't use a type variable like this when it is unbound?

I can, however, use it with concrete types

def f(x: _T, __y: _T) -> _T:
    return x

fi = curry[int](f)
reveal_type(fi)                  # curry[int]
reveal_type(fi(1))               # def int -> int
reveal_type(fi(1)(2))            # int

fs = curry[str](f)
reveal_type(fs)                  # curry[str]
reveal_type(fs("foo"))           # str -> str
reveal_type(fs("foo")("bar"))    # str

or I can use _T inside another function that itself uses the type variable, because there it will be bound when I evaluate that function, and thus I get a concrete type there:

def bind(x: _T, f: Fn[[_T,_T], _T]) -> Fn[[_T], _T]:
    # ok here since _T is bound when we evaluate
    res = curry[_T](f)
    return res(x)
    
b = bind(12, f)
reveal_type(b)  # int -> int

b2 = bind("foo", f)
reveal_type(b2)  # str -> str

As long as I don't use an unbound type variable when instantiating the Generic here, it seems to work. Is that guaranteed, and is that the way to achieve what I tried above?

@sterliakov (Contributor)

Just adding my 2 cents here, because I've faced this problem before (and ended up with a very ugly pattern that violates DRY; see below).

Related: #1317

It is understandable why mypy "solves" this variable earlier than we want. But there is no way to define the following decorator: "given a function that takes a fixed set of arguments (say, f(x, y) implementing a min-like signature, f(x: _T, y: _T) -> _T), produce a function with the same signature but with the types Transformed[_T]", where Transformed is expressible with typing primitives and depends only on _T (propagating Optional is just one possible use case). It is important that _T is the same for the decorated function and the result, but we do not know the value of _T in advance.

I think the following (very rough) approach would be better:

Given the decorator:

from typing import Callable, SupportsInt, TypeVar

_T = TypeVar('_T')
_P = TypeVar('_P', bound=SupportsInt)

def allow_none(f: Callable[[_T], _T]) -> Callable[[_T | None], _T | None]:
    def inner(x: _T | None, /) -> _T | None:
        return None if x is None else f(x)
    return inner

@allow_none
def foo(x: _P) -> _P:
    if x is None:
        raise ValueError
    return x

After producing F[_P | None], the type checker should recognize that _P is used on both the LHS and the RHS of the Callable definition. When that happens, _P should go back into F's scope, allowing the expected substitution with respect to the SupportsInt bound.

The same should happen if the resulting type is a Generic or Protocol: instead of producing a generic class with an unsolvable type variable, mypy should keep it unresolved. Right now any function that performs a similar operation (f(x: G[_T]) -> G[_T | None], where G is any Protocol or Generic) is malformed: when applied to a generic with type variable G[_T], it produces G[_T | None] with _T out of scope, so only Any and None are compatible. That is definitely not the desired behavior.
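A minimal sketch of that Generic/Protocol case, using made-up names (G as a callback protocol, widen as the decorator), showing how mypy behaved at the time of this discussion:

from typing import Optional, Protocol, TypeVar

_T = TypeVar('_T')

class G(Protocol[_T]):
    def __call__(self, __x: _T) -> _T: ...

# The f(x: G[_T]) -> G[_T | None] shape described above.
def widen(f: G[_T]) -> G[Optional[_T]]:
    def inner(x: Optional[_T], /) -> Optional[_T]:
        return None if x is None else f(x)
    return inner

def identity(x: _T) -> _T:
    return x

wrapped = widen(identity)
# _T leaks from identity's scope, so the result is roughly G[Optional[_T`-1]]:
wrapped(None)  # accepted
wrapped(1)     # rejected, int is not compatible with the leaked Optional[_T]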

I suppose that no existing code should be broken by such a change. The resolution algorithm currently produces unusable functions (they don't accept the expected arguments), so usage of such decorators must be very rare and complex. When the decorated function uses concrete types, this approach changes nothing.

The only solution right now is to explicitly re-declare the type of the decorated function. But why are they compatible, if foo(1) is OK while allow_none(_foo)(1) throws the mypy error E: Argument 1 has incompatible type "int"; expected "Optional[_P]"? I don't know; it seems inconsistent, with the LHS type being wider, like in x: int = 0.1, isn't it?

...  # See example above

def _foo(x: _P) -> _P:
    if x is None:
        raise ValueError
    return x

foo: Callable[[_P | None], _P | None] = allow_none(_foo)

reveal_type(foo)  # N: Revealed type is "def [_P <: typing.SupportsInt] (Union[_P`-1, None]) -> Union[_P`-1, None]"
reveal_type(foo(None))  # N: Revealed type is "None"
reveal_type(foo(1))  # N: Revealed type is "Union[builtins.int, None]"

ilevkivskyi added a commit that referenced this issue on Jun 18, 2023:
Fixes #1317
Fixes #5738
Fixes #12919 
(also fixes a `FIX` comment that is more than 10 years old according to
git blame)

Note: although this PR fixes the most typical use cases for type inference
against generic functions, it is intentionally incomplete, and it is
made in a way that limits the implications to a small scope.

This PR has essentially three components (better infer, better solve,
better apply - all three are needed for this MVP to work):
* A "tiny" change to `constraints.py`: if the actual function is
generic, we unify it with template before inferring constraints. This
prevents leaking generic type variables of actual in the solutions
(which makes no sense), but also introduces new kind of constraints `T
<: F[S]`, where type variables we solve for appear in target type. These
are much harder to solve, but also it is a great opportunity to play
with them to prepare for single bin inference (if we will switch to it
in some form later). Note unifying is not the best solution, but a good
first approximation (see below on what is the best solution).
* A new, more sophisticated constraint solver in `solve.py`. The full
algorithm is outlined in the docstring for `solve_non_linear()`. It
looks like it should be able to solve arbitrary constraints that don't
(indirectly) contain "F-bounded" things like `T <: list[T]`. In short,
the idea is to compute the transitive closure, then organize constraints by
topologically sorted SCCs.
* Polymorphic type argument application in `checkexpr.py`. In cases
where the solver identifies that there are free variables (e.g. we have just one
constraint `S <: list[T]`, so `T` is free, and the solution for `S` is
`list[T]`), it will apply the solutions while creating new generic
functions. For example, if a function `def [S, T] (fn:
Callable[[S], T]) -> Callable[[S], T]` is applied to a function `def [U]
(x: U) -> U`, this will result in `def [T] (T) -> T` as the return type
(sketched just below).
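A minimal, self-contained sketch of that last example (the function names here are illustrative and the revealed types approximate):

```python
from typing import Callable, TypeVar

S = TypeVar("S")
T = TypeVar("T")
U = TypeVar("U")

def dec(fn: Callable[[S], T]) -> Callable[[S], T]:
    # A no-op wrapper, just to exercise inference against a generic argument.
    return fn

def ident(x: U) -> U:
    return x

# With polymorphic application, U is no longer leaked out of scope; instead
# a new generic callable is created, so the result can still be called with
# any argument type.
reveal_type(dec(ident))      # roughly: def [T] (T) -> T
reveal_type(dec(ident)(42))  # should be int once the result stays generic
```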

I want to put down some thoughts on the last ingredient here, since it may seem
mysterious, but it now seems to me that it is actually a very well-defined
procedure. The key point is thinking about generic functions as
infinite intersections or infinite overloads. If we reduce these
infinite overloads/intersections to finite ones, it is easy to understand
what is actually going on. For example, imagine we live in a world with
just two types, `int` and `str`, and we have two functions:
```python
T = TypeVar("T")
S = TypeVar("S")
U = TypeVar("U")

def dec(fn: Callable[[T], S]) -> Callable[[T], S]: ...
def id(x: U) -> U: ...
```
the first one can be seen as an overload over
```
((int) -> int) -> ((int) -> int)  # 1
((int) -> str) -> ((int) -> str)  # 2
((str) -> int) -> ((str) -> int)  # 3
((str) -> str) -> ((str) -> str)  # 4
```
and the second as an overload over
```
(int) -> int
(str) -> str
```
Now, what happens when I apply `dec(id)`? We need to choose an overload
that matches the argument (this is what we call type inference), but
here is the trick: in this case two overloads of `dec` match the argument
type. So (and by the way, I think we are missing this for real overloads) we
construct a new overload that is the intersection of the matching overloads
`# 1` and `# 4`. Generalizing this intuition, inference is the selection
of an (infinite) parametrized subset of the bigger parametrized set of
intersecting types. The only question is whether the resulting infinite
intersection is representable in our type system. For example,
`forall T. dict[T, T]` can make sense but is not representable, while
`forall T. (T) -> T` is a well-defined type. Finally, there is a very easy
way to find out whether a type is representable or not: we are already doing
this during semantic analysis. I use the same logic (which I used to view as
ad hoc because of the lack of good syntax for callables) to bind type
variables in the inferred type.
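To make the two-type intuition concrete, here is the finite analogue written with real `@overload` syntax (only an illustration; as noted above, mypy does not currently construct such intersections for real overloads):

```python
from typing import Any, Callable, overload

@overload
def dec_finite(fn: Callable[[int], int]) -> Callable[[int], int]: ...  # 1
@overload
def dec_finite(fn: Callable[[int], str]) -> Callable[[int], str]: ...  # 2
@overload
def dec_finite(fn: Callable[[str], int]) -> Callable[[str], int]: ...  # 3
@overload
def dec_finite(fn: Callable[[str], str]) -> Callable[[str], str]: ...  # 4
def dec_finite(fn: Callable[..., Any]) -> Callable[..., Any]:
    return fn

# For dec_finite(id), variants 1 and 4 both match, and the generic
# `def [T] (T) -> T` is exactly the representable intersection of those two
# matches; that is what the new polymorphic inference returns for the
# genuinely generic `dec`.
```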

OK, so here is the list of missing features, and some comments on them:
1. Instead of unifying the actual with the template, we should include the
actual's variables in the variable set we solve for, as explained in
#5738 (comment). Note,
however, that this will only work together with the next item.
2. We need to (iteratively) infer secondary constraints after linear
propagation, e.g. `Sequence[T] <: S <: Sequence[U] => T <: U`
3. Support `ParamSpec` (and probably `TypeVarTuple`). Current support
for applying callables with `ParamSpec` to generics is hacky, and kind
of a dead end. Although `(Callable[P, T]) -> Callable[P, List[T]]` works
when applied to `id`, even a slight variation like `(Callable[P,
List[T]]) -> Callable[P, T]` fails (see the sketch after this list). I think
it needs to be re-worked in the framework I propose (the tests I added are
just to be sure I don't break existing code).
4. Support actual types that are generic in type variables with upper
bounds or values (likely we just need to be careful when propagating
constraints and choosing the free variable within an SCC).
5. Add backtracking for the upper/lower bound choice. In general, in the
current "Hanoi Tower" inference scheme it is very hard to backtrack, but
for this specific choice in the new solver it should be entirely
possible to switch from the lower to the upper bound on a previous step if we
find no solution (or `<nothing>`/`object`).
6. After we polish it, we can use the new solver in more situations,
e.g. for return type context, and for unification during callable
subtyping.
7. Long term, we may want to allow instances to bind type variables, at
least for things like `LRUCache[[x: T], T]`. By the way, note that I
force-expand type aliases and callback protocols, since I can't
transform e.g. `A = Callable[[T], T]` into a generic callable without
getting the proper underlying type.
8. We need to figure out a solution for scenarios where non-linear
targets with free variables and constant targets mix without secondary
constraints, like `T <: List[int], T <: List[S]`.
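A hedged sketch of the `ParamSpec` limitation mentioned in item 3; the decorator names here are made up, and only the overall shapes match the PR description:

```python
from typing import Callable, List, TypeVar
from typing_extensions import ParamSpec

P = ParamSpec("P")
T = TypeVar("T")
U = TypeVar("U")

def listify(fn: Callable[P, T]) -> Callable[P, List[T]]:
    # The (Callable[P, T]) -> Callable[P, List[T]] shape that, per item 3,
    # works when applied to a generic identity function.
    def inner(*args: P.args, **kwargs: P.kwargs) -> List[T]:
        return [fn(*args, **kwargs)]
    return inner

def delistify(fn: Callable[P, List[T]]) -> Callable[P, T]:
    # The slight variation (Callable[P, List[T]]) -> Callable[P, T] that,
    # per item 3, currently fails when applied to a generic function.
    def inner(*args: P.args, **kwargs: P.kwargs) -> T:
        return fn(*args, **kwargs)[0]
    return inner

def ident(x: U) -> U:
    return x

def singleton(x: U) -> List[U]:
    return [x]

reveal_type(listify(ident))        # per item 3, this shape works
reveal_type(delistify(singleton))  # per item 3, this one still needs the re-work
```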

I am planning to address at least the majority of the above items, but I
think we should move slowly, since in my experience type inference is a
really fragile topic with hard-to-predict, long-reaching consequences.
Please play with this PR if you want to and have time, and please
suggest tests to add.