From c13be2dad311c457215eb55dd378dee1949c5e8e Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 13 Sep 2022 13:54:06 -0700 Subject: [PATCH 01/19] Proposal: associated constant assignment versus equality --- proposals/p2173.md | 421 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 421 insertions(+) create mode 100644 proposals/p2173.md diff --git a/proposals/p2173.md b/proposals/p2173.md new file mode 100644 index 0000000000000..f5568f3510f95 --- /dev/null +++ b/proposals/p2173.md @@ -0,0 +1,421 @@ +# Associated constant assignment versus equality + + + +[Pull request](https://github.com/carbon-language/carbon-lang/pull/2173) + + + +## Table of contents + +- [Abstract](#abstract) +- [Problem](#problem) + - [Equal types with different interfaces](#equal-types-with-different-interfaces) + - [Type canonicalization](#type-canonicalization) + - [Transitivity of equality](#transitivity-of-equality) + - [Coherence](#coherence) +- [Background](#background) +- [Proposal](#proposal) +- [Details](#details) + - [Rewrite constraints](#rewrite-constraints) + - [Combining constraints with `&`](#combining-constraints-with-) + - [Same type constraints](#same-type-constraints) + - [Observe declarations](#observe-declarations) + - [Implementing an interface](#implementing-an-interface) + - [Ergonomics](#ergonomics) +- [Rationale](#rationale) +- [Alternatives considered](#alternatives-considered) + + + +## Abstract + +Split the `where A == B` constraint in two: `where .A = B` produces a new +constraint from an old one, where the value of `.A` in the new constraint is +known and eagerly rewritten to `B`, and `where A == B`, which does not cause `A` +and `B` to be considered as identical by language rules but does permit implicit +(no-op) conversion between them. + +This aims to provide an efficiently-computable and human-understandable type +equality rule, with type canonicalization and therefore transitive type +equality, without sacrificing too much in the way of ergonomics and without +sacrificing determinism, while still providing the full power of a general type +constraint system in a less ergonomic form. + +## Problem + +### Equal types with different interfaces + +When an associated type is constrained to be a concrete type, it is desirable +for the associated type to behave like that concrete type: + +``` +interface Hashable { + fn Hash[me: Self]() -> u128; +} +interface X { + let R:! Hashable; + fn Make() -> R; +} +fn F[T:! X where .R == i32]() -> i32 { + return F.Make() + 1; +} +``` + +Here, we want to be able to use the result of `F.Make()` in all regards like an +`i32`. Assuming an `i32.Abs` function exists, and `i32` implements `Hashable` +externally, it is more desirable for `F.Make().Abs()` to work than it is for +`F.Make().Hash()` to work. + +We are currently aiming to address this by saying that when two type expressions +are constrained to be equal, a value whose type is either of those type +expressions provides the (disjoint) union of their interfaces. This leads to a +surprise: + +``` +fn G[T:! X where .R == i32]() -> u128 { + let n: i32 = 1; + let m: i64 = 2; + return n.Hash() ^ m.Hash(); +} +``` + +Here, the call to `n.Hash()` is valid but the call to `m.Hash()` is invalid, +because the type `i32` of `n` is constrained to be equal to `T.R` which is of +type `Hashable`, but the type `i64` is not constrained in that way and does not +implement `Hashable` internally. + +This suggests that there might be two different behaviors that we want when +adding a constraint: either we are changing the interface, or not. This is +analogous to internal versus external implementation of an interface. + +### Type canonicalization + +There are a variety of contexts in which types are compared for equality. In +some of those contexts, it might be acceptable to perform a search and a +computation, and it might be acceptable to have false negatives. For example, +when analyzing an argument being passed to a function in a generic, it might be +OK to say "we can't prove that the argument has a type that is equal to (or +convertible to) the type of the parameter", so long as the developer has some +way to demonstrate that the types are the same if that is indeed the case. + +However, not all comparisons of generic types have that property. As an extreme +and hopefully avoidable example, suppose a C++-like linkage model is used, with +name mangling. We may need to determine whether two type expressions are equal +at link time, by reducing them both to strings that are guaranteed to be +identical if the types are equal and non-identical otherwise. This requires some +form of type canonicalization: reducing all equal type expressions to some +common representation where we can faithfully perform equality comparisons. + +Even if the semantics of Carbon don't require canonicalization, it is still a +very useful property for implementation purposes. For example, if type +canonicalization is possible to perform efficiently, building a data structure +mapping from a type to some value becomes much simpler, and if the +representation of a type expression carries its canonical form then type +equality comparisons can be done in constant time. + +### Transitivity of equality + +For ergonomic purposes, it is valuable for the notion of type expression +equality to be transitive: if type A is known to be the same as type B, and type +B is known to be the same as type C, then shouldn't we be able to use an A where +a C is required? See +[#1369](https://github.com/carbon-language/carbon-lang/issues/1369) for an +explanation of why this is important. + +However, if we permit arbitrary constraints and have a transitive equality rule, +then type equality is isomorphic to the +[word problem for finitely-presented monoids](https://en.wikipedia.org/wiki/Word_problem_for_groups), +which is undecidable, and certainly not computable, in general. + +Therefore we must impose a restriction somewhere: either not all constraints are +permissible, or we do not have transitivity, or we impose some other constraint +on the language to avoid the hard cases. Some languages merely impose some kind +of depth limit on their search for a proof of equality. + +Note that if we can perform type canonicalization, then type equality is +necessarily transitive: if all equal _pairs_ of types canonicalize to the same +representation, then _all_ equal types canonicalize to the same representation +transitively. + +### Coherence + +In order to avoid surprises in the language behavior, we want Carbon to have the +property that learning more about the type of a value should not cause the +behavior of an expression to change from one valid behavior to another valid +behavior. + +It's important to distinguish here between learning more about a type, such as +learning that it implements a particular interface – for example, from an added +import – and changing the type, such as by writing a different expression after +a `:` or `:!`. Changing the type of a value is expected to be able to change the +behavior of an expression using that value. + +## Background + +TODO: Lots of past generics work to discuss here. + +## Proposal + +Replace the existing `where A == B` constraint with two different kinds of +constraint: + +- `where .A = T` specifies a _rewrite_ constraint. If the associated constant + `.A` is named as a member of a type with this constraint, it is rewritten to + `T`. This means that the interface of the type changes, and the behavior is + in all respects as if `T` had been specified directly. +- `where X == Y` specifies a _same-type_ constraint. This is treated much like + `where X is SameAs(Y)`. Here, `SameAs` is a hypothetical constraint that is + reflexive (`T is SameAs(T)` for all `T`), commutative (`T is SameAs(U)` if + and only if `U is SameAs(T)`), and not implementable by the developer. + Moreover, if `F` is any pure type function, `T is SameAs(U)` implies + `F(T) is SameAs(F(U)`). `observe` statements can be used to form transitive + `SameAs` relations. Same-type constraints are not used automatically by the + language rules for any purpose, but there is an `ImplicitAs(T)` impl for + types implementing `SameAs(T)`. + +A type implements `C where .A = T` if and only if it implements +`C where .A == T` – the inhabitants of these two type-of-types are the same, but +they are different type-of-types and provide different interfaces for +corresponding type values. + +## Details + +### Rewrite constraints + +In a rewrite constraint, the left-hand operand of `=` must be the dotted name of +an associated constant. This notation is permitted anywhere a constraint can be +written, and results in a new constraint with a different interface: the named +member effectively no longer names an associated constant of the constrained +type, and is instead treated as a rewrite rule the expands to the right-hand +side of the constraint, with any mentioned parameters substituted into that +type. + +``` +interface Container { + let Element:! Type; + let Slice:! Container where .Element = Element; +} +// `T.Slice.Element` rewritten to `T.Element` because type of `T.Slice` says `.Element = Element`. +// `T.Element` rewritten to `i32` because type of `T` says `.Element = i32`. +fn A[T:! Container where .Element = i32](x: T.Slice.Element) {} +``` + +Rewrites aren't performed on the left-hand side of such an `=`, so +`where .A = .B and .A = C` is not rewritten to `where .A = .B and .B = C`. +Instead, such a `where` clause is invalid unless each rule for `.A` specifies +the same rewrite. + +When an `=` appears within an interface, it is possible to form a cycle where a +type becomes more complex with each iteration. In order to avoid this problem, +if, when substituting into the right-hand side of such an `=`, the same rewrite +is performed again, the program is invalid. + +``` +interface Container { + let Element:! Type; + // (1), bad constraint + let Slice:! Container where .Element = .Slice.Element; +} +// Error, `T.Slice.Element` rewritten to +// `T.Slice.Slice.Element` by (1), then to +// `T.Slice.Slice.Slice.Element` by (1) +fn A[T:! Container](x: T.Slice.Element) {} +``` + +It is valid for the same rewrite rule to be used multiple times within a type +expression, so long as it is not used during substitution into its own +right-hand side: + +``` +interface BInterface { + let C:! Type; +} +interface AInterface { + let D:! AInterface; + let B:! BInterface where .C = Self.D; +} +// `A.B.C` is rewritten to `Self.D`, because the type of `A.B` says `.C = Self.D`. +// `Self`=`A` is substituted into this giving `A.D`. +// During this substitution, the same rewrite rule is not used again, +// so this is OK. +// Then `A.D.B.C` is rewritten to `Self.D`, because the type of `A.D.B` says +// `.C = Self.D`. +// `Self`=`A.D` is substituted into this giving `A.D.D`. +// During this substitution, the same rewrite rule is not used again, +// so this is also OK, even though the same rule was used twice +// in forming the overall type. +fn F(A:! AInterface, d: A.B.C.B.C) {} +``` + +Rewrite constraints thereby give us a deterministic, terminating type +canonicalization mechanism for associated constants: in `A.B`, if the type of +`A` specifies that `.B == C`, then `A.B` is replaced by `C`. Equality of types +constrained in this way is transitive. + +Note that given `T:! C where .R = i32` can result in a type `T.R` whose behavior +is different from the behavior of `T.R` given `T:! C`. For example, member +lookup into `T.R` can find different results and operations can therefore have +different behavior. However, this does not violate coherence because `C` and +`C where .R = i32` don't differ by merely having more type information; rather, +they are different types that have an isomorphic set of values, somewhat like +`i32` and `u32`. An `=` constraint is not merely learning a new fact about a +type, it is requesting different behavior. + +### Combining constraints with `&` + +Suppose we have `X = C where .R = A` and `Y = C where .R = B`. What should +`C & X` produce? What should `X & Y` produce? + +We could perhaps say that `X & Y` results in a constraint where the type of `R` +has the union of the interface of `A` and the interface of `B`, and that `C & X` +similarly results in a constraint where the type of `R` has the union of the +interface of `A` and the interface originally specified by `C`. However, this +proposal suggests a simpler rule: + +- Combining two rewrite rules with different rewrite targets results in a + constraint where the associated constant is ambiguous. Given `T:! X & Y`, + the type expression `T.R` is ambiguous between a rewrite to `A` and a + rewrite to `B`. But given `T:! X & X`, `T.R` is unambiguously rewritten to + `A`. +- Combining a constraint with a rewrite rule with a constraint with no rewrite + rule preserves the rewrite rule. For example, supposing that + `interface Container` extends `interface Iterable`, and `Iterable` has an + associated constant `Element`, the constraint + `Container & (Iterable where .Element = i32)` is the same as the constraint + `(Container & Iterable) where .Element = i32` which is the same as the + constraint `Container where .Element = i32`. + +### Same type constraints + +A same-type constraint describes that two type expressions are known to evaluate +to the same value. Unlike a rewrite constraint, however, the two type +expressions are treated as distinct types when type-checking a generic that +refers to them. + +Same-type constraints are brought into scope, looked up, and resolved exactly as +if there were a `SameAs(U:! Type)` interface and a `T == U` impl corresponded to +`T is SameAs(U)`, except that `==` is commutative. As such, it's not possible to +ask for a list of types that are the same as a given type, nor to ask whether +there exists a type that is the same as a given type and has some property. But +it is possible to ask whether two types are (non-transitively) the same. + +In order for same-type constraints to be useful, they must allow the two types +to be treated as actually being the same in some context. This can be +accomplished by the use of `==` constraints in an `impl`, such as in the +built-in implementation of `ImplicitAs`: + +``` +final impl forall [T:! Type, U:! Type where .Self == T] T as ImplicitAs(U) { + @Carbon.Builtin("implicit_as_same_type") + fn Convert[me: Self](other: U) -> U; +} +``` + +It superficially seems like it would be convenient if such implementations were +made available implicitly – for example, by writing +`impl forall [T:! Type] T as ImplicitAs(T)` – but in more complex examples that +turns out to be problematic. Consider: + +``` +interface CommonTypeWith(U:! Type) { + let Result:! Type; +} +final impl forall [T:! Type] T as CommonTypeWith(T) where .Result = T {} + +fn F[T:! Potato, U:! Hashable where .Self == T](x: T, y: U) -> auto { + // What is T.CommonTypeWith(U).Result? Is it T or U? + return (if cond then x else y).Hash(); +} +``` + +With this proposal, `impl` validation for `T as CommonTypeWith(U)` fails: we +cannot pick a common type when given two distinct type expressions, even if we +know they evaluate to the same type, because we would not know which API the +result should have. + +### Observe declarations + +Same-type constraints are non-transitive, just like other constraints such as +`ImplicitAs`. The developer can use an `observe` declaration to bring a new +same-type constraint into scope: + +``` +observe A == B == C; +``` + +notionally does much the same thing as + +``` +impl A as SameAs(C) { ... } +``` + +where the `impl` makes use of `A is SameAs(B)` and `B is SameAs(C)`. + +### Implementing an interface + +When implementing an interface, the `=` syntax must be used, and each associated +constant without a default must be explicitly assigned-to: + +``` +impl IntVec as Container where .Element = i32 { ... } +``` + +not + +``` +impl IntVec as Container where .Element == i32 { ... } +``` + +The latter would be treated as + +``` +impl IntVec as Container where IntVec.Element is SameAs(i32) { ... } +``` + +... which only checks the value of `Element` rather than specifying it. + +As in other contexts, `Self.Element` is rewritten to `i32` within the context of +the `impl`. + +### Ergonomics + +This proposal can be viewed as dividing type constraints into two categories: + +- Ergonomic, but quite restricted, constraints using `=`. +- Non-ergonomic, but fully general, constraints using `==`. + +In order for this to be an effective and ergonomic strategy, we require both +that the difference between these two kinds of constraints are readily +understood by developers, and that the more-ergonomic `=` constraints cover +sufficiently many scenarios that the developer seldom needs to write code to +request a conversion between types that are known to be the same. + +TODO: Case studies? + +## Rationale + +TODO: How does this proposal effectively advance Carbon's goals? Rather than +re-stating the full motivation, this should connect that motivation back to +Carbon's stated goals and principles. This may evolve during review. Use links +to appropriate sections of [`/docs/project/goals.md`](/docs/project/goals.md), +and/or to documents in [`/docs/project/principles`](/docs/project/principles). +For example: + +- [Community and culture](/docs/project/goals.md#community-and-culture) +- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem) +- [Performance-critical software](/docs/project/goals.md#performance-critical-software) +- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) +- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) +- [Practical safety and testing mechanisms](/docs/project/goals.md#practical-safety-and-testing-mechanisms) +- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development) +- [Modern OS platforms, hardware architectures, and environments](/docs/project/goals.md#modern-os-platforms-hardware-architectures-and-environments) +- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code) + +## Alternatives considered + +TODO: Primarily compare against status quo. From 6b83179b1b9364043bd6a44341601a08437eb96c Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 13 Sep 2022 14:34:55 -0700 Subject: [PATCH 02/19] Fill in remaining sections. --- proposals/p2173.md | 110 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 89 insertions(+), 21 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index f5568f3510f95..175a4dfc2b6c2 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -29,6 +29,9 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Ergonomics](#ergonomics) - [Rationale](#rationale) - [Alternatives considered](#alternatives-considered) + - [Status quo](#status-quo) + - [Restrict constraints to allow computable type equality](#restrict-constraints-to-allow-computable-type-equality) + - [Different names for same-type constraint](#different-names-for-same-type-constraint) @@ -71,10 +74,11 @@ Here, we want to be able to use the result of `F.Make()` in all regards like an externally, it is more desirable for `F.Make().Abs()` to work than it is for `F.Make().Hash()` to work. -We are currently aiming to address this by saying that when two type expressions -are constrained to be equal, a value whose type is either of those type -expressions provides the (disjoint) union of their interfaces. This leads to a -surprise: +We are +[currently aiming to address this](https://github.com/carbon-language/carbon-lang/pull/2070) +by saying that when two type expressions are constrained to be equal, a value +whose type is either of those type expressions provides the (disjoint) union of +their interfaces. This leads to a surprise: ``` fn G[T:! X where .R == i32]() -> u128 { @@ -157,7 +161,19 @@ behavior of an expression using that value. ## Background -TODO: Lots of past generics work to discuss here. +There has been a lot of work in this space. Some previous relevant proposals: + +- [#731: generics details 2](https://github.com/carbon-language/carbon-lang/pull/731) + introduced associated types, with the intent that a mechanism to constrain + their value would be provided in a later proposal. +- [#818: generics details 3](https://github.com/carbon-language/carbon-lang/pull/818) + introduced `where` constraints for generics, with an open question for + whether and how we might restrict constraints in order to give a decidable + type equality rule. #818 addressed that question by using a non-transitive + rule with `observe` declarations to close the transitivity gaps. +- [#2070: always `==` not `=` in `where` clauses](https://github.com/carbon-language/carbon-lang/pull/2070) + describes our previous approach to the "equal types with different + interfaces" problem. ## Proposal @@ -395,27 +411,79 @@ understood by developers, and that the more-ergonomic `=` constraints cover sufficiently many scenarios that the developer seldom needs to write code to request a conversion between types that are known to be the same. -TODO: Case studies? - ## Rationale -TODO: How does this proposal effectively advance Carbon's goals? Rather than -re-stating the full motivation, this should connect that motivation back to -Carbon's stated goals and principles. This may evolve during review. Use links -to appropriate sections of [`/docs/project/goals.md`](/docs/project/goals.md), -and/or to documents in [`/docs/project/principles`](/docs/project/principles). -For example: - -- [Community and culture](/docs/project/goals.md#community-and-culture) - [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem) -- [Performance-critical software](/docs/project/goals.md#performance-critical-software) + - This rule is easy to implement, and even a naive implementation should + be efficient. - [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) + - The status quo causes coincidentally-equal types to have the same + interface. Under this proposal, the interface of a type is not affected + by coincidental equality, only by intentional assignment to an + associated type. - [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) -- [Practical safety and testing mechanisms](/docs/project/goals.md#practical-safety-and-testing-mechanisms) -- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development) -- [Modern OS platforms, hardware architectures, and environments](/docs/project/goals.md#modern-os-platforms-hardware-architectures-and-environments) -- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code) + - Using `=` rather than `==` in `impl`s is easier to read and write. + - The interface available on a type is more predictable in this proposal + than in the status quo ante, making code easier to understand. ## Alternatives considered -TODO: Primarily compare against status quo. +### Status quo + +The [Problem](#problem) section describes some concerns with the status quo +approach. This proposal aims to address those concerns. + +### Restrict constraints to allow computable type equality + +We could somehow restrict which constraints can be specified, in order to ensure +that our type equality rule is efficiently computable, perhaps even in a way +that permits a deterministic computation of a canonical type. The exact details +of how we would do this have not been worked out, but this might lead to a +coherent rule that has transitive type equality. However, this would not solve +the "equal types with different interfaces" problem. + +### Different names for same-type constraint + +We considered various options for the spelling of a same-type constraint: + +- `where A == B` +- `where A ~ B` +- some other operator symbol +- `where A is SameType(B)`, or some other constraint name + +The advantage of `==` is that it has a lot of desirable implications: it's +familiar, symmetric, and connotes the left and right operand being the same. +However, it also might be taken as suggesting that the `==` overloaded operator +is used to determine equality. A different spelling would also indicate that +this is an unusual constraint, and would be easier to search for. Concerns were +also raised that `==` might suggest transitivity if taken to mean "same value" +rather than something like the Carbon `==` binary operator, and the transitivity +of `==` constraints is not automatic. + +The `~` operator is currently not in use as a binary operator. However, that +makes it quite valuable syntactic real estate, and it may see little use in this +context in practice. A longer operator could be used, but any choice other than +`==` will present an unfamiliarity barrier, and longer operators will be harder +to remember. + +A named constraint is appealing, but this operator does not behave like a named +constraint in that it is automatically symmetrized, not implementable, and +especially in its interactions with `observe` declarations. It is unclear how +`observe` declarations would be written with a named `SameType` constraint: + +``` +// Maybe something like this special-case and verbose syntax? +observe A is SameType(B) and B is SameType(C) => A is SameType(C); +``` + +The fundamental connection between same-type constraints and `observe` suggests +that dedicated syntax is justified. + +For now, we use `==`, despite it having a different meaning in the context of a +constraint than in expression contexts. This is analogous to how `=` and `and` +have different meanings in this context from in expressions, so the behavior +divergence is at least consistent. It's also not clear at this stage how much +usage same-type constraints will have, compared to rewrite constraints, so it's +hard to judge the other arguments in favor of and against other operators. It +would be reasonable to revisit this decision when we have more usage +information. From 5de17e2c53a3b351d610fe733ac9a0b55c6aa6d6 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 11:01:14 -0700 Subject: [PATCH 03/19] Fix typo. Co-authored-by: josh11b --- proposals/p2173.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 175a4dfc2b6c2..26f4c44e1804c 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -207,7 +207,7 @@ In a rewrite constraint, the left-hand operand of `=` must be the dotted name of an associated constant. This notation is permitted anywhere a constraint can be written, and results in a new constraint with a different interface: the named member effectively no longer names an associated constant of the constrained -type, and is instead treated as a rewrite rule the expands to the right-hand +type, and is instead treated as a rewrite rule that expands to the right-hand side of the constraint, with any mentioned parameters substituted into that type. From 591a1af5ef26f6300e2eada5d593ae0bc61bb88d Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 15:21:42 -0700 Subject: [PATCH 04/19] Apply suggestions from code review --- proposals/p2173.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 26f4c44e1804c..5d7bf0032a3b5 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -203,8 +203,9 @@ corresponding type values. ### Rewrite constraints -In a rewrite constraint, the left-hand operand of `=` must be the dotted name of -an associated constant. This notation is permitted anywhere a constraint can be +In a rewrite constraint, the left-hand operand of `=` must be a `.` followed by +the name of an associated constant. `.Self` is not permitted. +This notation is permitted anywhere a constraint can be written, and results in a new constraint with a different interface: the named member effectively no longer names an associated constant of the constrained type, and is instead treated as a rewrite rule that expands to the right-hand @@ -216,8 +217,10 @@ interface Container { let Element:! Type; let Slice:! Container where .Element = Element; } -// `T.Slice.Element` rewritten to `T.Element` because type of `T.Slice` says `.Element = Element`. -// `T.Element` rewritten to `i32` because type of `T` says `.Element = i32`. +// `T.Slice.Element` rewritten to `T.Element` +// because type of `T.Slice` says `.Element = Element`. +// `T.Element` rewritten to `i32` +// because type of `T` says `.Element = i32`. fn A[T:! Container where .Element = i32](x: T.Slice.Element) {} ``` From a7cac91534342032406d90dd043e44a84d656f8e Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 18:50:36 -0700 Subject: [PATCH 05/19] Expand and add examples. --- proposals/p2173.md | 146 +++++++++++++++++++++++++++++++++------------ 1 file changed, 109 insertions(+), 37 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 5d7bf0032a3b5..c97855b9024d6 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -169,11 +169,23 @@ There has been a lot of work in this space. Some previous relevant proposals: - [#818: generics details 3](https://github.com/carbon-language/carbon-lang/pull/818) introduced `where` constraints for generics, with an open question for whether and how we might restrict constraints in order to give a decidable - type equality rule. #818 addressed that question by using a non-transitive - rule with `observe` declarations to close the transitivity gaps. + type equality rule. + - #818 represents the status quo of approved proposals. It has both `=` + and `==` where constraints, which are similar to, but somewhat + different from, those in this proposal. + - `=` constraints require a concrete type on their right-hand side. The + left-hand side is not restricted. The type-of-type of the left-hand + side is changed to match the right-hand side. + - `==` constraints do not affect the type-of-type of either operand. + - As an ansewr to the question of how to produce a decidable type + equality rule, both kinds of constraints are applied automatically, but + only at a depth of one constraint. Neither form is transitive but both + can be transitively extended using `observe` declarations. - [#2070: always `==` not `=` in `where` clauses](https://github.com/carbon-language/carbon-lang/pull/2070) - describes our previous approach to the "equal types with different - interfaces" problem. + describes an attempt to unify the syntax and semantics of `=` and `==` + constraints, under which we would always merge the type-of-types of the + operands. That proposal has not yet been accepted, and this proposal + intends to supersede it. ## Proposal @@ -194,10 +206,10 @@ constraint: language rules for any purpose, but there is an `ImplicitAs(T)` impl for types implementing `SameAs(T)`. -A type implements `C where .A = T` if and only if it implements -`C where .A == T` – the inhabitants of these two type-of-types are the same, but -they are different type-of-types and provide different interfaces for -corresponding type values. +A type implements `C where .A = T` if and only if it implements `C where .A == +T`, assuming both constraints are valid – the inhabitants of these two +type-of-types are the same, but they are different type-of-types and provide +different interfaces for corresponding type values. ## Details @@ -205,23 +217,45 @@ corresponding type values. In a rewrite constraint, the left-hand operand of `=` must be a `.` followed by the name of an associated constant. `.Self` is not permitted. -This notation is permitted anywhere a constraint can be -written, and results in a new constraint with a different interface: the named -member effectively no longer names an associated constant of the constrained -type, and is instead treated as a rewrite rule that expands to the right-hand -side of the constraint, with any mentioned parameters substituted into that -type. + +``` +interface RewriteSelf { + // ❌ Error: `.Self` is not the name of an associated constant. + let Me:! Type where .Self = Self; +} +interface HasAssoc { + let Assoc:! Type; +} +interface RewriteSingleLevel { + // ✅ Argument `y` has the same type `T.Element` as parameter `x`. + let A:! HasAssoc where .Assoc = i32; +} +interface RewriteMultiLevel { + // ❌ Error: Only one level of associated constant is permitted. + let B:! RewriteSingleLevel where .A.Assoc = i32; +} +``` + +This notation is permitted anywhere a constraint can be written, and results in +a new constraint with a different interface: the named member effectively no +longer names an associated constant of the constrained type, and is instead +treated as a rewrite rule the expands to the right-hand side of the constraint, +with any mentioned parameters substituted into that type. ``` interface Container { let Element:! Type; let Slice:! Container where .Element = Element; + fn Add[addr me: Self*](x: Element); } // `T.Slice.Element` rewritten to `T.Element` // because type of `T.Slice` says `.Element = Element`. // `T.Element` rewritten to `i32` // because type of `T` says `.Element = i32`. -fn A[T:! Container where .Element = i32](x: T.Slice.Element) {} +fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) { + // ✅ Argument `y` has the same type `T.Element` as parameter `x`. + p->Add(y); +} ``` Rewrites aren't performed on the left-hand side of such an `=`, so @@ -237,38 +271,76 @@ is performed again, the program is invalid. ``` interface Container { let Element:! Type; - // (1), bad constraint - let Slice:! Container where .Element = .Slice.Element; + let Slice:! Container where .Element = Element; +} +// ❌ Error: `.Slice.Element` rewritten to `.Element` by (1), +// resulting in `where .Element = .Element`. +// `T.Element` is then rewritten to `T.Element`, +// to which the same rewrite applies again. +fn Bad[T:! Container where .Element = .Slice.Element](x: T.Element) {} +``` + +``` +interface Helper { + let D:! Type; } -// Error, `T.Slice.Element` rewritten to -// `T.Slice.Slice.Element` by (1), then to -// `T.Slice.Slice.Slice.Element` by (1) -fn A[T:! Container](x: T.Slice.Element) {} +interface Example { + let B:! Type; + let C:! Helper where .D = B; +} +// ✅ `where .D = ...` by itself is fine. +// `T.C.D` is rewritten to `T.B`. +fn Allowed(T:! Example, x: T.C.D); +// ❌ But combined with another rewrite, creates an infinite loop. +// `.C.D` is rewritten to `.B`, resulting in `where .B = .B`. +// Then `T.C.D` is rewritten to `T.B`, which is rewritten to itself repeatedly. +// Using `==` instead of `=` would make this constraint redundant, +// rather than it being an error. +fn Error(T:! Example where .B = .C.D, x: T.C.D); ``` +To determine whether two rewrite constraints are the same, each such constraint +tracks the location of its `=` token and an optional location of a `:!` +binding. When a constraint is used in a `:!` binding, its rewrite constraints +are replaced with ones that track the location of that `:!` binding. Two +rewrite constraints are the same if they track the same `=` location and the +same optional `:!` location. + +> TODO: Alternative rule: if the rewrite constraint would rewrite its own +> right-hand side, then reject. So far we have neither a proof that this is +> sufficient to ensure termination nor a counterexample. + It is valid for the same rewrite rule to be used multiple times within a type expression, so long as it is not used during substitution into its own right-hand side: ``` -interface BInterface { - let C:! Type; +interface Allowed { + let A:! Allowed where .A = .Self; +} +fn F(T:! Allowed, x: T.A.A.A); +// ✅ Allowed, since rewrite is only applied once per sub-expression +// In `(T.A.A).A`, the inner `T.A.A` is rewritten to `T.A`, +// resulting in `(T.A).A`, which is then rewritten to `T.A`. +``` + +This rule does disallow some cases which would terminate: + +``` +interface FalsePositive; +constraint ForwardDeclaredConstraint(X:! Error3); +interface FalsePositive { + let X:! FalsePositive; + // Means `Y:! FalsePositive where .X = X.Y` + let Y:! ForwardDeclaredConstraint(X); } -interface AInterface { - let D:! AInterface; - let B:! BInterface where .C = Self.D; +constraint ForwardDeclaredConstraint(X:! FalsePositive) { + extends FalsePositive where .X = X.Y; } -// `A.B.C` is rewritten to `Self.D`, because the type of `A.B` says `.C = Self.D`. -// `Self`=`A` is substituted into this giving `A.D`. -// During this substitution, the same rewrite rule is not used again, -// so this is OK. -// Then `A.D.B.C` is rewritten to `Self.D`, because the type of `A.D.B` says -// `.C = Self.D`. -// `Self`=`A.D` is substituted into this giving `A.D.D`. -// During this substitution, the same rewrite rule is not used again, -// so this is also OK, even though the same rule was used twice -// in forming the overall type. -fn F(A:! AInterface, d: A.B.C.B.C) {} +// T.Y.Y.X -> T.Y.X.Y -> T.X.Y.Y +// ❌ Error: applies the same rewrite twice, even +// though there is no infinite loop. +fn F4(T:! FalsePositive, x: T.Y.Y.X); ``` Rewrite constraints thereby give us a deterministic, terminating type From 5854d0b6c6201110745ef068eb689d16dfca3143 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 19:10:35 -0700 Subject: [PATCH 06/19] Add some more considered alternatives. --- proposals/p2173.md | 50 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/proposals/p2173.md b/proposals/p2173.md index c97855b9024d6..895fccf91b4e3 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -517,6 +517,56 @@ of how we would do this have not been worked out, but this might lead to a coherent rule that has transitive type equality. However, this would not solve the "equal types with different interfaces" problem. +### Find a fully transitive approach to type equality + +This proposal makes type equality transitive, but still has non-transitive +"same-type" constraints describing types that are known to always eventual be +the same after substitution, but which are not treated as the same type value +while type-checking a generic. We could seek a type equality rule that would +avoid having these two different kinds of equality. + +One candidate approach is to accept that the full generality of the equality +problem is +[hard](https://link.springer.com/content/pdf/10.1007/3-540-17220-3_7.pdf), but +attempt to solve it anyway. Because of the close correspondence between Turing +machines and string rewriting systems, this results in a type system with no +upper bound on its runtime. We consider this unacceptable for Carbon. Swift +handles this problem by placing a limit on the complexity of cases they will +accept, but that seems undesirable too, as the rule cannot be readily +understood by a developer nor effectively worked around. + +Another candidate approach is to find a way to reduce the inputs to the +type-checking step as a set of non-quantified equalities. The resulting system +of equalities can then be +[solved efficiently](https://dl.acm.org/doi/pdf/10.1145/322186.322198) to +determine whether two types are known to be the same. This approach appears +sufficient to cope with locally-declared constraints, but when constraints +appear within interfaces, they can give rise to an infinite family of +equalities, and picking any finite subset risks the result being non-transitive +in general, if a different subset of equalities might be considered in a +different type-checking context. + +### Use a different rule for "same rewrite" + +We considered various rules for determining if two rewrites are using the same +rule. The goal is to select a rule that ensures there are only finitely many +different rewrites in a program, so that any non-termination will always be +detected, and to minimize the occurrence of false-positives. + +One candidate was to consider rewrites the same if they apply to the same +associated constant. However, it seems likely that the false positive rate for +such a rule would be unacceptably high. + +Another rule would be to consider rewrites the same if they correspond to the +same `=` token. This has the disadvantage that factoring out a common +subexpression can result in two constraints that were previously considered +different to now be considered the same, potentially breaking users. + +Connecting sameness to both the `=` token and the `:!` token is intended to +minimize the risk of refactorings introducing false positives, while retaining +a set of rewrite constraints that grows at most quadratically in the size of +the program, and in practice much slower than that. + ### Different names for same-type constraint We considered various options for the spelling of a same-type constraint: From 2855d574b0c157b79a030093d353a560e75c79b1 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 19:12:50 -0700 Subject: [PATCH 07/19] Presubmit. --- proposals/p2173.md | 50 ++++++++++++++++++++++++---------------------- 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 895fccf91b4e3..303a689ef508a 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -31,6 +31,8 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Alternatives considered](#alternatives-considered) - [Status quo](#status-quo) - [Restrict constraints to allow computable type equality](#restrict-constraints-to-allow-computable-type-equality) + - [Find a fully transitive approach to type equality](#find-a-fully-transitive-approach-to-type-equality) + - [Use a different rule for "same rewrite"](#use-a-different-rule-for-same-rewrite) - [Different names for same-type constraint](#different-names-for-same-type-constraint) @@ -171,21 +173,21 @@ There has been a lot of work in this space. Some previous relevant proposals: whether and how we might restrict constraints in order to give a decidable type equality rule. - #818 represents the status quo of approved proposals. It has both `=` - and `==` where constraints, which are similar to, but somewhat - different from, those in this proposal. + and `==` where constraints, which are similar to, but somewhat different + from, those in this proposal. - `=` constraints require a concrete type on their right-hand side. The - left-hand side is not restricted. The type-of-type of the left-hand - side is changed to match the right-hand side. + left-hand side is not restricted. The type-of-type of the left-hand side + is changed to match the right-hand side. - `==` constraints do not affect the type-of-type of either operand. - - As an ansewr to the question of how to produce a decidable type - equality rule, both kinds of constraints are applied automatically, but - only at a depth of one constraint. Neither form is transitive but both - can be transitively extended using `observe` declarations. + - As an ansewr to the question of how to produce a decidable type equality + rule, both kinds of constraints are applied automatically, but only at a + depth of one constraint. Neither form is transitive but both can be + transitively extended using `observe` declarations. - [#2070: always `==` not `=` in `where` clauses](https://github.com/carbon-language/carbon-lang/pull/2070) describes an attempt to unify the syntax and semantics of `=` and `==` constraints, under which we would always merge the type-of-types of the - operands. That proposal has not yet been accepted, and this proposal - intends to supersede it. + operands. That proposal has not yet been accepted, and this proposal intends + to supersede it. ## Proposal @@ -206,10 +208,10 @@ constraint: language rules for any purpose, but there is an `ImplicitAs(T)` impl for types implementing `SameAs(T)`. -A type implements `C where .A = T` if and only if it implements `C where .A == -T`, assuming both constraints are valid – the inhabitants of these two -type-of-types are the same, but they are different type-of-types and provide -different interfaces for corresponding type values. +A type implements `C where .A = T` if and only if it implements +`C where .A == T`, assuming both constraints are valid – the inhabitants of +these two type-of-types are the same, but they are different type-of-types and +provide different interfaces for corresponding type values. ## Details @@ -300,11 +302,11 @@ fn Error(T:! Example where .B = .C.D, x: T.C.D); ``` To determine whether two rewrite constraints are the same, each such constraint -tracks the location of its `=` token and an optional location of a `:!` -binding. When a constraint is used in a `:!` binding, its rewrite constraints -are replaced with ones that track the location of that `:!` binding. Two -rewrite constraints are the same if they track the same `=` location and the -same optional `:!` location. +tracks the location of its `=` token and an optional location of a `:!` binding. +When a constraint is used in a `:!` binding, its rewrite constraints are +replaced with ones that track the location of that `:!` binding. Two rewrite +constraints are the same if they track the same `=` location and the same +optional `:!` location. > TODO: Alternative rule: if the rewrite constraint would rewrite its own > right-hand side, then reject. So far we have neither a proof that this is @@ -532,8 +534,8 @@ attempt to solve it anyway. Because of the close correspondence between Turing machines and string rewriting systems, this results in a type system with no upper bound on its runtime. We consider this unacceptable for Carbon. Swift handles this problem by placing a limit on the complexity of cases they will -accept, but that seems undesirable too, as the rule cannot be readily -understood by a developer nor effectively worked around. +accept, but that seems undesirable too, as the rule cannot be readily understood +by a developer nor effectively worked around. Another candidate approach is to find a way to reduce the inputs to the type-checking step as a set of non-quantified equalities. The resulting system @@ -563,9 +565,9 @@ subexpression can result in two constraints that were previously considered different to now be considered the same, potentially breaking users. Connecting sameness to both the `=` token and the `:!` token is intended to -minimize the risk of refactorings introducing false positives, while retaining -a set of rewrite constraints that grows at most quadratically in the size of -the program, and in practice much slower than that. +minimize the risk of refactorings introducing false positives, while retaining a +set of rewrite constraints that grows at most quadratically in the size of the +program, and in practice much slower than that. ### Different names for same-type constraint From 1ff011e3709172b01dc97d8c6d388186e97f3bfb Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 21 Sep 2022 14:57:43 -0700 Subject: [PATCH 08/19] Respond to review comments. --- proposals/p2173.md | 37 +++++++++++++++++++++++++++---------- 1 file changed, 27 insertions(+), 10 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 303a689ef508a..a033ef7888635 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -179,7 +179,7 @@ There has been a lot of work in this space. Some previous relevant proposals: left-hand side is not restricted. The type-of-type of the left-hand side is changed to match the right-hand side. - `==` constraints do not affect the type-of-type of either operand. - - As an ansewr to the question of how to produce a decidable type equality + - As an answer to the question of how to produce a decidable type equality rule, both kinds of constraints are applied automatically, but only at a depth of one constraint. Neither form is transitive but both can be transitively extended using `observe` declarations. @@ -229,7 +229,7 @@ interface HasAssoc { let Assoc:! Type; } interface RewriteSingleLevel { - // ✅ Argument `y` has the same type `T.Element` as parameter `x`. + // ✅ Uses of `A.Assoc` will be rewritten to `i32`. let A:! HasAssoc where .Assoc = i32; } interface RewriteMultiLevel { @@ -255,7 +255,8 @@ interface Container { // `T.Element` rewritten to `i32` // because type of `T` says `.Element = i32`. fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) { - // ✅ Argument `y` has the same type `T.Element` as parameter `x`. + // ✅ Argument `y` has the same type `i32` as parameter `x` of + // `T.(Container.Add)`, which is also rewritten to `i32`. p->Add(y); } ``` @@ -330,7 +331,7 @@ This rule does disallow some cases which would terminate: ``` interface FalsePositive; -constraint ForwardDeclaredConstraint(X:! Error3); +constraint ForwardDeclaredConstraint(X:! FalsePositive); interface FalsePositive { let X:! FalsePositive; // Means `Y:! FalsePositive where .X = X.Y` @@ -524,18 +525,34 @@ the "equal types with different interfaces" problem. This proposal makes type equality transitive, but still has non-transitive "same-type" constraints describing types that are known to always eventual be the same after substitution, but which are not treated as the same type value -while type-checking a generic. We could seek a type equality rule that would -avoid having these two different kinds of equality. +while type-checking a generic. This proposal also imposes a directionality on +rewrites, a restriction to only rewrite at a depth of exactly one associated +constant, and a restriction that only one value can be specified as equal to a +given associated constant. We could seek a type equality rule that would avoid +having two different kinds of equality and would avoid some or all of the +restrictions placed on rewrite constraints. One candidate approach is to accept that the full generality of the equality problem is [hard](https://link.springer.com/content/pdf/10.1007/3-540-17220-3_7.pdf), but attempt to solve it anyway. Because of the close correspondence between Turing machines and string rewriting systems, this results in a type system with no -upper bound on its runtime. We consider this unacceptable for Carbon. Swift -handles this problem by placing a limit on the complexity of cases they will -accept, but that seems undesirable too, as the rule cannot be readily understood -by a developer nor effectively worked around. +upper bound on its runtime. We consider this unacceptable for Carbon, but it is +the approach taken by some other modern languages: + +- Swift has an + [undecidable type system](https://forums.swift.org/t/swift-type-checking-is-undecidable/39024) + due to this, and handles the problem by placing a limit on the complexity of + cases they will accept, but that seems unsuited to Carbon's goals, as the + rule cannot be readily understood by a developer nor effectively worked + around. +- Rust also has an + [undecidable type system (content warning: swear words)](https://sdleffler.github.io/RustTypeSystemTuringComplete/) + for the same reason, and similarly has a recursion limit. + +See also [Typing is Hard](https://3fx.ch/typing-is-hard.html), which lists +similar information for a variety of other languages. Note that most languages +listed have an undecidable type system. Another candidate approach is to find a way to reduce the inputs to the type-checking step as a set of non-quantified equalities. The resulting system From b039f41d468bf14f7ff1b81e6fefdde953904e1e Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 21 Sep 2022 17:46:09 -0700 Subject: [PATCH 09/19] Add precise rules and termination argument. Remove the "recursively uses the same rewrite" rule which appears to be unnecessary to guarantee termination. --- proposals/p2173.md | 648 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 573 insertions(+), 75 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index a033ef7888635..777e4f700d748 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -23,6 +23,13 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Details](#details) - [Rewrite constraints](#rewrite-constraints) - [Combining constraints with `&`](#combining-constraints-with-) + - [Combining constraints with `and`](#combining-constraints-with-and) + - [Precise rules and termination](#precise-rules-and-termination) + - [Qualified name lookup](#qualified-name-lookup) + - [Type substitution](#type-substitution) + - [Mutual and self recursion](#mutual-and-self-recursion) + - [Examples](#examples) + - [Termination](#termination) - [Same type constraints](#same-type-constraints) - [Observe declarations](#observe-declarations) - [Implementing an interface](#implementing-an-interface) @@ -34,6 +41,9 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Find a fully transitive approach to type equality](#find-a-fully-transitive-approach-to-type-equality) - [Use a different rule for "same rewrite"](#use-a-different-rule-for-same-rewrite) - [Different names for same-type constraint](#different-names-for-same-type-constraint) + - [Topological ordering for `=` in `and`](#topological-ordering-for--in-and) + - [Multi-constraint `where` clauses](#multi-constraint-where-clauses) + - [Forward references in `.Self` substitution](#forward-references-in-self-substitution) @@ -266,20 +276,422 @@ Rewrites aren't performed on the left-hand side of such an `=`, so Instead, such a `where` clause is invalid unless each rule for `.A` specifies the same rewrite. -When an `=` appears within an interface, it is possible to form a cycle where a -type becomes more complex with each iteration. In order to avoid this problem, -if, when substituting into the right-hand side of such an `=`, the same rewrite -is performed again, the program is invalid. +Note that given `T:! C where .R = i32` can result in a type `T.R` whose behavior +is different from the behavior of `T.R` given `T:! C`. For example, member +lookup into `T.R` can find different results and operations can therefore have +different behavior. However, this does not violate coherence because `C` and +`C where .R = i32` don't differ by merely having more type information; rather, +they are different types that have an isomorphic set of values, somewhat like +`i32` and `u32`. An `=` constraint is not merely learning a new fact about a +type, it is requesting different behavior. + +### Combining constraints with `&` + +Suppose we have `X = C where .R = A` and `Y = C where .R = B`. What should +`C & X` produce? What should `X & Y` produce? + +We could perhaps say that `X & Y` results in a constraint where the type of `R` +has the union of the interface of `A` and the interface of `B`, and that `C & X` +similarly results in a constraint where the type of `R` has the union of the +interface of `A` and the interface originally specified by `C`. However, this +proposal suggests a simpler rule: + +- Combining two rewrite rules with different rewrite targets results in a + constraint where the associated constant is ambiguous. Given `T:! X & Y`, + the type expression `T.R` is ambiguous between a rewrite to `A` and a + rewrite to `B`. But given `T:! X & X`, `T.R` is unambiguously rewritten to + `A`. +- Combining a constraint with a rewrite rule with a constraint with no rewrite + rule preserves the rewrite rule. For example, supposing that + `interface Container` extends `interface Iterable`, and `Iterable` has an + associated constant `Element`, the constraint + `Container & (Iterable where .Element = i32)` is the same as the constraint + `(Container & Iterable) where .Element = i32` which is the same as the + constraint `Container where .Element = i32`. + +### Combining constraints with `and` + +It's possible for one `=` constraint in a `where` to refer to another. When this +happens, the constraint `C where A and B` is interpreted as +`(C where A) where B`, so rewrites in `A` are applied to names in `B`, but +rewrites in `B` are not applied to names in `A`: + +``` +interface C { + let T:! Type; + let U:! Type; +} +// ✅ Same as `(C where .T = i32) where .U = .T`, which is +// the same as `(C where .T = i32) where .U = i32`, which is +// the same as `C where .T = i32 and .U = i32`. +fn F[A:! C where .T = i32 and .U = .T]() {} +``` + +If a constraint is formed where a rewrite would apply to the right-hand side of +another rewrite in the same constraint, but is not applied because it was not in +scope, the program is invalid. + +``` +// ❌ Rewritten member `.U` is mentioned in a context +// where its rewrite is not in scope. +fn BadG[A:! C where .T = .U and .U = i32]() {} +// ❌ Rewritten member `.U` is mentioned in a context +// where its rewrite is not in scope. +fn BadH[A:! C where .T = .U and .U = .T]() {} +// ❌ Rewritten member `.T` is mentioned in a context +// where its rewrite is not in scope. +fn BadI[A:! (C where .T = i32) & (C where .U = .T)]() {} +``` + +In more detail, what happens here is: + +1. The constraint is fully elaborated. `.X` is replaced with `.Self.X`. For + example, `C where .T = .U and .U = i32` is rewritten to + `C where .T = .Self.U and .U = i32`. +2. When the constraint is applied to a type `V`, the substitution `.Self = V` + is performed. The type of type used for `V` is + `C where .T = and .U = `. +3. During this substitution, if either `` rewrite rule would be used, + the result is an error. Substitution into `.Self.U` produces `V.U`, giving + an error. + +This is explained further [below](#mutual-and-self-recursion). + +### Precise rules and termination + +This section explains the detailed rules used to implement rewrites. There are +two properties we aim to satisfy: + +1. After type-checking, no symbolic references to associated constants that + have an associated rewrite rule remain. +2. Type-checking always terminates in a reasonable amount of time, ideally + linear in the size of the problem. + +Rewrites are applied in two different phases of program analysis. + +- During qualified name lookup and type checking for qualified member access, + if a rewritten member is looked up, the right-hand side's value and type are + used for subsequent checks. +- During substitution, if the symbolic name of an associated constant is + substituted into, and substitution into the left-hand side results in a + value with a rewrite for that constant, that rewrite is applied. + +In each case, we always rewrite to a value that satisfies property 1 above, and +these two steps are the only places where we might form a symbolic reference to +an associated cosntant, so property 1 is recursively satisfied. Moreover, we +apply only one rewrite in each of the above cases, satisfying property 2. + +#### Qualified name lookup + +Qualified name lookup into either a type parameter or into an expression whose +type is a symbolic type `T` -- either a type parameter or an associated type -- +considers names from the constraint `C` on `T`, that is, from `T`’s declared +type. + +``` +interface C { + let M:! i32; + let U:! C; +} +fn F[T:! C](x: T) { + // Value is C.M in all four of these + let a: i32 = x.M; + let b: i32 = T.M; + let c: i32 = x.U.M; + let d: i32 = T.U.M; +} +``` + +When looking up the name `N`, if `C` is an interface `I` and `N` is the name of +an associated constant in that interface, the result is a symbolic value +representing "the member `N` of `I`". If `C` is formed by combining interfaces +with `&`, all such results are required to find the same associated constant, +otherwise we reject for ambiguity. + +If a member of a class or interface is named in a qualified name lookup, the +type of the result is determined by performing a substitution. For an interface, +`Self` is substituted for the self type, and any parameters for that class or +interface (and enclosing classes or interfaces, if any) are substituted into the +declared type. + +``` +interface SelfIface { + fn Get[me: Self]() -> Self; +} +class UsesSelf(T:! Type) { + // Equivalent to `fn Make() -> UsesSelf(T)*;` + fn Make() -> Self*; + impl as SelfIface; +} + +// ✅ `T = i32` is substituted into the type of `UsesSelf(T).Make`, +// so the type of `UsesSelf(i32).Make` is `fn () -> UsesSelf(i32)*`. +let x: UsesSelf(i32)* = UsesSelf(i32).Make(); + +// ✅ `Self = UsesSelf(i32)` is substituted into the type +// of `SelfIface.Get`, so the type of `UsesSelf(i32).(SelfIface.Get)` +// is `fn [me: UsesSelf(i32)]() -> UsesSelf(i32)`. +let y: UsesSelf(i32) = x->Get(); +``` + +None of this is new in this proposal. This proposal adds the rule: + +If a constraint `C` into which lookup is performed includes a `where` clause +saying `.N = U`, and the result of qualified name lookup is the associated +constant `N`, that result is replaced by `U`, and the type of the result is the +type of `U`. No substitution is performed in this step, not even a `Self` +substitution -- any necessary substitutions were already performed when forming +the constraint `C`, and we don’t consider either the declared type or value of +the associated constraint at all for this kind of constraint. Going through an +example in detail: + +``` +interface A { + let T:! Type; +} +interface B { + let U:! Type; + // More explicitly, this is of type `A where .(A.T) = Self.(B.U)` + let V:! A where .T = U; +} +// Type of T is B. +fn F[T:! B](x: T) { + // The type of the expression `T` is `B`. + // `T.V` finds `B.V` with type `A where .(A.T) = Self.(B.U)`. + // We substitute `Self` = `T` giving the type of `u` as + // `A where .(A.T) = T.(B.U)`. + let u:! auto = T.V; + // The type of `u` is `A where .(A.T) = T.(B.U)`. + // Lookup for `u.T` resolves it to `u.(A.T)`. + // So the result of the qualified member access is `T.(B.U)`, + // and the type of `v` is the type of `T.(B.U)`, namely `Type`. + // No substitution is performed in this step. + let v:! auto = u.T; +} +``` + +The more complex case of + +``` +fn F2[T:! B where .U = i32](x: T); +``` + +is discussed later. + +#### Type substitution + +At various points during the type-checking of a Carbon program, we need to +substitute a set of (binding, value) pairs into a symbolic value. We saw an +example above: substituting `Self = T` into the type `A where .(A.T) = Self.U` +to produce the value `A where .(A.T) = T.U`. Another important case is the +substitution of inferred parameter values into the type of a function when +type-checking a function call: + +``` +fn F[T:! C](x: T) -> T; +fn G(n: i32) -> i32 { + // Deduces T = i32, which is substituted + // into the type `fn (x: T) -> T` to produce + // `fn (x: i32) -> i32`, giving `i32` as the type + // of the call expression. + return F(n); +} +``` + +Qualified name lookup is not re-done as a result of type substitution. For a +template, we imagine there’s a completely separate step that happens before type +substitution, where qualified name lookup is redone based on the actual value of +template arguments; this proceeds as described in the previous section. +Otherwise, we performed the qualified name lookup when type-checking the +generic, and do not do it again: + +``` +interface IfaceHasX { + let X:! Type; +} +class ClassHasX { + class X {} +} +class HasAssoc { + let Assoc:! IfaceHasX; +} + +// Qualified name lookup finds `T.(HasAssoc.Assoc).(IfaceHasX.X)`. +fn F(T:! HasAssoc) -> T.Assoc.X; + +fn G(T:! HasAssoc where .Assoc = ClassHasX) { + // `T.Assoc` rewritten to `ClassHasX` by qualified name lookup. + // Names `ClassHasX.X`. + var a: T.Assoc.X = {}; + // Substitution produces `ClassHasX.(IfaceHasX.X)`, + // not `ClassHasX.X`. + var b: auto = F(T); +} +``` + +During substitution, we might find a member access that named an opaque symbolic +associated constant in the original value can now be resolved to some specific +value. It’s important that we perform this resolution: + +``` +interface A { + let T:! Type; +} +class K { fn Member(); } +fn H[U:! A](x: U) -> U.T; +fn J[V:! A where .T = K](y: V) { + // We need the interface of `H(y)` to include + // `K.Member` in order for this lookup to succeed. + H(y).Member(); +} +``` + +The values being substituted into the symbolic value are themselves already +fully substituted and resolved, and in particular, satisfy property 1 given +above. + +Substitution proceeds by recursively rebuilding each symbolic value, bottom-up, +replacing each substituted binding with its value. Doing this naively will +propagate values like `i32` in the `F`/`G` case earlier in this section, but +will not propagate rewrite constants like in the `H`/`J` case. The reason is +that the `.T = K` constraint is in the _type_ of the substituted value, rather +than in the substituted value itself: deducing `T = i32` and converting `i32` to +the type `C` of `T` preserves the value `i32`, but deducing `U = V` and +converting `V` to the type `A` of `U` discards the rewrite constraint. + +In order to apply rewrites during substitution, we associate a set of rewrites +with each value produced by the recursive rebuilding procedure. This is somewhat +like having substitution track a refined type-of-type for each value, but we +don’t need -- or want -- for the type to change during this process, only for +the rewrites to be properly applied. For a substituted binding, this set of +rewrites is the rewrites found on the type of the corresponding value prior to +conversion to the type of the binding. When rebuilding a member access +expression, if we have a rewrite corresponding to the accessed member, then the +resulting value is the target of the rewrite, and its set of rewrites is that +found in the type of the target of the rewrite, if any. Because the target of +the rewrite is fully resolved already, we can ask for its type without +triggering additional work. In other cases, the rewrite set is empty; all +necessary rewrites were performed when type-checking the value we're +substituting into. + +Continuing an example from [qualified name lookup](#qualified-name-lookup): + +``` +interface A { + let T:! Type; +} +interface B { + let U:! Type; + let V:! A where .T = U; +} + +// Type of the expression `T` is `B where .(B.U) = i32` +fn F2[T:! B where .U = i32](x: T) { + // The type of the expression `T` is `B where .U = i32`. + // `T.V` is looked up and finds the associated type `(B.V)`. + // The declared type is `A where .(A.T) = Self.U`. + // We substitute `Self = T` with rewrite `.U = i32`. + // The resulting type is `A where .(A.T) = i32`. + // So `u` is `T.V` with type `A where .(A.T) = i32`. + let u:! auto = T.V; + // The type of `u` is `A where .(A.T) = i32`. + // Lookup for `u.T` resolves it to `u.(A.T)`. + // So the result of the qualified member access is `i32`, + // and the type of `v` is the type of `i32`, namely `Type`. + // No substitution is performed in this step. + let v:! auto = u.T; +} +``` + +#### Mutual and self recursion + +Some cases need special attention. + +One such case is the mutual recursion case `C where .T = .U and .U = .T`. This +is interpreted as `(C where .T = .U) where .U = .T`, which is rewritten as +`(C where .T = .U) where .U = .U` because the type of `.Self` in the `.U = .T` +constraint is `C where .T = .U`. Indeed, all mutual recursion cases in a single +`where` clause end up reducing to self-recursion -- after substituting all +previous rewrites into the final rewrite in the cycle, it must refer to itself. + +Another case needing special attention is self-recursion, which mutual recursion +in the same `where` clause reduces to. Specifically, for cases of the form +`C where .T = F(.T)` for arbitrary `F`. A third case is mutual recursion where +the constraints do not come from the same `where` clause, such as in +`(C where .T = .U) & (C where .U = .T)`. + +To handle these cases, we need to specify what happens when a constraint is +applied to a type. In any context where a constraint is applied, a substitution +is performed, substituting the `.Self` of the constraint for the type to which +it is applied: + +``` +interface HasAssoc { + let Assoc:! Type; +} +// The fully-elaborated declared type of `T` is +// `HasAssoc where .Self.(HasAssoc.Assoc) is AddableWith(.Self)`. +fn F[T:! HasAssoc where .Assoc is AddableWith(.Self)]() { + // To form the type of `T`, we substitute `.Self = T` + // into its declared type, producing + // `HasAssoc where T.(HasAssoc.Assoc) is AddableWith(T)`. + let x: T; +} +``` + +In the case where the value being substituted for `.Self` has rewrites, we need +to specify what that set of rewrites is, as input to the substitution process. +But the precise values aren't known yet, because we've not yet substituted into +them. Instead, we perform this step with rewrite rules that map to an "error" +marker. That is, if any of the rewrites from the type are applied during +substitution, we consider the substitution to be invalid. + +``` +fn F[U:! C where .T = .T](x: U) { + var v: U.T; +} +``` + +Here, the elaborated declared type of `U` is `C where .(C.T) = .Self.(C.T)`. But +that’s not the type of the expression `U` – to get that type, we need to +substitute `.Self = U` into the declared type. That substitution is performed +with the rewrite rule `U.(C.T) = `, and when substituting into the right +hand side of the `=` constraint, we attempt to perform that rewrite, and reject +this program. + +It is possible for a constraint to be formed that is not used in the same +library, and for which a `.Self` substitution would always fail. For example: + +``` +package Broken api; + +interface X { + let T:! Type; + let U:! Type; +} +let Bad:! auto = (X where .T = .U) & (X where .U = .T); +// Bad is not used here. +``` + +In such cases, the constraint `Broken.Bad` is not usable: any attempt to use +that constraint to constrain a type would perform a `.Self` substitution, which +would always fail because it would use a rewrite that's not yet available. In +order to ensure that such cases are diagnosed, a trial substitution of a +placeholder type for `.Self` is performed for all constraints. Note that this +trial substitution can be skipped for constraints that are actually used, which +is the common case. + +#### Examples ``` interface Container { let Element:! Type; let Slice:! Container where .Element = Element; } -// ❌ Error: `.Slice.Element` rewritten to `.Element` by (1), -// resulting in `where .Element = .Element`. -// `T.Element` is then rewritten to `T.Element`, -// to which the same rewrite applies again. +// ❌ Qualified name lookup rewrites this constraint to +// `Container where .Element = .Self.Element`. +// Substituting `.Self = T` with rewrite `.Element = ` +// results in an error because it forms `T.Element`. fn Bad[T:! Container where .Element = .Slice.Element](x: T.Element) {} ``` @@ -295,94 +707,76 @@ interface Example { // `T.C.D` is rewritten to `T.B`. fn Allowed(T:! Example, x: T.C.D); // ❌ But combined with another rewrite, creates an infinite loop. -// `.C.D` is rewritten to `.B`, resulting in `where .B = .B`. -// Then `T.C.D` is rewritten to `T.B`, which is rewritten to itself repeatedly. +// `.C.D` is rewritten to `.B`, resulting in `where .B = .B`, +// which causes an error during `.Self` substitution. // Using `==` instead of `=` would make this constraint redundant, // rather than it being an error. fn Error(T:! Example where .B = .C.D, x: T.C.D); ``` -To determine whether two rewrite constraints are the same, each such constraint -tracks the location of its `=` token and an optional location of a `:!` binding. -When a constraint is used in a `:!` binding, its rewrite constraints are -replaced with ones that track the location of that `:!` binding. Two rewrite -constraints are the same if they track the same `=` location and the same -optional `:!` location. - -> TODO: Alternative rule: if the rewrite constraint would rewrite its own -> right-hand side, then reject. So far we have neither a proof that this is -> sufficient to ensure termination nor a counterexample. - -It is valid for the same rewrite rule to be used multiple times within a type -expression, so long as it is not used during substitution into its own -right-hand side: - ``` interface Allowed { let A:! Allowed where .A = .Self; } -fn F(T:! Allowed, x: T.A.A.A); -// ✅ Allowed, since rewrite is only applied once per sub-expression +// ✅ The type of `x` is `T.A`. // In `(T.A.A).A`, the inner `T.A.A` is rewritten to `T.A`, // resulting in `(T.A).A`, which is then rewritten to `T.A`. +fn F(T:! Allowed, x: T.A.A.A); ``` -This rule does disallow some cases which would terminate: - ``` -interface FalsePositive; -constraint ForwardDeclaredConstraint(X:! FalsePositive); -interface FalsePositive { - let X:! FalsePositive; - // Means `Y:! FalsePositive where .X = X.Y` +interface MoveYsRight; +constraint ForwardDeclaredConstraint(X:! MoveYsRight); +interface MoveYsRight { + let X:! MoveYsRight; + // Means `Y:! MoveYsRight where .X = X.Y` let Y:! ForwardDeclaredConstraint(X); } -constraint ForwardDeclaredConstraint(X:! FalsePositive) { - extends FalsePositive where .X = X.Y; +constraint ForwardDeclaredConstraint(X:! MoveYsRight) { + extends MoveYsRight where .X = X.Y; } -// T.Y.Y.X -> T.Y.X.Y -> T.X.Y.Y -// ❌ Error: applies the same rewrite twice, even -// though there is no infinite loop. -fn F4(T:! FalsePositive, x: T.Y.Y.X); +// ✅ The type of `x` is `T.X.Y.Y`. +// The type of `T` is `MoveYsRight`. +// The type of `T.Y` is determined as follows: +// - Qualified name lookup finds `MoveYsRight.Y`. +// - The declared type is `ForwardDeclaredConstraint(Self.X)`. +// - That is a named constraint, for which we perform substitution. +// Substituting `X = Self.X` gives the type +// `MoveYsRight where .X = Self.X.Y`. +// - Substituting `Self = T` gives the type +// `MoveYsRight where .X = T.X.Y`. +// The type of `T.Y.Y` is determined as follows: +// - Qualified name lookup finds `MoveYsRight.Y`. +// - The declared type is `ForwardDeclaredConstraint(Self.X)`. +// - That is a named constraint, for which we perform substitution. +// Substituting `X = Self.X` gives the type +// `MoveYsRight where .X = Self.X.Y`. +// - Substituting `Self = T.Y` with +// rewrite `.X = T.X.Y` gives the type +// `MoveYsRight where .X = T.Y.X.Y`, but +// `T.Y.X` is replaced by `T.X.Y`, giving +// `MoveYsRight where .X = T.X.Y.Y`. +// The type of `T.Y.Y.X` is determined as follows: +// - Qualified name lookup finds `MoveYsRight.X`. +// - The type of `T.Y.Y` says to rewrite that to `T.X.Y.Y`. +// - The result is `T.X.Y.Y`, of type `MoveYsRight`. +fn F4(T:! MoveYsRight, x: T.Y.Y.X); ``` -Rewrite constraints thereby give us a deterministic, terminating type -canonicalization mechanism for associated constants: in `A.B`, if the type of -`A` specifies that `.B == C`, then `A.B` is replaced by `C`. Equality of types -constrained in this way is transitive. - -Note that given `T:! C where .R = i32` can result in a type `T.R` whose behavior -is different from the behavior of `T.R` given `T:! C`. For example, member -lookup into `T.R` can find different results and operations can therefore have -different behavior. However, this does not violate coherence because `C` and -`C where .R = i32` don't differ by merely having more type information; rather, -they are different types that have an isomorphic set of values, somewhat like -`i32` and `u32`. An `=` constraint is not merely learning a new fact about a -type, it is requesting different behavior. - -### Combining constraints with `&` +#### Termination -Suppose we have `X = C where .R = A` and `Y = C where .R = B`. What should -`C & X` produce? What should `X & Y` produce? +Each of the above steps performs at most one rewrite, and doesn't introduce any +new recursive type-checking steps, so should not introduce any new forms of +non-termination. Rewrite constraints thereby give us a deterministic, +terminating type canonicalization mechanism for associated constants: in `A.B`, +if the type of `A` specifies that `.B == C`, then `A.B` is replaced by `C`. +Equality of types constrained in this way is transitive. -We could perhaps say that `X & Y` results in a constraint where the type of `R` -has the union of the interface of `A` and the interface of `B`, and that `C & X` -similarly results in a constraint where the type of `R` has the union of the -interface of `A` and the interface originally specified by `C`. However, this -proposal suggests a simpler rule: - -- Combining two rewrite rules with different rewrite targets results in a - constraint where the associated constant is ambiguous. Given `T:! X & Y`, - the type expression `T.R` is ambiguous between a rewrite to `A` and a - rewrite to `B`. But given `T:! X & X`, `T.R` is unambiguously rewritten to - `A`. -- Combining a constraint with a rewrite rule with a constraint with no rewrite - rule preserves the rewrite rule. For example, supposing that - `interface Container` extends `interface Iterable`, and `Iterable` has an - associated constant `Element`, the constraint - `Container & (Iterable where .Element = i32)` is the same as the constraint - `(Container & Iterable) where .Element = i32` which is the same as the - constraint `Container where .Element = i32`. +However, some existing forms of non-termination may remain, such as substitution +into a named constraint recursively triggering another such substitution, or +template instantiation triggering another template instantiation. Such cases +will need to be detected and handled in some way, such as by a depth limit, but +doing so doesn't compromise the soundness of the type system. ### Same type constraints @@ -631,3 +1025,107 @@ usage same-type constraints will have, compared to rewrite constraints, so it's hard to judge the other arguments in favor of and against other operators. It would be reasonable to revisit this decision when we have more usage information. + +### Topological ordering for `=` in `and` + +When `.X` is used on the right-hand side of an `=` and on the left-hand side of +an `=` in the same `where` clause, we could reorder constraints into a +topological order: + +``` +interface C { + let T:! Type; + let U:! Type; +} +// Could be reordered to `C where .U = i32 and .T = .U`. +fn G[A:! C where .T = .U and .U = i32]() {} +``` + +However, this is challenging to do in general: because the rewrite that we want +to fire in this case happens as part of qualified name lookup for `.U`, there's +a circularity: + +- we need to do name lookup to find which associated constants of `.Self` are + named where, +- we need to know which associated constants of `.Self` are named where in + order to know how to form a topological order, and +- we need to form the topological order in order to know which rewrites to + apply when doing name lookup. + +This may be solvable by repeatedly performing name lookup and type-checking +until we converge, but because a rewrite during name lookup changes the _type_ +of the result as well as the value, this may consider intermediate states not +involved in the final result and doesn't seem guaranteed to terminate. + +For example, consider `C where .T = .U.V and .U = i32`. The initial lookup for +`.V` in `.U.V` will fail when checked with `.U` of type `Type`, even though it +might succeed later if `i32` has a member of type `V`. And worse, for +`C where .T = F(.Self) and .U = G(.Self)` for arbitrary symbolically-evaluatable +functions `F` and `G`, whether `.T` refers to `.U` or the other way around can't +be seen without evaluating the function calls, and there's no guarantee that +doing so will ever converge. + +We could apply a topological sort only for the easy cases, where `.X` +syntactically appears after an `=`, and leave the rest of the cases to be +handled as in this proposal. That would add complexity and solve the majority of +cases, but would not solve the whole problem. + +More fundamentally, given the top-down processing generally used in Carbon, it +would likely be more surprising than helpful for rewrites to apply to earlier +uses of the same name in this one case, but not in general. + +### Multi-constraint `where` clauses + +Instead of treating `A where B and C` as `(A where B) where C`, we could model +it as `(A where B) & (A where C)`. + +Specifically, this would treat `C where .T = .U and .U = .T` as +`(C where .T = .U) & (C where .U = .T)`, where the type of `.Self` would be `C` +in both clauses, with both constraints applied "simultaneously" to `C`. This +would mean that the order of clauses doesn’t matter. If we do this, then neither +right-hand side is rewritten by name lookup, and would reject cases such as +`C where .T = i32 and .U = .T`. This case seems reasonable, and we would prefer +to accept it. + +### Forward references in `.Self` substitution + +Instead of saying that a `.Self` substitution is performed with a set of "error" +rewrite rules, we could say that it's performed with no rewrite rules at all. +Given: + +``` +fn F[X:! C where .T = .T](x: X, y: X.T); +``` + +... we would elaborate the declared type of `X` as +`C where .(C.T) = .Self.(C.T)`, and substituting in `.Self = X` gives +`C where .(C.T) = X.(C.T)`, where the right-hand side of the rewrite rule is a +_symbolic_ reference to the associated constant `X.(C.T)`. Hence the expression +`X.T` would be rewritten by name lookup into that symbolic constant, and the +effect would be as if there is no rewrite rule at all, except that specifying a +different value for `.T` in some later constraint would be prohibited. This +seems like it might be desirable behavior. + +In more complex cases, however, the behavior ceases being desirable. Given: + +``` +fn G[X:! (C where .T = .U) & (C where .U = .T)](x: X, t: X.T, u: X.U); +``` + +... the type of the expression `X` after substituting `.Self = X` is +`C where .(C.T) = .Self.(C.U) and .(C.U) = .Self.(C.T)`. So the type of `t` is +the symbolic type `X.U` and the type of `u` is the different symbolic type +`X.T`. + +Similarly, for: + +``` +fn H[X:! C where .T = .U and .U = i32](x: X, t: X.T, u: X.U); +``` + +... we find that the type of `u` is `i32` as desired, but the type of `t` is the +symbolic type `X.U`. Reversing the order of the `and` operands gives very +different behavior. + +Fundamentally, this choice allows violations of +[property 1](#precise-rules-and-termination), which seems highly undesirable. From 98bb5b332f1c1b2bfd118b68649f75b28f6ca3e1 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Dec 2022 16:53:16 -0800 Subject: [PATCH 10/19] Updates based on implementation experience. --- proposals/p2173.md | 495 ++++++++++++++++++++++++--------------------- 1 file changed, 266 insertions(+), 229 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 777e4f700d748..45765689dd733 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -273,14 +273,15 @@ fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) { Rewrites aren't performed on the left-hand side of such an `=`, so `where .A = .B and .A = C` is not rewritten to `where .A = .B and .B = C`. -Instead, such a `where` clause is invalid unless each rule for `.A` specifies -the same rewrite. - -Note that given `T:! C where .R = i32` can result in a type `T.R` whose behavior -is different from the behavior of `T.R` given `T:! C`. For example, member -lookup into `T.R` can find different results and operations can therefore have -different behavior. However, this does not violate coherence because `C` and -`C where .R = i32` don't differ by merely having more type information; rather, +Instead, such a `where` clause is invalid when the constraint is +[resolved](#constraint-resolution) unless each rule for `.A` specifies the +same rewrite. + +Note that `T:! C where .R = i32` can result in a type `T.R` whose behavior is +different from the behavior of `T.R` given `T:! C`. For example, member lookup +into `T.R` can find different results and operations can therefore have +different behavior. However, this does not violate coherence because `C` and `C +where .R = i32` don't differ by merely having more type information; rather, they are different types that have an isomorphic set of values, somewhat like `i32` and `u32`. An `=` constraint is not merely learning a new fact about a type, it is requesting different behavior. @@ -311,51 +312,167 @@ proposal suggests a simpler rule: ### Combining constraints with `and` -It's possible for one `=` constraint in a `where` to refer to another. When this -happens, the constraint `C where A and B` is interpreted as -`(C where A) where B`, so rewrites in `A` are applied to names in `B`, but -rewrites in `B` are not applied to names in `A`: +It's possible for one `=` constraint in a `where` to refer to another. When +this happens, the constraint `C where A and B` is interpreted as `(C where A) +where B`, so rewrites in `A` are applied immediately to names in `B`, but +rewrites in `B` are not applied to names in `A` until the constraint is +[resolved](#constraint-resolution): ``` interface C { let T:! Type; let U:! Type; + let V:! Type; } -// ✅ Same as `(C where .T = i32) where .U = .T`, which is -// the same as `(C where .T = i32) where .U = i32`, which is -// the same as `C where .T = i32 and .U = i32`. -fn F[A:! C where .T = i32 and .U = .T]() {} +class M { + alias Me = Self; +} +// ✅ Same as `C where .T = M and .U = M.Me`, which is +// the same as `C where .T = M and .U = M`. +fn F[A:! C where .T = M and .U = .T.Me]() {} +// ❌ No member `Me` in `A.T:! Type`. +fn F[A:! C where .U = .T.Me and .T = M]() {} +``` + +### Combining constraints with `extends` + +Within an interface or named constraint, `extends` can be used to extend +a constraint that has rewrites. + +``` +interface A { + let T:! Type; + let U:! Type; +} +interface B { + extends A where .T = .U and .U = i32; +} + +var n: i32; + +// ✅ Resolved constraint on `T` is +// `B where .(A.T) = i32 and .(A.U) = i32`. +// `T.(A.T)` is rewritten to `i32`. +fn F(T:! B) -> T.(A.T) { return n; } ``` -If a constraint is formed where a rewrite would apply to the right-hand side of -another rewrite in the same constraint, but is not applied because it was not in -scope, the program is invalid. +### Combining constraints with `impl as` + +Within an interface or named constraint, the `impl T as C` syntax permits `=` +constraints. However, because these constraints don't change the type of `T`, +rewrite constraints will never result in a rewrite being performed, and instead +are equivalent to `==` constraints. ``` -// ❌ Rewritten member `.U` is mentioned in a context -// where its rewrite is not in scope. -fn BadG[A:! C where .T = .U and .U = i32]() {} -// ❌ Rewritten member `.U` is mentioned in a context -// where its rewrite is not in scope. -fn BadH[A:! C where .T = .U and .U = .T]() {} -// ❌ Rewritten member `.T` is mentioned in a context -// where its rewrite is not in scope. -fn BadI[A:! (C where .T = i32) & (C where .U = .T)]() {} +interface A { + let T:! Type; + let U:! Type; +} +constraint C { + extends A where .T = .U and .U = i32; +} +constraint D { + extends A where .T == .U and .U == i32; +} +interface B { + // This constraint is equivalent to + // `impl as A where .T == .U and .U == i32;` or + // `impl as C;` or `impl as D;`. + impl as A where .T = .U and .U = i32; +} + +var n: i32; + +// ❌ No implicit conversion from `i32` to `T.(A.T)`. +// Resolved constraint on `T` is +// `B where T.(A.T) == T.(A.U) and T.(A.U) = i32`. +// `T.(A.T)` is single-step equal to `T.(A.U)`, and +// `T.(A.U)` is single-step equal to `i32`, but +// `T.(A.T)` is not single-step equal to `i32`. +fn F(T:! B) -> T.(A.T) { return n; } ``` -In more detail, what happens here is: +### Constraint resolution + +When a constraint is applied to a type, the constraint is _resolved_. This +happens: + +- When the constraint is used explicitly, when declaring a generic parameter + or associated constant of the form `T:! Constraint`. +- When declaring an `impl` such as `impl T as Constraint`. + +In each case, the following steps are performed to resolve the constraint value +into a set of constraints on `T`: -1. The constraint is fully elaborated. `.X` is replaced with `.Self.X`. For - example, `C where .T = .U and .U = i32` is rewritten to - `C where .T = .Self.U and .U = i32`. -2. When the constraint is applied to a type `V`, the substitution `.Self = V` - is performed. The type of type used for `V` is - `C where .T = and .U = `. -3. During this substitution, if either `` rewrite rule would be used, - the result is an error. Substitution into `.Self.U` produces `V.U`, giving - an error. +- If multiple rewrites are specified for the same associated constant, they + are required to be identical, and duplicates are discarded. +- Rewrites are performed on other rewrites in order to find a fixed point, + where no rewrite applies within any other rewrite. If no fixed point + exists, the generic parameter declaration or `impl` declaration is invalid. +- Rewrites are performed throughout the other constraints in the constraint + value -- that is, in any `==` constraints and `is` constraints, and the + type `.Self` is replaced by `T` throughout the constraint. -This is explained further [below](#mutual-and-self-recursion). +``` +// ✅ `.T` is rewritten to `i32` when initially +// forming the constraint. +// Nothing to do during constraint resolution. +fn InOrder[A:! C where .T = i32 and .U = .T]() {} +// ✅ Constraint has `.T = .U` before constraint resolution. +// That rewrite is resolved to `.T = i32`. +fn Reordered[A:! C where .T = .U and .U = i32]() {} +// ✅ Constraint has `.U = .T` before constraint resolution. +// That rewrite is resolved to `.U = i32`. +fn ReorderedIndirect[A:! (C where .T = i32) & (C where .U = .T)]() {} +// ❌ Constraint resolution fails because +// no fixed point of rewrites exists. +fn Cycle[A:! C where .T = .U and .U = .T]() {} +``` + +To find a fixed point, we can perform rewrites on other rewrites, cycling +between them until they don't change or until a rewrite would apply to itself. +In the latter case, we have found a cycle and can reject the constraint. Note +that it's not sufficient to expand only a single rewrite until we hit this +condition: + +``` +// ❌ Constraint resolution fails because +// no fixed point of rewrites exists. +// If we only expand the right-hand side of `.T`, +// we find `.U`, then `.U*`, then `.U**`, and so on, +// and never detect a cycle. +// If we alternate between them, we find +// `.T = .U*`, then `.U = .U**`, then `.V = .U***`, +// then `.T = .U**`, then detect that the `.U` rewrite +// would apply to itself. +fn IndirectCycle[A:! C where .T = .U and .U = .V* and .V = .U*](); +``` + +After constraint resolution, no references to rewritten associated constants +remain in the constraint. + +If a constraint is never used, it is never subject to constraint resolution, +and it is possible for a constraint to be formed for which constraint +resolution would always fail. For example: + +``` +package Broken api; + +interface X { + let T:! Type; + let U:! Type; +} +let Bad:! auto = (X where .T = .U) & (X where .U = .T); +// Bad is not used here. +``` + +In such cases, the constraint `Broken.Bad` is not usable: any attempt to use +that constraint to constrain a type would perform constraint resolution, which +would always fail because it would discover a cycle between the rewrites for +`.T` and for `.U`. In order to ensure that such cases are diagnosed, a trial +constraint resolution is performed for all constraints. Note that this trial +resolution can be skipped for constraints that are actually used, which is the +common case. ### Precise rules and termination @@ -603,96 +720,20 @@ fn F2[T:! B where .U = i32](x: T) { } ``` -#### Mutual and self recursion - -Some cases need special attention. - -One such case is the mutual recursion case `C where .T = .U and .U = .T`. This -is interpreted as `(C where .T = .U) where .U = .T`, which is rewritten as -`(C where .T = .U) where .U = .U` because the type of `.Self` in the `.U = .T` -constraint is `C where .T = .U`. Indeed, all mutual recursion cases in a single -`where` clause end up reducing to self-recursion -- after substituting all -previous rewrites into the final rewrite in the cycle, it must refer to itself. - -Another case needing special attention is self-recursion, which mutual recursion -in the same `where` clause reduces to. Specifically, for cases of the form -`C where .T = F(.T)` for arbitrary `F`. A third case is mutual recursion where -the constraints do not come from the same `where` clause, such as in -`(C where .T = .U) & (C where .U = .T)`. - -To handle these cases, we need to specify what happens when a constraint is -applied to a type. In any context where a constraint is applied, a substitution -is performed, substituting the `.Self` of the constraint for the type to which -it is applied: - -``` -interface HasAssoc { - let Assoc:! Type; -} -// The fully-elaborated declared type of `T` is -// `HasAssoc where .Self.(HasAssoc.Assoc) is AddableWith(.Self)`. -fn F[T:! HasAssoc where .Assoc is AddableWith(.Self)]() { - // To form the type of `T`, we substitute `.Self = T` - // into its declared type, producing - // `HasAssoc where T.(HasAssoc.Assoc) is AddableWith(T)`. - let x: T; -} -``` - -In the case where the value being substituted for `.Self` has rewrites, we need -to specify what that set of rewrites is, as input to the substitution process. -But the precise values aren't known yet, because we've not yet substituted into -them. Instead, we perform this step with rewrite rules that map to an "error" -marker. That is, if any of the rewrites from the type are applied during -substitution, we consider the substitution to be invalid. - -``` -fn F[U:! C where .T = .T](x: U) { - var v: U.T; -} -``` - -Here, the elaborated declared type of `U` is `C where .(C.T) = .Self.(C.T)`. But -that’s not the type of the expression `U` – to get that type, we need to -substitute `.Self = U` into the declared type. That substitution is performed -with the rewrite rule `U.(C.T) = `, and when substituting into the right -hand side of the `=` constraint, we attempt to perform that rewrite, and reject -this program. - -It is possible for a constraint to be formed that is not used in the same -library, and for which a `.Self` substitution would always fail. For example: - -``` -package Broken api; - -interface X { - let T:! Type; - let U:! Type; -} -let Bad:! auto = (X where .T = .U) & (X where .U = .T); -// Bad is not used here. -``` - -In such cases, the constraint `Broken.Bad` is not usable: any attempt to use -that constraint to constrain a type would perform a `.Self` substitution, which -would always fail because it would use a rewrite that's not yet available. In -order to ensure that such cases are diagnosed, a trial substitution of a -placeholder type for `.Self` is performed for all constraints. Note that this -trial substitution can be skipped for constraints that are actually used, which -is the common case. - #### Examples ``` interface Container { let Element:! Type; - let Slice:! Container where .Element = Element; +} +interface SliceableContainer { + extends Container; + let Slice:! Container where .Element = Self.(Container.Element); } // ❌ Qualified name lookup rewrites this constraint to -// `Container where .Element = .Self.Element`. -// Substituting `.Self = T` with rewrite `.Element = ` -// results in an error because it forms `T.Element`. -fn Bad[T:! Container where .Element = .Slice.Element](x: T.Element) {} +// `SliceableContainer where .(Container.Element) = .Self.(Container.Element)`. +// Constraint resolution rejects this because this rewrite forms a cycle. +fn Bad[T:! SliceableContainer where .Element = .Slice.Element](x: T.Element) {} ``` ``` @@ -708,20 +749,25 @@ interface Example { fn Allowed(T:! Example, x: T.C.D); // ❌ But combined with another rewrite, creates an infinite loop. // `.C.D` is rewritten to `.B`, resulting in `where .B = .B`, -// which causes an error during `.Self` substitution. +// which causes an error during constraint resolution. // Using `==` instead of `=` would make this constraint redundant, // rather than it being an error. fn Error(T:! Example where .B = .C.D, x: T.C.D); ``` ``` +interface Allowed; +interface AllowedBase { + let A:! Allowed; +} interface Allowed { - let A:! Allowed where .A = .Self; + extends AllowedBase where .A = .Self; } -// ✅ The type of `x` is `T.A`. -// In `(T.A.A).A`, the inner `T.A.A` is rewritten to `T.A`, -// resulting in `(T.A).A`, which is then rewritten to `T.A`. -fn F(T:! Allowed, x: T.A.A.A); +// ✅ The type of `x` is `T`. +// In `((T.A).A).A`, the inner `T.A` is rewritten to `T`, +// resulting in `((T).A).A`, which is then rewritten to +// `(T).A`, which is then rewritten to `T`. +fn F(T:! Allowed, x: ((T.A).A).A); ``` ``` @@ -799,8 +845,7 @@ built-in implementation of `ImplicitAs`: ``` final impl forall [T:! Type, U:! Type where .Self == T] T as ImplicitAs(U) { - @Carbon.Builtin("implicit_as_same_type") - fn Convert[me: Self](other: U) -> U; + fn Convert[me: Self](other: U) -> U { ... } } ``` @@ -826,6 +871,35 @@ cannot pick a common type when given two distinct type expressions, even if we know they evaluate to the same type, because we would not know which API the result should have. +#### Implementation of same-type `ImplicitAs` + +It is possible to implement the above `impl` of `ImplicitAs` directly in +Carbon, without a compiler builtin, by taking advantage of the built-in +conversion between `C where .A = X` and `C where .A == X`: + +``` +interface EqualConverter { + let T:! Type; + fn Convert(t: T) -> Self; +} +fn EqualConvert[T:! Type](t: T, U:! EqualConverter where .T = T) -> U { + return U.Convert(t); +} +impl forall [U:! Type] U as EqualConverter where .T = U { + fn Convert(u: U) -> U { return u; } +} + +impl forall [T:! Type, U:! Type where .Self == T] T as ImplicitAs(U) { + fn Convert[self: Self]() -> U { return EqualConvert(self, U); } +} +``` + +The transition from `(T as ImplicitAs(U)).Convert`, where we know that `U == +T`, to `EqualConverter.Convert`, where we know that `.T = U`, allows a +same-type constraint to be used to perform a rewrite. + +This implementation is in use in Carbon Explorer. + ### Observe declarations Same-type constraints are non-transitive, just like other constraints such as @@ -883,6 +957,21 @@ understood by developers, and that the more-ergonomic `=` constraints cover sufficiently many scenarios that the developer seldom needs to write code to request a conversion between types that are known to be the same. +Ideally, we hope that`=` constraints will cover all real situations, and `==` +constraints will never need to be written, outside of the one `ImplicitAs` +implementation described above. If this turns out to be true in practice, we +should consider removing `==` syntax from the language. However, this will not +remove `==` semantics, both because of the behavior of `ImplicitAs` and because +`impl T as C where .A = B` constraints in an interface or named constraint give +rise to `==` semantics. + +### Implementation experience + +This proposal has been mostly implemented in Carbon Explorer, and appears to +behave as expected. However, Explorer does not yet support forward declarations +of interfaces and constraints, which means that some of the more interesting +cases can't be tested. + ## Rationale - [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem) @@ -959,27 +1048,6 @@ equalities, and picking any finite subset risks the result being non-transitive in general, if a different subset of equalities might be considered in a different type-checking context. -### Use a different rule for "same rewrite" - -We considered various rules for determining if two rewrites are using the same -rule. The goal is to select a rule that ensures there are only finitely many -different rewrites in a program, so that any non-termination will always be -detected, and to minimize the occurrence of false-positives. - -One candidate was to consider rewrites the same if they apply to the same -associated constant. However, it seems likely that the false positive rate for -such a rule would be unacceptably high. - -Another rule would be to consider rewrites the same if they correspond to the -same `=` token. This has the disadvantage that factoring out a common -subexpression can result in two constraints that were previously considered -different to now be considered the same, potentially breaking users. - -Connecting sameness to both the `=` token and the `:!` token is intended to -minimize the risk of refactorings introducing false positives, while retaining a -set of rewrite constraints that grows at most quadratically in the size of the -program, and in practice much slower than that. - ### Different names for same-type constraint We considered various options for the spelling of a same-type constraint: @@ -1026,106 +1094,75 @@ hard to judge the other arguments in favor of and against other operators. It would be reasonable to revisit this decision when we have more usage information. -### Topological ordering for `=` in `and` +### Required ordering for rewrites -When `.X` is used on the right-hand side of an `=` and on the left-hand side of -an `=` in the same `where` clause, we could reorder constraints into a -topological order: +We considered a rule where a rewrite would need to be specified in a constraint +before the associated constant is used. For example: ``` interface C { let T:! Type; let U:! Type; } -// Could be reordered to `C where .U = i32 and .T = .U`. +// Could reject, `.U` referenced before rewrite is defined. fn G[A:! C where .T = .U and .U = i32]() {} ``` -However, this is challenging to do in general: because the rewrite that we want -to fire in this case happens as part of qualified name lookup for `.U`, there's -a circularity: +This would remove the need to perform constraint resolution. Instead, we could +apply rewrites as we form the constraint, and constraints formed in this way +would always satisfy the property that they don't refer to associated constants +that have a rewrite. -- we need to do name lookup to find which associated constants of `.Self` are - named where, -- we need to know which associated constants of `.Self` are named where in - order to know how to form a topological order, and -- we need to form the topological order in order to know which rewrites to - apply when doing name lookup. +It may also be less surprising to use this rule. Because Carbon uses top-down +processing for type-checking in general, it may be unexpected that a rewrite +rule would rewrite an earlier occurrence of its rewrite target, even though +this happens in practice at a later point in time, during constraint +resolution. -This may be solvable by repeatedly performing name lookup and type-checking -until we converge, but because a rewrite during name lookup changes the _type_ -of the result as well as the value, this may consider intermediate states not -involved in the final result and doesn't seem guaranteed to terminate. +However, such an approach would not give a satisfactory outcome for cases such +as: -For example, consider `C where .T = .U.V and .U = i32`. The initial lookup for -`.V` in `.U.V` will fail when checked with `.U` of type `Type`, even though it -might succeed later if `i32` has a member of type `V`. And worse, for -`C where .T = F(.Self) and .U = G(.Self)` for arbitrary symbolically-evaluatable -functions `F` and `G`, whether `.T` refers to `.U` or the other way around can't -be seen without evaluating the function calls, and there's no guarantee that -doing so will ever converge. - -We could apply a topological sort only for the easy cases, where `.X` -syntactically appears after an `=`, and leave the rest of the cases to be -handled as in this proposal. That would add complexity and solve the majority of -cases, but would not solve the whole problem. +``` +constraint X { + extends C where .T = .U; +} +constraint Y { + extends C where .U = i32; +} +// Desired behavior is that `.T` and `.U` both rewrite to `i32`. +fn G[A:! X & Y]() -> A.T { return 0; } +``` -More fundamentally, given the top-down processing generally used in Carbon, it -would likely be more surprising than helpful for rewrites to apply to earlier -uses of the same name in this one case, but not in general. +Performing earlier-appearing rewrites in later constraints and additionally +performing a constraint resolution step to apply rewrites throughout the +constraint seems to give better outcomes for the examples we considered. ### Multi-constraint `where` clauses Instead of treating `A where B and C` as `(A where B) where C`, we could model it as `(A where B) & (A where C)`. -Specifically, this would treat `C where .T = .U and .U = .T` as -`(C where .T = .U) & (C where .U = .T)`, where the type of `.Self` would be `C` -in both clauses, with both constraints applied "simultaneously" to `C`. This -would mean that the order of clauses doesn’t matter. If we do this, then neither -right-hand side is rewritten by name lookup, and would reject cases such as -`C where .T = i32 and .U = .T`. This case seems reasonable, and we would prefer -to accept it. - -### Forward references in `.Self` substitution - -Instead of saying that a `.Self` substitution is performed with a set of "error" -rewrite rules, we could say that it's performed with no rewrite rules at all. -Given: - -``` -fn F[X:! C where .T = .T](x: X, y: X.T); -``` - -... we would elaborate the declared type of `X` as -`C where .(C.T) = .Self.(C.T)`, and substituting in `.Self = X` gives -`C where .(C.T) = X.(C.T)`, where the right-hand side of the rewrite rule is a -_symbolic_ reference to the associated constant `X.(C.T)`. Hence the expression -`X.T` would be rewritten by name lookup into that symbolic constant, and the -effect would be as if there is no rewrite rule at all, except that specifying a -different value for `.T` in some later constraint would be prohibited. This -seems like it might be desirable behavior. - -In more complex cases, however, the behavior ceases being desirable. Given: - +Specifically, given ``` -fn G[X:! (C where .T = .U) & (C where .U = .T)](x: X, t: X.T, u: X.U); +interface C { + let T:! Type; + let U:! Type; +} ``` +this would treat `C where .T = A and .U = B` as `(C where .T = A) & (C where .U += B)`, where the type of `.Self` would be `C` in both clauses, with both +constraints applied "simultaneously" to `C`. This would mean that the order of +clauses doesn’t matter. -... the type of the expression `X` after substituting `.Self = X` is -`C where .(C.T) = .Self.(C.U) and .(C.U) = .Self.(C.T)`. So the type of `t` is -the symbolic type `X.U` and the type of `u` is the different symbolic type -`X.T`. - -Similarly, for: - -``` -fn H[X:! C where .T = .U and .U = i32](x: X, t: X.T, u: X.U); -``` +However, if we do this, then neither right-hand side is rewritten by name +lookup, and thus we would reject cases such as `C where .T = i32 and .U = +.T.(Add.Result)`. This case seems reasonable, and we would prefer to accept it. -... we find that the type of `u` is `i32` as desired, but the type of `t` is the -symbolic type `X.U`. Reversing the order of the `and` operands gives very -different behavior. +Therefore, the rule we select is that, as soon as a rewrite is declared, it is +in scope, and later references to that associated constant refer to the +rewritten value with its rewritten type. We still reject cases like `C where .U += .T.(Add.Result) and .T = i32`, because we do not know that `T is Add` from +the declared type of `C.T`. -Fundamentally, this choice allows violations of -[property 1](#precise-rules-and-termination), which seems highly undesirable. +The same rule applies to `is` constraints: we accept `C where .T is Add and .U += .T.(Add.Result)`, but reject `C where .U = .T.(Add.Result) and .T is Add`. From 4801b9c42284e2ea75953f047bb253caa70c0167 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Dec 2022 17:13:05 -0800 Subject: [PATCH 11/19] Updates based on review feedback. Precommit. --- proposals/p2173.md | 130 +++++++++++++++++++++++++++------------------ 1 file changed, 79 insertions(+), 51 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 45765689dd733..8c0904c7c5cfd 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -24,26 +24,29 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Rewrite constraints](#rewrite-constraints) - [Combining constraints with `&`](#combining-constraints-with-) - [Combining constraints with `and`](#combining-constraints-with-and) + - [Combining constraints with `extends`](#combining-constraints-with-extends) + - [Combining constraints with `impl as`](#combining-constraints-with-impl-as) + - [Constraint resolution](#constraint-resolution) - [Precise rules and termination](#precise-rules-and-termination) - [Qualified name lookup](#qualified-name-lookup) - [Type substitution](#type-substitution) - - [Mutual and self recursion](#mutual-and-self-recursion) - [Examples](#examples) - [Termination](#termination) - [Same type constraints](#same-type-constraints) + - [Implementation of same-type `ImplicitAs`](#implementation-of-same-type-implicitas) - [Observe declarations](#observe-declarations) - [Implementing an interface](#implementing-an-interface) - [Ergonomics](#ergonomics) + - [Implementation experience](#implementation-experience) - [Rationale](#rationale) - [Alternatives considered](#alternatives-considered) - [Status quo](#status-quo) - [Restrict constraints to allow computable type equality](#restrict-constraints-to-allow-computable-type-equality) - [Find a fully transitive approach to type equality](#find-a-fully-transitive-approach-to-type-equality) - - [Use a different rule for "same rewrite"](#use-a-different-rule-for-same-rewrite) - - [Different names for same-type constraint](#different-names-for-same-type-constraint) - - [Topological ordering for `=` in `and`](#topological-ordering-for--in-and) + - [Different syntax for rewrite constraint](#different-syntax-for-rewrite-constraint) + - [Different syntax for same-type constraint](#different-syntax-for-same-type-constraint) + - [Required ordering for rewrites](#required-ordering-for-rewrites) - [Multi-constraint `where` clauses](#multi-constraint-where-clauses) - - [Forward references in `.Self` substitution](#forward-references-in-self-substitution) @@ -274,14 +277,14 @@ fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) { Rewrites aren't performed on the left-hand side of such an `=`, so `where .A = .B and .A = C` is not rewritten to `where .A = .B and .B = C`. Instead, such a `where` clause is invalid when the constraint is -[resolved](#constraint-resolution) unless each rule for `.A` specifies the -same rewrite. +[resolved](#constraint-resolution) unless each rule for `.A` specifies the same +rewrite. Note that `T:! C where .R = i32` can result in a type `T.R` whose behavior is different from the behavior of `T.R` given `T:! C`. For example, member lookup into `T.R` can find different results and operations can therefore have -different behavior. However, this does not violate coherence because `C` and `C -where .R = i32` don't differ by merely having more type information; rather, +different behavior. However, this does not violate coherence because `C` and +`C where .R = i32` don't differ by merely having more type information; rather, they are different types that have an isomorphic set of values, somewhat like `i32` and `u32`. An `=` constraint is not merely learning a new fact about a type, it is requesting different behavior. @@ -312,10 +315,10 @@ proposal suggests a simpler rule: ### Combining constraints with `and` -It's possible for one `=` constraint in a `where` to refer to another. When -this happens, the constraint `C where A and B` is interpreted as `(C where A) -where B`, so rewrites in `A` are applied immediately to names in `B`, but -rewrites in `B` are not applied to names in `A` until the constraint is +It's possible for one `=` constraint in a `where` to refer to another. When this +happens, the constraint `C where A and B` is interpreted as +`(C where A) where B`, so rewrites in `A` are applied immediately to names in +`B`, but rewrites in `B` are not applied to names in `A` until the constraint is [resolved](#constraint-resolution): ``` @@ -336,8 +339,8 @@ fn F[A:! C where .U = .T.Me and .T = M]() {} ### Combining constraints with `extends` -Within an interface or named constraint, `extends` can be used to extend -a constraint that has rewrites. +Within an interface or named constraint, `extends` can be used to extend a +constraint that has rewrites. ``` interface A { @@ -407,11 +410,11 @@ into a set of constraints on `T`: - If multiple rewrites are specified for the same associated constant, they are required to be identical, and duplicates are discarded. - Rewrites are performed on other rewrites in order to find a fixed point, - where no rewrite applies within any other rewrite. If no fixed point - exists, the generic parameter declaration or `impl` declaration is invalid. + where no rewrite applies within any other rewrite. If no fixed point exists, + the generic parameter declaration or `impl` declaration is invalid. - Rewrites are performed throughout the other constraints in the constraint - value -- that is, in any `==` constraints and `is` constraints, and the - type `.Self` is replaced by `T` throughout the constraint. + value -- that is, in any `==` constraints and `is` constraints, and the type + `.Self` is replaced by `T` throughout the constraint. ``` // ✅ `.T` is rewritten to `i32` when initially @@ -451,9 +454,9 @@ fn IndirectCycle[A:! C where .T = .U and .U = .V* and .V = .U*](); After constraint resolution, no references to rewritten associated constants remain in the constraint. -If a constraint is never used, it is never subject to constraint resolution, -and it is possible for a constraint to be formed for which constraint -resolution would always fail. For example: +If a constraint is never used, it is never subject to constraint resolution, and +it is possible for a constraint to be formed for which constraint resolution +would always fail. For example: ``` package Broken api; @@ -781,7 +784,7 @@ interface MoveYsRight { constraint ForwardDeclaredConstraint(X:! MoveYsRight) { extends MoveYsRight where .X = X.Y; } -// ✅ The type of `x` is `T.X.Y.Y`. +// ✅ The final type of `x` is `T.X.Y.Y`. It is computed as follows: // The type of `T` is `MoveYsRight`. // The type of `T.Y` is determined as follows: // - Qualified name lookup finds `MoveYsRight.Y`. @@ -818,11 +821,10 @@ terminating type canonicalization mechanism for associated constants: in `A.B`, if the type of `A` specifies that `.B == C`, then `A.B` is replaced by `C`. Equality of types constrained in this way is transitive. -However, some existing forms of non-termination may remain, such as substitution -into a named constraint recursively triggering another such substitution, or -template instantiation triggering another template instantiation. Such cases -will need to be detected and handled in some way, such as by a depth limit, but -doing so doesn't compromise the soundness of the type system. +However, some existing forms of non-termination may remain, such as template +instantiation triggering another template instantiation. Such cases will need to +be detected and handled in some way, such as by a depth limit, but doing so +doesn't compromise the soundness of the type system. ### Same type constraints @@ -873,9 +875,9 @@ result should have. #### Implementation of same-type `ImplicitAs` -It is possible to implement the above `impl` of `ImplicitAs` directly in -Carbon, without a compiler builtin, by taking advantage of the built-in -conversion between `C where .A = X` and `C where .A == X`: +It is possible to implement the above `impl` of `ImplicitAs` directly in Carbon, +without a compiler builtin, by taking advantage of the built-in conversion +between `C where .A = X` and `C where .A == X`: ``` interface EqualConverter { @@ -894,9 +896,9 @@ impl forall [T:! Type, U:! Type where .Self == T] T as ImplicitAs(U) { } ``` -The transition from `(T as ImplicitAs(U)).Convert`, where we know that `U == -T`, to `EqualConverter.Convert`, where we know that `.T = U`, allows a -same-type constraint to be used to perform a rewrite. +The transition from `(T as ImplicitAs(U)).Convert`, where we know that `U == T`, +to `EqualConverter.Convert`, where we know that `.T = U`, allows a same-type +constraint to be used to perform a rewrite. This implementation is in use in Carbon Explorer. @@ -978,7 +980,7 @@ cases can't be tested. - This rule is easy to implement, and even a naive implementation should be efficient. - [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) - - The status quo causes coincidentally-equal types to have the same + - The status quo causes coincidentally equal types to have the same interface. Under this proposal, the interface of a type is not affected by coincidental equality, only by intentional assignment to an associated type. @@ -1042,13 +1044,36 @@ type-checking step as a set of non-quantified equalities. The resulting system of equalities can then be [solved efficiently](https://dl.acm.org/doi/pdf/10.1145/322186.322198) to determine whether two types are known to be the same. This approach appears -sufficient to cope with locally-declared constraints, but when constraints +sufficient to cope with locally declared constraints, but when constraints appear within interfaces, they can give rise to an infinite family of equalities, and picking any finite subset risks the result being non-transitive in general, if a different subset of equalities might be considered in a different type-checking context. -### Different names for same-type constraint +### Different syntax for rewrite constraint + +We could use a different syntax, such as `~>`, for rewrite constraints. This +would have the advantage of not using assignment for an operation that's not +always doing something that developers will think of as assignment. It also +avoids hard-to read code in cases where a binding has an initializer: + +``` +// This proposal. +let T:! Add where .Result = i32 = i32; +// Alternative syntax. +let T:! Add where .Result ~> i32 = i32; +``` + +However, it seems valuable to use only a single syntax for specifying these +constraints, and equality is the appropriate mental model most of the time. +Moreover, in an `impl` declaration, we really are assigning to the associated +constant in the sense of setting a value, rather than merely constraining a +value. + +This decision should be revisited if a superior syntax that avoids the above +visual ambiguity is suggested. + +### Different syntax for same-type constraint We considered various options for the spelling of a same-type constraint: @@ -1115,9 +1140,8 @@ that have a rewrite. It may also be less surprising to use this rule. Because Carbon uses top-down processing for type-checking in general, it may be unexpected that a rewrite -rule would rewrite an earlier occurrence of its rewrite target, even though -this happens in practice at a later point in time, during constraint -resolution. +rule would rewrite an earlier occurrence of its rewrite target, even though this +happens in practice at a later point in time, during constraint resolution. However, such an approach would not give a satisfactory outcome for cases such as: @@ -1143,26 +1167,30 @@ Instead of treating `A where B and C` as `(A where B) where C`, we could model it as `(A where B) & (A where C)`. Specifically, given + ``` interface C { let T:! Type; let U:! Type; } ``` -this would treat `C where .T = A and .U = B` as `(C where .T = A) & (C where .U -= B)`, where the type of `.Self` would be `C` in both clauses, with both -constraints applied "simultaneously" to `C`. This would mean that the order of -clauses doesn’t matter. + +this would treat `C where .T = A and .U = B` as +`(C where .T = A) & (C where .U = B)`, where the type of `.Self` would be `C` in +both clauses, with both constraints applied "simultaneously" to `C`. This would +mean that the order of clauses doesn’t matter. However, if we do this, then neither right-hand side is rewritten by name -lookup, and thus we would reject cases such as `C where .T = i32 and .U = -.T.(Add.Result)`. This case seems reasonable, and we would prefer to accept it. +lookup, and thus we would reject cases such as +`C where .T = i32 and .U = .T.(Add.Result)`. This case seems reasonable, and we +would prefer to accept it. Therefore, the rule we select is that, as soon as a rewrite is declared, it is in scope, and later references to that associated constant refer to the -rewritten value with its rewritten type. We still reject cases like `C where .U -= .T.(Add.Result) and .T = i32`, because we do not know that `T is Add` from -the declared type of `C.T`. +rewritten value with its rewritten type. We still reject cases like +`C where .U = .T.(Add.Result) and .T = i32`, because we do not know that +`T is Add` from the declared type of `C.T`. -The same rule applies to `is` constraints: we accept `C where .T is Add and .U -= .T.(Add.Result)`, but reject `C where .U = .T.(Add.Result) and .T is Add`. +The same rule applies to `is` constraints: we accept +`C where .T is Add and .U = .T.(Add.Result)`, but reject +`C where .U = .T.(Add.Result) and .T is Add`. From fcbec2a1b91b5d1bb5ace7e661721709604c05b8 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Dec 2022 17:14:28 -0800 Subject: [PATCH 12/19] Fix examples --- proposals/p2173.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 8c0904c7c5cfd..9cff7bb26e8f2 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -79,8 +79,8 @@ interface X { let R:! Hashable; fn Make() -> R; } -fn F[T:! X where .R == i32]() -> i32 { - return F.Make() + 1; +fn F[T:! X where .R == i32](x: T) -> i32 { + return T.Make() + 1; } ``` @@ -96,7 +96,7 @@ whose type is either of those type expressions provides the (disjoint) union of their interfaces. This leads to a surprise: ``` -fn G[T:! X where .R == i32]() -> u128 { +fn G[T:! X where .R == i32](x: T) -> u128 { let n: i32 = 1; let m: i64 = 2; return n.Hash() ^ m.Hash(); @@ -632,7 +632,7 @@ interface IfaceHasX { class ClassHasX { class X {} } -class HasAssoc { +interface HasAssoc { let Assoc:! IfaceHasX; } From b2e59e811b652b748627b78ea6a01c5b23813eca Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Dec 2022 18:05:54 -0800 Subject: [PATCH 13/19] Apply suggestions from code review Co-authored-by: josh11b --- proposals/p2173.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 9cff7bb26e8f2..d52c56c102a4b 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -766,7 +766,7 @@ interface AllowedBase { interface Allowed { extends AllowedBase where .A = .Self; } -// ✅ The type of `x` is `T`. +// ✅ The final type of `x` is `T`. It is computed as follows: // In `((T.A).A).A`, the inner `T.A` is rewritten to `T`, // resulting in `((T).A).A`, which is then rewritten to // `(T).A`, which is then rewritten to `T`. @@ -959,7 +959,7 @@ understood by developers, and that the more-ergonomic `=` constraints cover sufficiently many scenarios that the developer seldom needs to write code to request a conversion between types that are known to be the same. -Ideally, we hope that`=` constraints will cover all real situations, and `==` +Ideally, we hope that `=` constraints will cover all real situations, and `==` constraints will never need to be written, outside of the one `ImplicitAs` implementation described above. If this turns out to be true in practice, we should consider removing `==` syntax from the language. However, this will not From 5127c261e8bd1ae313882dca06eb67548852ca7f Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Dec 2022 13:57:05 -0800 Subject: [PATCH 14/19] Update proposals/p2173.md --- proposals/p2173.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index d52c56c102a4b..54e8c4c002220 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -417,7 +417,7 @@ into a set of constraints on `T`: `.Self` is replaced by `T` throughout the constraint. ``` -// ✅ `.T` is rewritten to `i32` when initially +// ✅ `.T` in `.U = .T` is rewritten to `i32` when initially // forming the constraint. // Nothing to do during constraint resolution. fn InOrder[A:! C where .T = i32 and .U = .T]() {} From af7fc9c4be39d4ad7d26e809fa96fe66bd553105 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Dec 2022 16:44:13 -0800 Subject: [PATCH 15/19] Add alternative considered for treatment of `impl as`. --- proposals/p2173.md | 95 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 95 insertions(+) diff --git a/proposals/p2173.md b/proposals/p2173.md index 54e8c4c002220..9a08fcb384565 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -47,6 +47,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Different syntax for same-type constraint](#different-syntax-for-same-type-constraint) - [Required ordering for rewrites](#required-ordering-for-rewrites) - [Multi-constraint `where` clauses](#multi-constraint-where-clauses) + - [Rewrite constraints in `impl as` constraints](#rewrite-constraints-in-impl-as-constraints) @@ -1194,3 +1195,97 @@ rewritten value with its rewritten type. We still reject cases like The same rule applies to `is` constraints: we accept `C where .T is Add and .U = .T.(Add.Result)`, but reject `C where .U = .T.(Add.Result) and .T is Add`. + +### Rewrite constraints in `impl as` constraints + +A rewrite constraint can appear, directly or indirectly, in an `impl as` +constraint in an interface or named constraint: + +``` +interface A { + let T:! Type; +} +constraint AInt { + extends A where .T = i32; +} +interface B { + // Or, `impl as A where .T = i32;` + impl as AInt; +} +``` + +When this happens, the rewrite constraint does not result in rewrites being +performed when the associated constant is mentioned in a member access into that +interface or named constraint: + +``` +// Return type is not rewritten to `i32`, +// but there is still a same-type constraint. +fn F(T:! B) -> T.(A.T) { + // OK, `i32` implicitly converts to `T.(A.T)`. + return 0 as i32; +} +``` + +This may be surprising: `B` says that its `.(A.T)` has a rewrite to `i32`, but +that rewrite is not performed. In contrast, rewrites in an `extends` constraint +do become part of the enclosing interface or named constraint. + +``` +interface C { + extends AInt; +} +// Return type is rewritten to `i32`. +fn G(T:! C) -> T.T; +``` + +Something similar happens with `impl T as` constraints. The following two +constructs are not equivalent: + +``` +interface A { + let X:! Add where .Result = i32; +} +interface A { + let X:! Add; + impl X as Add where .Result = i32; +} +``` + +In the former case, references to `A.X.Result` are rewritten to `i32`, and in +the latter case, they are not, because the rewrite is not part of the type of +`X`. + +The general principle here is that rewrites are part of the declared type of a +name, and can't be changed after the fact by another declaration. For the +`impl T as` case, this behavior is also a necessary part of the termination +argument. If rewrites from `impl T as` constraints were used, it would be +possible to form a rewrite cycle: + +``` +interface C { let X:! Type; } +interface A { + let U:! C; + let T:! C; +} +interface B1 { + extends A; + impl T as C where .X = U.X; +} +interface B2 { + extends A; + impl U as C where .X = T.X; +} +// `V.T.X` ~> `V.U.X` ~> `V.T.X` ~> ... +fn F(V:! B1 & B2) -> V.T.X; +``` + +We could allow the specific case of `impl as` (not `impl T as`) to introduce +rewrites. The advantage of this change is that this behavior may be less +surprising in some cases. However, this would mean that an `impl as` sometimes +changes the interface of the enclosing constraint, and behaves inconsistently +from an `impl T as` constraint, so this proposal does not adopt that behavior. + +We could allow `impl T as` to introduce rewrites for `T` in general, but we +would need to find some way to fix the resulting holes in the termination +argument. From 03b5cebd50a70bb1fbae26f52abc45774535ec9d Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 3 Jan 2023 14:24:46 -0800 Subject: [PATCH 16/19] Update terminology to match #2360 and address some minor review comments. --- proposals/p2173.md | 179 +++++++++++++++++++++++---------------------- 1 file changed, 93 insertions(+), 86 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index 9a08fcb384565..dedba16f09052 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -190,23 +190,23 @@ There has been a lot of work in this space. Some previous relevant proposals: and `==` where constraints, which are similar to, but somewhat different from, those in this proposal. - `=` constraints require a concrete type on their right-hand side. The - left-hand side is not restricted. The type-of-type of the left-hand side - is changed to match the right-hand side. - - `==` constraints do not affect the type-of-type of either operand. + left-hand side is not restricted. The facet type of the left-hand side + type is changed to match the right-hand side. + - `==` constraints do not affect the facet type of either operand. - As an answer to the question of how to produce a decidable type equality rule, both kinds of constraints are applied automatically, but only at a depth of one constraint. Neither form is transitive but both can be transitively extended using `observe` declarations. - [#2070: always `==` not `=` in `where` clauses](https://github.com/carbon-language/carbon-lang/pull/2070) describes an attempt to unify the syntax and semantics of `=` and `==` - constraints, under which we would always merge the type-of-types of the + constraints, under which we would always merge the facet types of the operands. That proposal has not yet been accepted, and this proposal intends to supersede it. ## Proposal -Replace the existing `where A == B` constraint with two different kinds of -constraint: +Replace the semantics of the existing `where A == B` and `where A = B` +constraints as follows: - `where .A = T` specifies a _rewrite_ constraint. If the associated constant `.A` is named as a member of a type with this constraint, it is rewritten to @@ -219,12 +219,15 @@ constraint: Moreover, if `F` is any pure type function, `T is SameAs(U)` implies `F(T) is SameAs(F(U)`). `observe` statements can be used to form transitive `SameAs` relations. Same-type constraints are not used automatically by the - language rules for any purpose, but there is an `ImplicitAs(T)` impl for - types implementing `SameAs(T)`. + language rules for any purpose, but a blanket `ImplicitAs` impl permits + conversions between types that are known to be the same: + ``` + impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U); + ``` A type implements `C where .A = T` if and only if it implements `C where .A == T`, assuming both constraints are valid – the inhabitants of -these two type-of-types are the same, but they are different type-of-types and +these two facet types are the same, but they are different facet types and provide different interfaces for corresponding type values. ## Details @@ -237,10 +240,10 @@ the name of an associated constant. `.Self` is not permitted. ``` interface RewriteSelf { // ❌ Error: `.Self` is not the name of an associated constant. - let Me:! Type where .Self = Self; + let Me:! type where .Self = Self; } interface HasAssoc { - let Assoc:! Type; + let Assoc:! type; } interface RewriteSingleLevel { // ✅ Uses of `A.Assoc` will be rewritten to `i32`. @@ -255,12 +258,12 @@ interface RewriteMultiLevel { This notation is permitted anywhere a constraint can be written, and results in a new constraint with a different interface: the named member effectively no longer names an associated constant of the constrained type, and is instead -treated as a rewrite rule the expands to the right-hand side of the constraint, +treated as a rewrite rule that expands to the right-hand side of the constraint, with any mentioned parameters substituted into that type. ``` interface Container { - let Element:! Type; + let Element:! type; let Slice:! Container where .Element = Element; fn Add[addr me: Self*](x: Element); } @@ -284,25 +287,25 @@ rewrite. Note that `T:! C where .R = i32` can result in a type `T.R` whose behavior is different from the behavior of `T.R` given `T:! C`. For example, member lookup into `T.R` can find different results and operations can therefore have -different behavior. However, this does not violate coherence because `C` and -`C where .R = i32` don't differ by merely having more type information; rather, -they are different types that have an isomorphic set of values, somewhat like -`i32` and `u32`. An `=` constraint is not merely learning a new fact about a -type, it is requesting different behavior. +different behavior. However, this does not violate coherence because the facet +types `C` and `C where .R = i32` don't differ by merely having more type +information; rather, they are different facet types that have an isomorphic set +of values, somewhat like `i32` and `u32`. An `=` constraint is not merely +learning a new fact about a type, it is requesting different behavior. ### Combining constraints with `&` Suppose we have `X = C where .R = A` and `Y = C where .R = B`. What should `C & X` produce? What should `X & Y` produce? -We could perhaps say that `X & Y` results in a constraint where the type of `R` +We could perhaps say that `X & Y` results in a facet type where the type of `R` has the union of the interface of `A` and the interface of `B`, and that `C & X` -similarly results in a constraint where the type of `R` has the union of the +similarly results in a facet type where the type of `R` has the union of the interface of `A` and the interface originally specified by `C`. However, this proposal suggests a simpler rule: - Combining two rewrite rules with different rewrite targets results in a - constraint where the associated constant is ambiguous. Given `T:! X & Y`, + facet type where the associated constant is ambiguous. Given `T:! X & Y`, the type expression `T.R` is ambiguous between a rewrite to `A` and a rewrite to `B`. But given `T:! X & X`, `T.R` is unambiguously rewritten to `A`. @@ -314,19 +317,22 @@ proposal suggests a simpler rule: `(Container & Iterable) where .Element = i32` which is the same as the constraint `Container where .Element = i32`. +If the rewrite for an associated constant is ambiguous, the facet type is +rejected during [constraint resolution](#constraint-resolution). + ### Combining constraints with `and` It's possible for one `=` constraint in a `where` to refer to another. When this -happens, the constraint `C where A and B` is interpreted as +happens, the facet type `C where A and B` is interpreted as `(C where A) where B`, so rewrites in `A` are applied immediately to names in -`B`, but rewrites in `B` are not applied to names in `A` until the constraint is +`B`, but rewrites in `B` are not applied to names in `A` until the facet type is [resolved](#constraint-resolution): ``` interface C { - let T:! Type; - let U:! Type; - let V:! Type; + let T:! type; + let U:! type; + let V:! type; } class M { alias Me = Self; @@ -334,7 +340,7 @@ class M { // ✅ Same as `C where .T = M and .U = M.Me`, which is // the same as `C where .T = M and .U = M`. fn F[A:! C where .T = M and .U = .T.Me]() {} -// ❌ No member `Me` in `A.T:! Type`. +// ❌ No member `Me` in `A.T:! type`. fn F[A:! C where .U = .T.Me and .T = M]() {} ``` @@ -345,8 +351,8 @@ constraint that has rewrites. ``` interface A { - let T:! Type; - let U:! Type; + let T:! type; + let U:! type; } interface B { extends A where .T = .U and .U = i32; @@ -369,8 +375,8 @@ are equivalent to `==` constraints. ``` interface A { - let T:! Type; - let U:! Type; + let T:! type; + let U:! type; } constraint C { extends A where .T = .U and .U = i32; @@ -398,34 +404,35 @@ fn F(T:! B) -> T.(A.T) { return n; } ### Constraint resolution -When a constraint is applied to a type, the constraint is _resolved_. This -happens: +When a facet type is used as the declared type of a type `T`, the constraints +that were specified within that facet type are _resolved_ to determine the +constraints that apply to `T`. This happens: - When the constraint is used explicitly, when declaring a generic parameter or associated constant of the form `T:! Constraint`. - When declaring an `impl` such as `impl T as Constraint`. -In each case, the following steps are performed to resolve the constraint value -into a set of constraints on `T`: +In each case, the following steps are performed to resolve the facet type's +abstract constraints into a set of constraints on `T`: - If multiple rewrites are specified for the same associated constant, they are required to be identical, and duplicates are discarded. - Rewrites are performed on other rewrites in order to find a fixed point, where no rewrite applies within any other rewrite. If no fixed point exists, the generic parameter declaration or `impl` declaration is invalid. -- Rewrites are performed throughout the other constraints in the constraint - value -- that is, in any `==` constraints and `is` constraints, and the type +- Rewrites are performed throughout the other constraints in the facet type -- + that is, in any `==` constraints and `is` constraints -- and the type `.Self` is replaced by `T` throughout the constraint. ``` // ✅ `.T` in `.U = .T` is rewritten to `i32` when initially -// forming the constraint. +// forming the facet type. // Nothing to do during constraint resolution. fn InOrder[A:! C where .T = i32 and .U = .T]() {} -// ✅ Constraint has `.T = .U` before constraint resolution. +// ✅ Facet type has `.T = .U` before constraint resolution. // That rewrite is resolved to `.T = i32`. fn Reordered[A:! C where .T = .U and .U = i32]() {} -// ✅ Constraint has `.U = .T` before constraint resolution. +// ✅ Facet type has `.U = .T` before constraint resolution. // That rewrite is resolved to `.U = i32`. fn ReorderedIndirect[A:! (C where .T = i32) & (C where .U = .T)]() {} // ❌ Constraint resolution fails because @@ -453,29 +460,29 @@ fn IndirectCycle[A:! C where .T = .U and .U = .V* and .V = .U*](); ``` After constraint resolution, no references to rewritten associated constants -remain in the constraint. +remain in the constraints on `T`. -If a constraint is never used, it is never subject to constraint resolution, and -it is possible for a constraint to be formed for which constraint resolution -would always fail. For example: +If a facet type is never used to constrain a type, it is never subject to +constraint resolution, and it is possible for a facet type to be formed for +which constraint resolution would always fail. For example: ``` package Broken api; interface X { - let T:! Type; - let U:! Type; + let T:! type; + let U:! type; } let Bad:! auto = (X where .T = .U) & (X where .U = .T); // Bad is not used here. ``` -In such cases, the constraint `Broken.Bad` is not usable: any attempt to use -that constraint to constrain a type would perform constraint resolution, which +In such cases, the facet type `Broken.Bad` is not usable: any attempt to use +that facet type to constrain a type would perform constraint resolution, which would always fail because it would discover a cycle between the rewrites for `.T` and for `.U`. In order to ensure that such cases are diagnosed, a trial -constraint resolution is performed for all constraints. Note that this trial -resolution can be skipped for constraints that are actually used, which is the +constraint resolution is performed for all facet types. Note that this trial +resolution can be skipped for facet types that are actually used, which is the common case. ### Precise rules and termination @@ -506,7 +513,7 @@ apply only one rewrite in each of the above cases, satisfying property 2. Qualified name lookup into either a type parameter or into an expression whose type is a symbolic type `T` -- either a type parameter or an associated type -- -considers names from the constraint `C` on `T`, that is, from `T`’s declared +considers names from the facet type `C` of `T`, that is, from `T`’s declared type. ``` @@ -539,7 +546,7 @@ declared type. interface SelfIface { fn Get[me: Self]() -> Self; } -class UsesSelf(T:! Type) { +class UsesSelf(T:! type) { // Equivalent to `fn Make() -> UsesSelf(T)*;` fn Make() -> Self*; impl as SelfIface; @@ -557,21 +564,21 @@ let y: UsesSelf(i32) = x->Get(); None of this is new in this proposal. This proposal adds the rule: -If a constraint `C` into which lookup is performed includes a `where` clause +If a facet type `C` into which lookup is performed includes a `where` clause saying `.N = U`, and the result of qualified name lookup is the associated constant `N`, that result is replaced by `U`, and the type of the result is the type of `U`. No substitution is performed in this step, not even a `Self` substitution -- any necessary substitutions were already performed when forming -the constraint `C`, and we don’t consider either the declared type or value of -the associated constraint at all for this kind of constraint. Going through an +the facet type `C`, and we don’t consider either the declared type or value of +the associated constant at all for this kind of constraint. Going through an example in detail: ``` interface A { - let T:! Type; + let T:! type; } interface B { - let U:! Type; + let U:! type; // More explicitly, this is of type `A where .(A.T) = Self.(B.U)` let V:! A where .T = U; } @@ -585,7 +592,7 @@ fn F[T:! B](x: T) { // The type of `u` is `A where .(A.T) = T.(B.U)`. // Lookup for `u.T` resolves it to `u.(A.T)`. // So the result of the qualified member access is `T.(B.U)`, - // and the type of `v` is the type of `T.(B.U)`, namely `Type`. + // and the type of `v` is the type of `T.(B.U)`, namely `type`. // No substitution is performed in this step. let v:! auto = u.T; } @@ -628,7 +635,7 @@ generic, and do not do it again: ``` interface IfaceHasX { - let X:! Type; + let X:! type; } class ClassHasX { class X {} @@ -656,7 +663,7 @@ value. It’s important that we perform this resolution: ``` interface A { - let T:! Type; + let T:! type; } class K { fn Member(); } fn H[U:! A](x: U) -> U.T; @@ -682,9 +689,9 @@ converting `V` to the type `A` of `U` discards the rewrite constraint. In order to apply rewrites during substitution, we associate a set of rewrites with each value produced by the recursive rebuilding procedure. This is somewhat -like having substitution track a refined type-of-type for each value, but we -don’t need -- or want -- for the type to change during this process, only for -the rewrites to be properly applied. For a substituted binding, this set of +like having substitution track a refined facet type for the type of each value, +but we don’t need -- or want -- for the type to change during this process, only +for the rewrites to be properly applied. For a substituted binding, this set of rewrites is the rewrites found on the type of the corresponding value prior to conversion to the type of the binding. When rebuilding a member access expression, if we have a rewrite corresponding to the accessed member, then the @@ -699,10 +706,10 @@ Continuing an example from [qualified name lookup](#qualified-name-lookup): ``` interface A { - let T:! Type; + let T:! type; } interface B { - let U:! Type; + let U:! type; let V:! A where .T = U; } @@ -718,7 +725,7 @@ fn F2[T:! B where .U = i32](x: T) { // The type of `u` is `A where .(A.T) = i32`. // Lookup for `u.T` resolves it to `u.(A.T)`. // So the result of the qualified member access is `i32`, - // and the type of `v` is the type of `i32`, namely `Type`. + // and the type of `v` is the type of `i32`, namely `type`. // No substitution is performed in this step. let v:! auto = u.T; } @@ -728,13 +735,13 @@ fn F2[T:! B where .U = i32](x: T) { ``` interface Container { - let Element:! Type; + let Element:! type; } interface SliceableContainer { extends Container; let Slice:! Container where .Element = Self.(Container.Element); } -// ❌ Qualified name lookup rewrites this constraint to +// ❌ Qualified name lookup rewrites this facet type to // `SliceableContainer where .(Container.Element) = .Self.(Container.Element)`. // Constraint resolution rejects this because this rewrite forms a cycle. fn Bad[T:! SliceableContainer where .Element = .Slice.Element](x: T.Element) {} @@ -742,10 +749,10 @@ fn Bad[T:! SliceableContainer where .Element = .Slice.Element](x: T.Element) {} ``` interface Helper { - let D:! Type; + let D:! type; } interface Example { - let B:! Type; + let B:! type; let C:! Helper where .D = B; } // ✅ `where .D = ...` by itself is fine. @@ -835,7 +842,7 @@ expressions are treated as distinct types when type-checking a generic that refers to them. Same-type constraints are brought into scope, looked up, and resolved exactly as -if there were a `SameAs(U:! Type)` interface and a `T == U` impl corresponded to +if there were a `SameAs(U:! type)` interface and a `T == U` impl corresponded to `T is SameAs(U)`, except that `==` is commutative. As such, it's not possible to ask for a list of types that are the same as a given type, nor to ask whether there exists a type that is the same as a given type and has some property. But @@ -847,21 +854,21 @@ accomplished by the use of `==` constraints in an `impl`, such as in the built-in implementation of `ImplicitAs`: ``` -final impl forall [T:! Type, U:! Type where .Self == T] T as ImplicitAs(U) { +final impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) { fn Convert[me: Self](other: U) -> U { ... } } ``` It superficially seems like it would be convenient if such implementations were made available implicitly – for example, by writing -`impl forall [T:! Type] T as ImplicitAs(T)` – but in more complex examples that +`impl forall [T:! type] T as ImplicitAs(T)` – but in more complex examples that turns out to be problematic. Consider: ``` -interface CommonTypeWith(U:! Type) { - let Result:! Type; +interface CommonTypeWith(U:! type) { + let Result:! type; } -final impl forall [T:! Type] T as CommonTypeWith(T) where .Result = T {} +final impl forall [T:! type] T as CommonTypeWith(T) where .Result = T {} fn F[T:! Potato, U:! Hashable where .Self == T](x: T, y: U) -> auto { // What is T.CommonTypeWith(U).Result? Is it T or U? @@ -882,17 +889,17 @@ between `C where .A = X` and `C where .A == X`: ``` interface EqualConverter { - let T:! Type; + let T:! type; fn Convert(t: T) -> Self; } -fn EqualConvert[T:! Type](t: T, U:! EqualConverter where .T = T) -> U { +fn EqualConvert[T:! type](t: T, U:! EqualConverter where .T = T) -> U { return U.Convert(t); } -impl forall [U:! Type] U as EqualConverter where .T = U { +impl forall [U:! type] U as EqualConverter where .T = U { fn Convert(u: U) -> U { return u; } } -impl forall [T:! Type, U:! Type where .Self == T] T as ImplicitAs(U) { +impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) { fn Convert[self: Self]() -> U { return EqualConvert(self, U); } } ``` @@ -1127,8 +1134,8 @@ before the associated constant is used. For example: ``` interface C { - let T:! Type; - let U:! Type; + let T:! type; + let U:! type; } // Could reject, `.U` referenced before rewrite is defined. fn G[A:! C where .T = .U and .U = i32]() {} @@ -1171,8 +1178,8 @@ Specifically, given ``` interface C { - let T:! Type; - let U:! Type; + let T:! type; + let U:! type; } ``` @@ -1203,7 +1210,7 @@ constraint in an interface or named constraint: ``` interface A { - let T:! Type; + let T:! type; } constraint AInt { extends A where .T = i32; @@ -1263,7 +1270,7 @@ argument. If rewrites from `impl T as` constraints were used, it would be possible to form a rewrite cycle: ``` -interface C { let X:! Type; } +interface C { let X:! type; } interface A { let U:! C; let T:! C; From b1b0b193de600ee5a1f76ffca8eb270fd59eb72d Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 3 Jan 2023 16:18:30 -0800 Subject: [PATCH 17/19] Mention early on that we're restricting the syntax of `where A = B`, not only changing the semantics. --- proposals/p2173.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index dedba16f09052..c3a80545ff331 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -206,12 +206,13 @@ There has been a lot of work in this space. Some previous relevant proposals: ## Proposal Replace the semantics of the existing `where A == B` and `where A = B` -constraints as follows: +constraints and restrict the syntax of `where A = B` as follows: -- `where .A = T` specifies a _rewrite_ constraint. If the associated constant - `.A` is named as a member of a type with this constraint, it is rewritten to - `T`. This means that the interface of the type changes, and the behavior is - in all respects as if `T` had been specified directly. +- `where .A = T` specifies a _rewrite_ constraint. The left-hand side is + required to name an associated constant. If that associated constant `.A` is + named as a member of a type declared with this constraint, it is rewritten + to `T`. This means that the interface of the type changes, and the behavior + is in all respects as if `T` had been specified directly. - `where X == Y` specifies a _same-type_ constraint. This is treated much like `where X is SameAs(Y)`. Here, `SameAs` is a hypothetical constraint that is reflexive (`T is SameAs(T)` for all `T`), commutative (`T is SameAs(U)` if From 33fa914b214f46ae62b01b478de87de804452893 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 3 Jan 2023 16:54:41 -0800 Subject: [PATCH 18/19] Disallow top-level rewrites in `impl as` and also in `is` constraints. --- proposals/p2173.md | 88 +++++++++++++++++++++++++++++++++++++--------- 1 file changed, 72 insertions(+), 16 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index c3a80545ff331..d69eaf94567c5 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -25,7 +25,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Combining constraints with `&`](#combining-constraints-with-) - [Combining constraints with `and`](#combining-constraints-with-and) - [Combining constraints with `extends`](#combining-constraints-with-extends) - - [Combining constraints with `impl as`](#combining-constraints-with-impl-as) + - [Combining constraints with `impl as` and `is`](#combining-constraints-with-impl-as-and-is) - [Constraint resolution](#constraint-resolution) - [Precise rules and termination](#precise-rules-and-termination) - [Qualified name lookup](#qualified-name-lookup) @@ -367,12 +367,14 @@ var n: i32; fn F(T:! B) -> T.(A.T) { return n; } ``` -### Combining constraints with `impl as` +### Combining constraints with `impl as` and `is` -Within an interface or named constraint, the `impl T as C` syntax permits `=` -constraints. However, because these constraints don't change the type of `T`, -rewrite constraints will never result in a rewrite being performed, and instead -are equivalent to `==` constraints. +Within an interface or named constraint, the `impl T as C` syntax does not +permit `=` constraints to be specified directly. However, such constraints can +be specified indirectly, for example if `C` is a named constraint that contains +rewrites. Because these constraints don't change the type of `T`, rewrite +constraints in this context will never result in a rewrite being performed, and +instead are equivalent to `==` constraints. ``` interface A { @@ -386,10 +388,9 @@ constraint D { extends A where .T == .U and .U == i32; } interface B { - // This constraint is equivalent to - // `impl as A where .T == .U and .U == i32;` or - // `impl as C;` or `impl as D;`. - impl as A where .T = .U and .U = i32; + // OK, equivalent to `impl as D;` or + // `impl as A where .T == .U and .U == i32;`. + impl as C; } var n: i32; @@ -403,6 +404,52 @@ var n: i32; fn F(T:! B) -> T.(A.T) { return n; } ``` +A purely syntactic check is used to determine if an `=` is specified directly in +an expression: + +- An `=` constraint is specified directly in its enclosing `where` expression. +- If an `=` constraint is specified directly in an operand of an `&` or + `(`...`)`, then it is specified directly in that enclosing expression. + +In an `impl as C` or `impl T as C` declaration in an interface or named +constraint, `C` is not allowed to directly specify any `=` constraints: + +``` +// Compile-time identity function. +fn Identity[T:! type](x:! T) -> T { return x; } + +interface E { + // ❌ Rewrite constraint specified directly. + impl as A where .T = .U and .U = i32; + // ❌ Rewrite constraint specified directly. + impl as type & (A where .T = .U and .U = i32); + // ✅ Not specified directly, but does not result + // in any rewrites being performed. + impl as Identity(A where .T = .U and .U = i32); +} +``` + +The same rules apply to `is` constraints. Note that `.T == U` constraints are +also not allowed in this context, because the reference to `.T` is rewritten to +`.Self.T`, and `.Self` is ambiguous. + +``` +// ❌ Rewrite constraint specified directly in `is`. +fn F[T:! A where .U is (A where .T = i32)](); + +// ❌ Reference to `.U` in same-type constraint is ambiguous: +// does this mean the outer or inner `.Self.U`? +fn G[T:! A where .U is (A where .T == i32)](); + +// ✅ Not specified directly, but does not result +// in any rewrites being performed. Return type +// is not rewritten to `i32`. +fn H[T:! type where T is C]() -> T.(A.U); + +// ✅ Return type is rewritten to `i32`. +fn I[T:! C]() -> T.(A.U); +``` + ### Constraint resolution When a facet type is used as the declared type of a type `T`, the constraints @@ -1206,8 +1253,8 @@ The same rule applies to `is` constraints: we accept ### Rewrite constraints in `impl as` constraints -A rewrite constraint can appear, directly or indirectly, in an `impl as` -constraint in an interface or named constraint: +A rewrite constraint can appear indirectly in an `impl as` constraint in an +interface or named constraint: ``` interface A { @@ -1248,15 +1295,18 @@ fn G(T:! C) -> T.T; ``` Something similar happens with `impl T as` constraints. The following two -constructs are not equivalent: +interfaces `A1` and `A2` are not equivalent: ``` -interface A { +interface A1 { let X:! Add where .Result = i32; } -interface A { +constraint AddToi32 { + extends Add where .Result = i32; +} +interface A2 { let X:! Add; - impl X as Add where .Result = i32; + impl X as AddToi32; } ``` @@ -1297,3 +1347,9 @@ from an `impl T as` constraint, so this proposal does not adopt that behavior. We could allow `impl T as` to introduce rewrites for `T` in general, but we would need to find some way to fix the resulting holes in the termination argument. + +To minimize the scope of confusion, we currently disallow rewrites from +appearing _syntactically_ within an `impl as` or `impl T as` constraint. We +could allow these constructs. However, a `.T = V` constraint would be equivalent +to a `.T == V` constraint, and the latter more clearly expresses the resulting +semantics, so disallowing the misleading form seems preferable. From f853310c6e392d3bccaebed6510a314861bfee0f Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 31 Jan 2023 14:27:17 -0800 Subject: [PATCH 19/19] Address review comments. --- proposals/p2173.md | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/proposals/p2173.md b/proposals/p2173.md index d69eaf94567c5..f8f780a8be8a3 100644 --- a/proposals/p2173.md +++ b/proposals/p2173.md @@ -437,14 +437,14 @@ also not allowed in this context, because the reference to `.T` is rewritten to // ❌ Rewrite constraint specified directly in `is`. fn F[T:! A where .U is (A where .T = i32)](); -// ❌ Reference to `.U` in same-type constraint is ambiguous: -// does this mean the outer or inner `.Self.U`? +// ❌ Reference to `.T` in same-type constraint is ambiguous: +// does this mean the outer or inner `.Self.T`? fn G[T:! A where .U is (A where .T == i32)](); // ✅ Not specified directly, but does not result // in any rewrites being performed. Return type // is not rewritten to `i32`. -fn H[T:! type where T is C]() -> T.(A.U); +fn H[T:! type where .Self is C]() -> T.(A.U); // ✅ Return type is rewritten to `i32`. fn I[T:! C]() -> T.(A.U); @@ -458,7 +458,10 @@ constraints that apply to `T`. This happens: - When the constraint is used explicitly, when declaring a generic parameter or associated constant of the form `T:! Constraint`. -- When declaring an `impl` such as `impl T as Constraint`. +- When declaring that a type implements a constraint with an `impl` + declaration, such as `impl T as Constraint`. Note that this does not include + `impl ... as` constraints appearing in `interface` or `constraint` + declarations. In each case, the following steps are performed to resolve the facet type's abstract constraints into a set of constraints on `T`: @@ -874,7 +877,7 @@ Each of the above steps performs at most one rewrite, and doesn't introduce any new recursive type-checking steps, so should not introduce any new forms of non-termination. Rewrite constraints thereby give us a deterministic, terminating type canonicalization mechanism for associated constants: in `A.B`, -if the type of `A` specifies that `.B == C`, then `A.B` is replaced by `C`. +if the type of `A` specifies that `.B = C`, then `A.B` is replaced by `C`. Equality of types constrained in this way is transitive. However, some existing forms of non-termination may remain, such as template @@ -1088,8 +1091,10 @@ the approach taken by some other modern languages: rule cannot be readily understood by a developer nor effectively worked around. - Rust also has an - [undecidable type system (content warning: swear words)](https://sdleffler.github.io/RustTypeSystemTuringComplete/) - for the same reason, and similarly has a recursion limit. + [undecidable type system](https://sdleffler.github.io/RustTypeSystemTuringComplete/) + (content warning: contains many instances of an obscene word as part of a + programming language name) for the same reason, and similarly has a + recursion limit. See also [Typing is Hard](https://3fx.ch/typing-is-hard.html), which lists similar information for a variety of other languages. Note that most languages