struct vs. class #651

Closed
josh11b opened this issue Jul 14, 2021 · 12 comments
Labels
leads question A question for the leads team

Comments

@josh11b
Contributor

josh11b commented Jul 14, 2021

We have agreed that we want a single introducer keyword for record types. There are two choices on the table: `struct` and `class`.

  • `struct` better reflects our intended default access control of public for all members
  • `class` is a less awkward word to say and better reflects that it supports encapsulation, inheritance, etc.

Note that there is a precedent in a few languages to use the phrase "data class" (Kotlin, Python) in cases that match traditional C structs with just public data members.

@geoffromer
Contributor

Elaborating a bit on the rationale for class: for a lot of C++ programmers, I think it will be a pretty big adjustment to stop talking about "classes", probably much more so than to stop talking about "structs". Furthermore, I feel like "class" gives us better options for replacing the missing vocabulary: "data class" feels pretty natural and self-explanatory, whereas something like "encapsulated struct" feels much less so.

@josh11b
Contributor Author

josh11b commented Jul 15, 2021

Marked this issue as blocking due to struct proposal #561 and issue #653.

@zygoloid
Contributor

I think we have consensus that we want to use class for declarations of named structure types. #561 also has anonymous struct types (which are type expressions rather than type declarations) and struct literals; what would we call those and how would we write them? There are a few options here:

  1. These are "anonymous classes". The type `class { .x: Int, .y: Int }` is an "anonymous class type", the value `{.x = 3, .y = 4}` is an "anonymous class literal".
  2. These are "anonymous structs" (even though there is no other kind of struct). The type `struct { .x: Int, .y: Int }` is an "anonymous struct type", the value `{.x = 3, .y = 4}` is an "anonymous struct literal".
  3. These are simply "structs". The type `struct { .x: Int, .y: Int }` is a "struct type", the value `{.x = 3, .y = 4}` is a "struct literal".

The first option seems like the most obvious one given the direction in this issue. However, given the differences we appear to want between the named form and the anonymous form (nominal versus structural typing, base classes and type hierarchies versus no inheritance, exact-match semantics for assignment versus field reordering, `var field: Type;` syntax versus `field: Type,` syntax, ...) I do wonder if calling them both "class" is more confusing than helpful. So I'm somewhat leaning towards option 3: we have (named, always) classes (that support inheritance and access control and members and such), and (anonymous, always) structs (that use a structural typing rule and represent a simple mapping from field names to values).
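As a rough C++ analogy for the nominal-versus-structural distinction listed above, the sketch below may help. It is not Carbon and not part of the proposal: `PointA`, `PointB`, `PairHere`, and `PairThere` are hypothetical names, and `std::tuple` only approximates structural typing because its components are unnamed.

```cpp
#include <tuple>
#include <type_traits>

// Hypothetical names, for illustration only.
// Nominal typing: two named aggregates with identical fields stay distinct.
struct PointA { int x; int y; };
struct PointB { int x; int y; };

// Structural-style identity: a std::tuple type is determined by its component
// types alone, so every spelling of std::tuple<int, int> names the same type.
using PairHere = std::tuple<int, int>;
using PairThere = std::tuple<int, int>;

static_assert(!std::is_same_v<PointA, PointB>,
              "same fields, different names: distinct types");
static_assert(!std::is_convertible_v<PointA, PointB>,
              "no implicit conversion between the two nominal types");
static_assert(std::is_same_v<PairHere, PairThere>,
              "same structure: same type");
```

Under this analogy, the named form in option 3 corresponds to `PointA`/`PointB`, and the anonymous form to a type whose identity depends only on its fields.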

@chandlerc
Contributor

I'm ok with this direction... It also seems super low cost to change if we do discover some reason to want to call these anonymous classes.

One minor terminology question -- are these struct types (a particular kind of) classes?

If not, what is the word for the union of anonymous struct types and classes?

@zygoloid
Contributor

> One minor terminology question -- are these struct types (a particular kind of) classes?
>
> If not, what is the word for the union of anonymous struct types and classes?

I think the clearest stance will probably be that struct types are not class types and class types are not struct types, given that there seem to be quite a lot of differences between the two. Perhaps "record type" is a good name for a struct or class type?

@chandlerc
Contributor

> > One minor terminology question -- are these struct types (a particular kind of) classes?
> >
> > If not, what is the word for the union of anonymous struct types and classes?
>
> I think the clearest stance will probably be that struct types are not class types and class types are not struct types, given that there seem to be quite a lot of differences between the two. Perhaps "record type" is a good name for a struct or class type?

That has not proven intuitive for C++ users...

Are the type differences observable here? While not all classes have the properties of anonymous structs, those properties do seem to be available for classes.

I'd be a bit more comfortable if this was purely to distinguish the (quite different) syntax, but at the end of the day we don't have two different kinds of types.

@chandlerc
Contributor

I wrote this up in Discord, but it occurs to me I probably should put it somewhere more easily found so copying here....

As a concrete way to see what I'm trying to get at in my prior comment around what type differences are observable, let's look at an example:

fn F[template T:! Type](x: T);

var anon: struct {.a: Int, .b: Int} = {.a = 1, .b = 2};
class X final { var a: Int; var b: Int; }
class X2 final { var a: Int; var b: Int; }
var named: X = {.a = 1, .b = 2};
var named2: X2 = {.a = 1, .b = 2};
F(anon);
F(named);
F(named2);

I wouldn't expect F(anon) above to have its T differ along more dimensions from F(named) than F(named) is from F(named2).

So, anonymous types are separate types from nominal ones, and distinguished through structural matching, yes. But once you have a particular type, I would expect it to behave like a (data) class type.
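A loose C++ rendering of the same point, offered only as a sketch: `SumFields` and `SumAllThree` are hypothetical helpers, `X` and `X2` mirror the Carbon classes above, and C++ unnamed structs are not structurally typed, so only the "behaves like a data class once you have the type" part carries over.

```cpp
#include <type_traits>

// Hypothetical C++ counterparts of X and X2 from the Carbon example.
struct X final { int a; int b; };
struct X2 final { int a; int b; };

// Generic code only observes that T is an aggregate class with fields a and b;
// it cannot tell whether T was declared with a name or without one.
template <typename T>
constexpr int SumFields(const T& v) {
  static_assert(std::is_class_v<T> && std::is_aggregate_v<T>,
                "T is expected to be a plain data class");
  return v.a + v.b;
}

constexpr int SumAllThree() {
  struct { int a; int b; } anon{1, 2};  // unnamed struct type
  X named{1, 2};
  X2 named2{1, 2};
  // Three distinct instantiations of SumFields, all with the same behavior.
  return SumFields(anon) + SumFields(named) + SumFields(named2);
}

static_assert(!std::is_same_v<X, X2>, "named types remain distinct");
```

Here the three values have three different types, yet `T` differs no more between the unnamed and named cases than between the two named ones.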

If that isn't the case, I'd like to understand why... because I think differences here will make it much harder to start with an anonymous type and later turn it into a nominal type. And I would expect that to be a reasonably common thing to do during evolution of an API. So from a behavioral and capability perspective, it seems really desirable for anonymous structs to be a refinement of named class types.

None of this is arguing that we should try to make the anonymous struct expression (`struct {.a: Int, .b: Int}` above) a refinement of a class declaration -- totally happy with those just being disjoint things as they really serve totally different use cases etc. Even happy with them having different introducers given the sharp differences elsewhere in their syntax.

@zygoloid
Contributor

> If that isn't the case, I'd like to understand why... because I think differences here will make it much harder to start with an anonymous type and later turn it into a nominal type.

I would expect anonymous struct types to have strictly fewer capabilities than full-fledged class types, both in terms of what they themselves can do (such as having private members or base classes) and in terms of what client code can do with them (such as deriving from them or implementing someone else's interface for them).

I agree that it's important that anonymous struct types don't meaningfully have capabilities that named classes lack, precisely because this would make it hard to convert an anonymous struct to a named class.

Ideally I think the nomenclature question should be based around what's most useful for Carbon developers' day-to-day needs. My guess would be that people will want different words for a type that represents an abstraction and provides encapsulation, invariants, and an interface of member functions and for a type that is a collection of named fields with no further constraints, but I think they'll also want a word for a type that could be either (or could be some other kind of type that provides named field access). I'm not sure though -- maybe the more common need is for a word that refers to a type that could be either anonymous struct or named class. How do we find out?

@chandlerc
Contributor

> > If that isn't the case, I'd like to understand why... because I think differences here will make it much harder to start with an anonymous type and later turn it into a nominal type.
>
> I would expect anonymous struct types to have strictly fewer capabilities than full-fledged class types, both in terms of what they themselves can do (such as having private members or base classes) and in terms of what client code can do with them (such as deriving from them or implementing someone else's interface for them).
>
> I agree that it's important that anonymous struct types don't meaningfully have capabilities that named classes lack, precisely because this would make it hard to convert an anonymous struct to a named class.

Yep.

> Ideally I think the nomenclature question should be based around what's most useful for Carbon developers' day-to-day needs. My guess would be that people will want different words for a type that represents an abstraction and provides encapsulation, invariants, and an interface of member functions and for a type that is a collection of named fields with no further constraints, but I think they'll also want a word for a type that could be either (or could be some other kind of type that provides named field access). I'm not sure though -- maybe the more common need is for a word that refers to a type that could be either anonymous struct or named class. How do we find out?

I think we have to make an educated guess until we have users. But I think our best strategy for that guess is to follow C++ here as closely as we can without diverging from other languages too much.

My impression is that in C++ the most common case is the last one in your list -- a type that could be either an anonymous struct or a named class. This is what std::is_class identifies (despite its oddities around unions): https://compiler-explorer.com/z/6arh99n9G. This seems to match other languages as well.
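For readers without the linked example at hand, here is an independent sketch of what `std::is_class` reports (this is not the contents of the Compiler Explorer link; `NamedStruct`, `NamedClass`, `NamedUnion`, `IsClassType`, and `UnnamedStructIsClass` are hypothetical names):

```cpp
#include <type_traits>

struct NamedStruct { int a; int b; };  // declared with `struct`
class NamedClass {                     // declared with `class`
 public:
  int Get() const { return hidden_; }
 private:
  int hidden_ = 0;
};
union NamedUnion { int i; float f; };  // unions are reported separately

// Deduces the type of its argument so that unnamed types can be queried too.
template <typename T>
constexpr bool IsClassType(const T&) {
  return std::is_class_v<T>;
}

constexpr bool UnnamedStructIsClass() {
  struct { int a; int b; } v{1, 2};  // struct type with no name at all
  return IsClassType(v);
}

static_assert(std::is_class_v<NamedStruct>);
static_assert(std::is_class_v<NamedClass>);
static_assert(UnnamedStructIsClass());
static_assert(!std::is_class_v<NamedUnion> && std::is_union_v<NamedUnion>);
static_assert(!std::is_class_v<int>);
```

The trait is true for `struct`, `class`, and unnamed struct types alike, and false for unions, which appears to be the oddity mentioned above.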

So my guess is that the least confusing thing is to call the entire space here "class types" and then develop vocabulary for the refinements.

I think using "class type" as encompassing all of these also results in pretty good hits in general documentation (Wikipedia "Class") where "class type" seems to be a reasonably comprehensive term. On the flip side, the documentation I find about "record type" (Wikipedia "Record") wouldn't obviously apply to the combination of anonymous structs and named classes.

@josh11b
Contributor Author

josh11b commented Jul 30, 2021

For the terminology point, the thing I've heard, which might be consensus:

We could call these "structural data class literals" and "structural data classes" but "struct literals" and "struct types" for short. This is in contrast to "nominal data classes" which act the same way for purposes of implementing interfaces, but are declared with a different syntax and use nominal type equality.

@chandlerc
Contributor

Just to be explicit, I'm very happy with the outcome here, both from a terminology perspective and that it settles us on class. I've also chatted about this with both the other leads and so I think we have consensus around this.

@chandlerc
Contributor

(and actually closing since we have consensus here)

josh11b added a commit that referenced this issue Aug 9, 2021
…ork (#561)

This proposal defines the very basics of `class` types, primarily focused on:

-   use cases including: data classes, encapsulated types, inheritance with and without `virtual`, interfaces as base classes, and mixins for code reuse;
-   anonymous data types called _structural data classes_ or _struct types_. Struct literals are used to initialize class values and ad-hoc parameter and return types with named components; and
-   future work, including the provisional syntax already in use for features that have not been decided.

The intent is to both make some small incremental progress and get agreement on direction. As such it doesn't include things like nominal types, methods, access control, inheritance, etc.

It proposes this struct type and literal syntax:
```
var p: {.x: Int, .y: Int} = {.x = 0, .y = 1};
```
Note that it uses commas (`,`) between fields instead of semicolons (`;`), and no introducer for types or literal values.

Incorporates decisions from #665, #653, #651


Co-authored-by: Geoff Romer <[email protected]>
Co-authored-by: Chandler Carruth <[email protected]>
jonmeow added a commit that referenced this issue Aug 19, 2021
chandlerc added a commit that referenced this issue Jun 28, 2022
chandlerc pushed a commit that referenced this issue Jun 28, 2022
@jonmeow jonmeow added the leads question A question for the leads team label Aug 10, 2022
5 participants