Carbon Copy Newsletter No.3 #4068

wolffg (Maintainer) · 2024-06-20T21:03:54Z

Carbon Copy, June 2024

Here is the new Carbon Copy, your periodic update on the Carbon language!

Carbon Copy is designed for those who want a high-level view of what's happening on the project. If you'd like to subscribe, you can join [email protected]. Carbon Copy should arrive roughly every other month.

Toolchain progress

We've made significant progress on our toolchain, particularly in these areas:

import
extern
generic types
fast hash table
linking

As a test, we can now compile and execute a program like this prime sieve example, if we restrict ourselves to language features that are already implemented in the toolchain:

class Sieve {
  fn Make() -> Sieve {
    returned var s: Sieve;

    var n: i32 = 0;
    while (n < 1000) {
      s.is_prime[n] = true;
      ++n;
    }

    return var;
  }

  fn MarkMultiplesNotPrime[addr self: Self*](p: i32) {
    var n: i32 = 2 * p;
    while (n < 1000) {
      self->is_prime[n] = false;
      n += p;
    }
  }

  var is_prime: [bool; 1000];
}

fn Run() -> i32 {
  var s: Sieve = Sieve.Make();

  var number_of_primes: i32 = 0;
  var n: i32 = 2;
  while (n < 1000) {
    if (s.is_prime[n]) {
      ++number_of_primes;
      s.MarkMultiplesNotPrime(n);
    }
    ++n;
  }

  return number_of_primes;
}

The program prints nothing itself; the prime count is returned as the process exit status:

$ bazel-bin/examples/sieve
$ echo $?
168

...however, the toolchain is still in early development. It is not yet tested or documented, and is guaranteed not to work on many platforms. Please check back in future newsletters for more progress updates!

Carbon at Conferences

Recently

April 11, 2024: "Carbon: An experiment in different tradeoffs" panel session at EuroLLVM 2024
- Alex Bradbury's notes on the session
May 2, 2024: "Generic Arity: Definition-Checked Variadics in Carbon" at C++Now

Upcoming

July 21-24, 2024: "How designing Carbon with C++ interop taught me about C++ variadics and overloads", CppNorth
Sept. 11, 2024: "The Carbon Language: Road to 0.1", NDC {TechTown}

Recent proposals

In progress since last newsletter, now approved & merged:

SemIR fidelity when representing rewrite semantics #3833
Matching redeclarations #3763
Raw identifier syntax #3797
Member binding operators #3720

New since last newsletter, now approved & merged:

Exporting imported names #3938
More consistent package syntax #3927

New, in progress:

extend api #3802
Lambdas #3848
Singular extern declarations #3980

Spotlight: Expression categories

Expressions are the portions of Carbon syntax that produce values. There are three different categories of expressions:

value expressions, which are read-only and cannot have their address taken,
reference expressions, which can be read or written or have their address taken, and
initializing expressions, which initialize objects.

These three categories are a small set of composable concepts that allow a simpler, faster compiler with efficient calling conventions (even in situations where types are generic). Hopefully, such an abbreviated set of expression categories will also be easy to learn and apply.

Let's take a look at all three.

Value expressions

Value expressions produce abstract values that cannot be modified or have their address taken. They can be formed in two ways: by writing a literal expression like 42, or by reading the value of some stored object.

// A simple example: `15 + 3` is a value expression
let a_value: i32 = 15 + 3;
// `a_value` is a value expression.
Print("a_value is {0}\n", a_value);

// Define a class
class AnObjectClass {

  // `ChangeMe` is a mutating method, and needs an object with an address.
  fn ChangeMe[addr self: Self*]() {
    self->counter += 1;
  }

  // `ReadMe` takes its `self` parameter by value.
  fn ReadMe[self: Self]() -> i32 {
    return self.counter;
  }

  var counter: i32 = 0;
}

// `AddSomething` cannot affect any of its arguments.
fn AddSomething(an_int: i32, some_value: AnObjectClass) -> i32 {
  // Both `an_int` and `some_value` are values.

  // ❌ Invalid, can't mutate or take address of a value:
  //    some_value.ChangeMe();

  // ✅ Allowed:
  return an_int + some_value.ReadMe();
}

Values allow efficient passing of arguments into functions. They give the compiler a lot of flexibility to use an efficient calling convention since the values may not be modified and do not have addresses. They provide a single model that can get both the efficiency of passing by copy for small types (such as those that fit into one or two machine registers) and also the efficiency of minimal copies when working with types where copies are not viable or have significant costs.

The compiler typically uses one of three representation approaches for values:

copying, appropriate for small objects, like integers
const pointer, appropriate for large objects or for objects that can't be safely copied because their type is a base class or they are dynamically sized
customized, appropriate for containers, where a custom type is used to get a read-only view of the contents

"Customized" is the hardest to imagine; think of a C++ std::string_view that is a read-only view of the contents of a string, or a std::span<const T> that gives a read-only view of a contiguous container.

By using different value representations for different types, we get efficiency even in generic code. In C++, you might use a const reference for a parameter with a templated type, because it works with every type, even though that isn't the most efficient choice for integer arguments. In Carbon, passing by value uses a calling convention appropriate to the type. (Footnote: For more, you can read Foonathan's post on the prospects for efficiency.)
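
As a rough C++ analogy (this is not Carbon code, and not a description of how the Carbon toolchain is implemented), the sketch below picks a parameter representation per type by hand. The `ValueRep` alias, the "two pointers" size cutoff, and the helper names are all assumptions made for illustration; in Carbon, passing by value is meant to make this choice for you.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper: how a read-only parameter of type T is passed.
// The cutoff of two pointers' worth of bytes is an assumption standing in
// for "fits into one or two machine registers".
template <typename T>
using ValueRep = std::conditional_t<
    std::is_trivially_copyable_v<T> && sizeof(T) <= 2 * sizeof(void*),
    T,          // "copying" representation
    const T&>;  // "const pointer" representation

template <typename T>
constexpr auto IsPassedByCopy() -> bool {
  return !std::is_reference_v<ValueRep<T>>;
}

// Generic code written once; the parameter representation follows the type.
// T appears in a non-deduced context, so callers name it explicitly.
template <typename T>
auto Twice(ValueRep<T> x) -> T {
  return x + x;
}

// "Customized" representation: a read-only view instead of the container.
auto CountSpaces(std::string_view text) -> int {
  int count = 0;
  for (char c : text) {
    if (c == ' ') ++count;
  }
  return count;
}

// Small self-check of the sketch above.
void CheckValueReps() {
  static_assert(IsPassedByCopy<int>());
  static_assert(!IsPassedByCopy<std::string>());
  assert(Twice<int>(21) == 42);
  assert(CountSpaces("a b c") == 2);
}
```

Calling `Twice<int>(21)` passes the integer by copy, while `Twice<std::string>("ab")` binds a const reference; the body of `Twice` is the same in both cases.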

Reference expressions

Reference expressions refer to objects with storage where a value may be read or written and the object's address can be taken. These are analogous to C++'s lvalues or Rust's place expressions.

var an_int: i32 = 2; 
var some_object: AnObjectClass = {};

// `an_int` and `some_object` are durable reference expressions
an_int += 3;
some_object.ChangeMe();

// Reference expressions have addresses.
var thing: AnObjectClass* = &some_object;

Most reference expressions are durable, which means the object's storage outlives the full expression and the address could be meaningfully propagated out of it as well.
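
For readers coming from C++, the lvalue analogy mentioned above can be sketched as follows. This is C++ rather than Carbon, and `Counter`, `BumpTwice`, and `CountAfterMutations` are hypothetical names chosen for this sketch.

```cpp
#include <cassert>

struct Counter {
  int count = 0;
  void Increment() { ++count; }
};

// Mutates the caller's object through its address.
void BumpTwice(Counter* counter) {
  counter->Increment();
  counter->Increment();
}

auto CountAfterMutations() -> int {
  Counter some_object;            // `some_object` names storage: an lvalue.
  Counter* thing = &some_object;  // Its address can be taken...
  BumpTwice(thing);               // ...and handed to other code,
  assert(thing == &some_object);  // because the storage outlives the call.
  some_object.Increment();        // It can also be mutated directly.
  return some_object.count;
}
```

`CountAfterMutations()` returns 3: two increments made through the pointer and one made directly on the named object.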

Initializing expressions

Initializing expressions do what they sound like; they initialize an object. They require storage to be provided implicitly when evaluating the expression.

Function calls in Carbon are modeled directly as initializing expressions; they require storage as an input and when evaluated cause that storage to be initialized with an object. Consider when a function call is used to initialize some variable pattern:

fn CreateNewObject() -> AnObjectClass {
  return <return-expression>;
}

var x: AnObjectClass = CreateNewObject();

The return value is constructed directly into the caller's storage. This lets Carbon guarantee that returned values are not copied (except when the type is something like an integer, which can be returned in a register) in either of these cases:

when the value is constructed as part of the return statement ("return value optimization" or RVO), or
when using returned var and return var ("named RVO" or NRVO, docs: overview, details), as in this example:

fn MakeCircle(radius: i32) -> Circle {
  returned var c: Circle;
  c.radius = radius;
  // `return c` would be invalid because `returned` is in use.
  return var;
}

This means Carbon types don't need separate constructor functions with their own syntax and rules. Instead, ordinary functions may be used, even for types that can't be copied.

This also means you decide when NRVO applies, rather than depending on the compiler performing the optional "named return value optimization" as in C++.
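
To make the C++ side of that comparison concrete, here is a small C++17 sketch that counts copies and moves. It is not Carbon code; the `Circle` type with its counter and the function names are assumptions for illustration. Since C++17, returning a freshly constructed object is guaranteed not to copy, while returning a named local may or may not be elided.

```cpp
#include <cassert>

struct Circle {
  static inline int copies_or_moves = 0;  // Instrumentation for this sketch.
  int radius = 0;

  Circle() = default;
  Circle(const Circle& other) : radius(other.radius) { ++copies_or_moves; }
  Circle(Circle&& other) noexcept : radius(other.radius) { ++copies_or_moves; }
};

// RVO: the object is constructed as part of the return statement.
auto MakeUnitCircle() -> Circle { return Circle(); }

// NRVO candidate: a named local is returned. A C++ compiler is allowed,
// but not required, to construct `c` directly in the caller's storage.
auto MakeCircle(int radius) -> Circle {
  Circle c;
  c.radius = radius;
  return c;
}

auto CopiesDuringRvo() -> int {
  Circle::copies_or_moves = 0;
  Circle c = MakeUnitCircle();
  (void)c;
  return Circle::copies_or_moves;
}

auto CopiesDuringNrvo() -> int {
  Circle::copies_or_moves = 0;
  Circle c = MakeCircle(3);
  assert(c.radius == 3);
  return Circle::copies_or_moves;
}
```

`CopiesDuringRvo()` is 0 under C++17 rules. `CopiesDuringNrvo()` is 0 when the compiler elides the move and 1 when it does not, which is the uncertainty that `returned var` is designed to remove.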

If you want a function parameter to have its own storage so the function has its own copy of the value it can modify, you can add a var parameter. A var parameter is initialized with an initializing expression, rather than a value expression. Inside the function, the parameter is a reference expression.

fn MyCoolFn(var an_object: AnObjectClass) {
  // We are allowed to mutate `an_object`
  an_object.ChangeMe();

  // Can take the address too.
  var pointer: AnObjectClass* = &an_object;
  MakesChanges(pointer);

  // `an_object` is a reference expression, and
  // so may be assigned to.
  an_object = CreateNewObject();
  *pointer = CreateNewObject();

  // However, `an_object` is a copy, so none of these
  // changes affect the caller.
}

// CreateNewObject() is an initializing expression that is used
// to initialize the `an_object` parameter of `MyCoolFn`.	
MyCoolFn(CreateNewObject());

var caller_object: AnObjectClass = {};
// `caller_object` is a reference expression that is converted
// into an initializing expression to initialize the `an_object`
// parameter of `MyCoolFn`.
MyCoolFn(caller_object);
// caller_object remains unchanged
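
The closest C++ counterpart to a `var` parameter is a non-const by-value parameter. The sketch below is C++, not Carbon, and `AnObject` and `IncrementCopy` are hypothetical names; it only shows that the callee's copy has its own storage.

```cpp
#include <cassert>

struct AnObject {
  int counter = 0;
  void ChangeMe() { ++counter; }
};

// The by-value parameter has its own storage, so the callee may mutate it
// and take its address without affecting the caller's object.
auto IncrementCopy(AnObject an_object) -> int {
  an_object.ChangeMe();
  AnObject* pointer = &an_object;
  pointer->ChangeMe();
  return an_object.counter;
}

auto CallerCounterAfterCall() -> int {
  AnObject caller_object;
  int callee_counter = IncrementCopy(caller_object);
  assert(callee_counter == 2);
  return caller_object.counter;  // Still 0: only the copy was changed.
}
```

`CallerCounterAfterCall()` returns 0, matching the "caller_object remains unchanged" comment in the Carbon example.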

Conversions between categories

Expressions in one category can be converted to any other category when needed.

Value binding forms a value expression from the current value of the object referenced by a reference expression.

var v: AnObjectClass = {};
// `v` is a reference expression

// Value binding: `v` is converted to a value to call `AddSomething` as defined
// above
AddSomething(2, v);

Direct initialization converts a value expression into an initializing expression. This implicitly converts the value to the type of the object being initialized.

// Direct initialization: the value expression `42` becomes an 
// initializing expression
var k: i32 = 42;

This implicit conversion uses the ImplicitAs operator (see proposal & PR), which is a regular function that takes a value and returns an initializing expression.

Copy initialization converts a reference expression into an initializing expression.

// Copy initialization: converting a reference expression into an
// initializing expression
var q: AnObjectClass = v;
// `MyCoolFn` takes an initializing expression since its parameter
// is marked `var`, so this also uses copy initialization.
MyCoolFn(v);

This involves a copy from the storage of the source to the destination.

Temporary materialization converts an initializing expression into a reference expression by providing temporary storage to initialize.

// The initializing expression returned by `CreateNewObject()` is 
// materialized as a temporary that only lives during this statement.
CreateNewObject().ChangeMe();

Temporary materializations produce ephemeral (rather than durable) reference expressions. They still refer to an object with storage, but it may be storage that will not outlive the full expression. Beyond this distinction, there is no restriction on how they can be used.
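
C++ has a similar notion of temporaries that live until the end of the full expression. The following C++ sketch (not Carbon; the `Tracked` type and its `alive` counter are assumptions for illustration) counts live objects to show that the temporary is gone once the statement finishes.

```cpp
#include <cassert>

struct Tracked {
  static inline int alive = 0;  // Number of live Tracked objects.
  int counter = 0;

  Tracked() { ++alive; }
  Tracked(const Tracked& other) : counter(other.counter) { ++alive; }
  ~Tracked() { --alive; }

  auto ChangeMe() -> int { return ++counter; }
};

auto CreateNewObject() -> Tracked { return Tracked(); }

auto AliveAfterTemporaryCall() -> int {
  // The returned object is materialized as a temporary so that the
  // mutating method has storage to work on.
  int result = CreateNewObject().ChangeMe();
  assert(result == 1);
  // The temporary was destroyed at the end of the statement above.
  return Tracked::alive;
}
```

`AliveAfterTemporaryCall()` returns 0: the mutation was allowed, but the storage it touched did not outlive the full expression.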

These conversion steps combine to provide the transitive conversion table:

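| From:           | value                     | reference | initializing       |
|-----------------|---------------------------|-----------|--------------------|
| to value        | ==                        | bind      | materialize + bind |
| to reference    | direct init + materialize | ==        | materialize        |
| to initializing | direct init               | copy init | ==                 |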

Conclusion

We've introduced many core concepts here. However, to write code in Carbon, you really only need to remember two basic rules:

If you use var, it is a reference expression, and you have permission to mutate the object. Every var has its own storage, so mutations don't affect other variables.
If you don't see var, it is a value expression and is immutable, and that allows optimizations.

Functions return initializing expressions that are converted to one of those two cases.

Learn more

There's more to read! See:

We also strongly recommend Jonathan Müller's blog post entitled "Carbon’s most exciting feature is its calling convention" as a deep dive into what efficiencies can be had with Carbon's function parameters.

Other notes

If you want more current discussion, check out the weekly meeting notes from the Carbon Weekly Sync.

Wrap-up

Don't forget to subscribe! You can join [email protected]. If you have comments or would like to contribute to future editions of Carbon Copy, please reach out. And, as always, join us any way you can!

Carbonically yours,

Josh, Wolff, and the Carbon team
