Here is the new Carbon Copy, your periodic update on the Carbon language!
Carbon Copy is designed for those people who want a high-level view of what's happening on the project. If you'd like to subscribe, you can join [email protected]. Carbon Copy should arrive roughly every other month.
Spotlight: Unformed state
In today's spotlight, let's talk about a common problem: operating on uninitialized variables.
Where might you encounter uninitialized values?
A variable that is declared before it is initialized, such as:
  a variable declared before calling a function that initializes it via a pointer
  an object that is initialized and used on some code paths but not others
A variable that has been moved from
In Carbon, safety and performance are both priorities. For safety, it should be difficult or impossible to operate on an uninitialized variable. When performance is critical, however, you don't want to pay any penalty for this safety.
One solution some languages use is to initialize variables automatically or to require initializers. For Carbon, this approach raises concerns, such as:
Extra cost: Ideally, code should not access uninitialized variables, so why pay for initializing them?
Still could be a bug: Having consistent initialization (to 0 or, say, Solaris's 0xDEADBEEF) may prevent undefined behavior, but using that value may still be unintended.
Tools can't help: Tools like the compiler and sanitizers can't tell whether you meant to use the automatically initialized value or are using it accidentally.
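To make the "still could be a bug" concern concrete, here is a minimal sketch in C++ (used as an analogy, not Carbon). The function `PortForScheme` and its cases are hypothetical. The variable starts at 0, as it would in an auto-initializing language, so nothing is undefined, yet a forgotten case silently produces 0.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the default port for a known scheme.
int PortForScheme(const std::string& scheme) {
  int port = 0;  // Always initialized, so reading it is never undefined.
  if (scheme == "http") {
    port = 80;
  } else if (scheme == "https") {
    port = 443;
  }
  // The "ftp" case was forgotten. The function quietly returns 0, and
  // neither the compiler nor a sanitizer can flag it: reading a variable
  // that was deliberately set to 0 looks exactly the same.
  return port;
}
```

Calling `PortForScheme("ftp")` returns 0 without any diagnostic, which is the situation the list above describes.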
Another approach is definitive initialization, which means statically proving that the variable is initialized on every control flow path. This has disadvantages, including being sensitive to the compiler's control-flow analysis: if the compiler implementation changes, it could unexpectedly reject existing code. Also, certain refactorings, like moving code into functions, would either be forbidden or require much more aggressive and expensive control-flow analysis.
Recently, C++ adopted P2795R5, which makes reading an uninitialized value erroneous behavior, a new C++ concept distinct from undefined behavior. This means implementations can diagnose such reads. If a read is not diagnosed, the program gets the defined behavior of reading an implementation-defined value.
Carbon's solution is unformed state. Unformed state is the state of an object before it is initialized. While an object is unformed, only a limited set of operations is allowed. Disallowed operations on an unformed object can usually be diagnosed by the compiler, and will be diagnosed at run time in a debug build.
The allowed operations are:
assignment: with an unformed object on the left-hand side
destruction: zero, one, or many times
address-of: including passing the address of an unformed object around
Disallowed operations include:
comparing
passing to a function
returning from a function
initializing or assigning from an unformed object (that is, with the unformed object on the right-hand side)
Some examples:
var unformed_x: i32;
// ❌ Each line mentioning `unformed_x` below is an error:
var a: i32 = unformed_x;
var b: i32 = 42;
b = unformed_x;
var unformed_c: i32;
unformed_c = unformed_x;
unformed_x = unformed_x;
Each type can specify how its unformed state is represented. Types that don't define an unformed state must be initialized when declared. If that isn't possible or desirable, they can be explicitly wrapped by the user in an Optional, which tracks at run time whether the wrapped object has been initialized [1]. Types can also use this approach internally when needed.
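The closest C++ analogue to this wrapping approach is `std::optional`, so here is a minimal sketch in C++ rather than Carbon. It does not show Carbon's Optional API. The `Connection` type and `MaybeConnect` function are hypothetical, chosen only to show a type with no natural "empty" representation being tracked at run time.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical type with no sensible "not yet initialized" representation:
// it can only be constructed from a host name.
struct Connection {
  explicit Connection(std::string host_name) : host(std::move(host_name)) {}
  std::string host;
};

// The optional starts empty and records at run time whether a Connection
// has been constructed inside it.
std::optional<Connection> MaybeConnect(bool reachable) {
  std::optional<Connection> conn;  // Empty: no Connection exists yet.
  if (reachable) {
    conn.emplace("example.test");  // Constructs the Connection in place.
  }
  return conn;
}
```

The cost is the extra flag and the check on access, which is why the text above presents this as an explicit opt-in rather than the default.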
Addresses and runtime errors
Above, we noted that an unformed object can have its address taken and that address can be passed to functions.
Once the address has been passed elsewhere, the compiler can no longer be sure whether the object has been initialized. Further actions on that object will not cause compile-time errors, but can be erroneous at run time.
fn Example() {
  var resource: Resource;
  // ❌ This will be a compile error because resource is unformed
  doStuffWithResource(resource);

  if (acquireResource(&resource)) {
    doStuffWithResource(resource);
  }

  // ❌ This may be a runtime error because resource might be unformed
  doStuffWithResource(resource);

  // Unconditionally calls destructor
  return;
}
Destructors are idempotent: you can call them any number of times on unformed or previously destroyed objects without error.
In the above example, there is a dangerous call to doStuffWithResource that the compiler cannot detect. It's useful to be able to optionally detect these situations, or else avoid them entirely. Carbon provides different build modes for different trade-offs.
Debug: used during development, and emphasizes detection and debuggability. At run time, it will produce an error for any invalid use of unformed state.
Performance: delivers the highest performance; invalid uses of unformed state that are not detected at build time are undefined behavior.
Hardened: for released binaries willing to sacrifice a small amount of performance to gain strong safety against attacks; invalid uses of unformed state are detected or mitigated. Mitigation for some types could include providing a consistent known value at initialization time.
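To give a feel for what a debug-mode run-time check might look like, here is a rough model in C++ (an analogy, not how Carbon implements it). Everything in it is hypothetical: the `DebugChecked` wrapper, the `Acquire` stand-in for `acquireResource`, and the choice of an exception as the error report. The wrapper allows assignment while unformed and rejects reads until a value has been assigned.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical model of a debug-build check: remembers whether a value has
// been assigned and rejects reads while the object is still "unformed".
template <typename T>
class DebugChecked {
 public:
  DebugChecked() = default;  // Starts unformed.

  // Assignment is allowed on an unformed object and makes it formed.
  DebugChecked& operator=(T value) {
    value_ = std::move(value);
    formed_ = true;
    return *this;
  }

  // Reading is only valid once formed; otherwise report an error.
  const T& Get() const {
    if (!formed_) {
      throw std::logic_error("use of unformed object");
    }
    return value_;
  }

 private:
  T value_{};  // Simplification: this model value-initializes the storage.
  bool formed_ = false;
};

// Hypothetical stand-in for acquireResource: forms *out only on success.
bool Acquire(DebugChecked<int>* out, bool succeed) {
  if (succeed) {
    *out = 42;
  }
  return succeed;
}

// Ignores the result of Acquire, then reads the resource anyway.
// Returns true if that read was caught as a use of unformed state.
bool ReadAfterAcquireIsCaught(bool succeed) {
  DebugChecked<int> resource;
  Acquire(&resource, succeed);
  try {
    (void)resource.Get();
    return false;
  } catch (const std::logic_error&) {
    return true;
  }
}
```

In this model the unchecked read is caught only when acquisition failed, mirroring the "may be a runtime error" call in the example above. A performance build would correspond to dropping the flag and the check entirely.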
In Carbon, a moved-from variable becomes unformed. Moved-from variables are then treated the same as uninitialized variables. As a result, Carbon's move semantics occupy a middle ground between C++-style move semantics and Rust's "destructive" move semantics.
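For comparison, here is a small C++ sketch of the C++ end of that spectrum. The function `ReuseAfterMove` is hypothetical. A moved-from `std::unique_ptr` is guaranteed to be null and can still be read. Under the rules described above, the corresponding Carbon variable could only be assigned to, destroyed, or have its address taken until it is given a new value.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Moves out of `source`, then assigns it a fresh value before reading it.
int ReuseAfterMove() {
  auto source = std::make_unique<int>(1);
  std::unique_ptr<int> sink = std::move(source);  // `source` is moved-from.

  // C++ leaves `source` null but fully usable here. Treating it as
  // unformed would instead restrict it to assignment, destruction,
  // and address-of until it is assigned again.
  source = std::make_unique<int>(2);  // Assignment is fine in both models.

  return *sink + *source;  // Both hold values again at this point.
}
```

The sketch only uses the moved-from variable in ways that both models permit, which is the style the unformed-state rules encourage.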
And there's more
As we mentioned above, two of Carbon's core principles are safety and performance, and the design for unformed state addresses both. With these innovations, Carbon should catch many more errors than today's C++ without losing the expressivity and convenience that one expects from modern languages. While uninitialized memory and objects are just the tip of the iceberg of safety that Carbon will eventually need to address, this issue remains an important one, and one where we think we can improve immediately without risking C++ interop.
There's a lot more to know about unformed state if you're curious. Feel free to comment on this post if you have questions!
If you want more current discussion, check out the weekly meeting notes from the Carbon Weekly Sync.
Wrap-up
Don't forget to subscribe! You can join [email protected]. If you have comments or would like to contribute to future editions of Carbon Copy, please reach out. And, always, join us any way you can!
Carbonatedly yours,
Wolff, Josh, and the Carbon team
Carbon Copy, April 2024
References:
Upcoming 2024 talks
Top proposals and issues from Feb/March 2024
Other notes