Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaybeUninit requires T: Sized but it should not #80158

Open
mahkoh opened this issue Dec 18, 2020 · 17 comments
Open

MaybeUninit requires T: Sized but it should not #80158

mahkoh opened this issue Dec 18, 2020 · 17 comments
Labels
A-raw-pointers Area: raw pointers, MaybeUninit, NonNull T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@mahkoh
Copy link
Contributor

mahkoh commented Dec 18, 2020

I tried this code:

pub fn as_maybe_uninit<T: ?Sized>(t: &T) -> &MaybeUninit<T> {
    unsafe {
        mem::transmute(t)
    }
}

I expected to see this happen: It compiles

Instead, this happened: It does not compile

Meta

rustc --version --verbose:

rustc 1.48.0 (7eac88abb 2020-11-16)
binary: rustc
commit-hash: 7eac88abb2e57e752f3302f02be5f3ce3d7adfb4
commit-date: 2020-11-16
host: x86_64-unknown-linux-gnu
release: 1.48.0
LLVM version: 11.0
@mahkoh mahkoh added the C-bug Category: This is a bug. label Dec 18, 2020
@camelid
Copy link
Member

camelid commented Dec 18, 2020

@mahkoh Can you explain why you think MaybeUninit should not require T: Sized?

@camelid camelid added A-raw-pointers Area: raw pointers, MaybeUninit, NonNull T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed C-bug Category: This is a bug. labels Dec 18, 2020
@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 18, 2020

Being MaybeUninit is not a property that has anything to do with being Sized but with the contents of the occupied memory.

The functions

for<T: Sized> &mut T -> &mut MaybeUninit<T>						(unsafe) (should not require Sized)
for<T: Sized> &mut MaybeUninit<T> -> &mut [MaybeUninit<u8>]		(safe)   (should not require Sized)

occur naturally in APIs that are generic over both uninitialized and initialized arguments.

For example, you want to be able to write

read_into(&mut [0u8]);
read_into(&mut [MaybeUninit::uninit()]);
read_into(&mut some_data_structure_that_can_be_read_into);

with the same read function. In this case T = [?] is unsized.

Because of the T: Sized restriction, the above primitives cannot be composed to create

for<T: ?Sized> &mut T -> &mut [MaybeUninit<u8>]  (unsafe)

This is not a T-libs. I'm almost certain that this restriction is caused by the implementation detail of MaybeUninit being implemented using unions which require all variants to be sized. Whether that restriction is necessary is another question.

@camelid camelid added T-lang Relevant to the language team, which will review and decide on the PR/issue. I-nominated labels Dec 18, 2020
@camelid
Copy link
Member

camelid commented Dec 18, 2020

Nominating for T-lang/T-libs discussion.

@HeroicKatora
Copy link
Contributor

The unsizing would be valuable for allocators. In some cases, one wants to use UnsafeCell<MaybeUninit<T>> as a generic pool of memory where the alignment and size requirements are controlled by the type parameter T, and thus by the caller. The ability to use a DST here would expand this pattern such that the length can be a runtime value. The currently best choice of using a [MaybeUninit<u8>] is not exactly satisfactor because it can not be parameterized such that it has a sized equivalent.

@RalfJung
Copy link
Member

Allowing unsized MaybeUninit is in conflict with ever having proper support for things like "thin trait objects" or user-defined custom DSTs. There, it would be impossible to determine the size of &MaybeUninit<ThinDst>.

So, while this could in principle be implemented right now (though it would need a special case since I don't see a way to support unsized union fields in general), this would also close the door for some other potential future language extensions.

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 20, 2020

Allowing unsized MaybeUninit is in conflict with ever having proper support for things like "thin trait objects" or user-defined custom DSTs. There, it would be impossible to determine the size of &MaybeUninit<ThinDst>.

How is MaybeUninit<T: ?Sized> different from any other C<T: ?Sized> in this regard?

@RalfJung
Copy link
Member

RalfJung commented Dec 20, 2020

The difference is that it may not be initialized. Determining the size of an &ThinDst requires actually dereferencing the thin pointer, finding the vtable ptr at the target, and getting the size from that. With &MaybeUninit<ThinDst>, the pointee (and, in particular, its vtable ptr) might not be initialized. So suddenly, size_of_val becomes unsafe on a reference.

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 20, 2020

I had read somewhere that size_of_val would not be allowed to access the data pointer even in custom DSTs. Maybe that was outdated information.

That aside, you also said that there would be a problem with unsized union fields. I did not see any such problems so I opened rust-lang/rfcs#3041 yesterday. Maybe leave an explanation there.

@HeroicKatora
Copy link
Contributor

HeroicKatora commented Dec 20, 2020

Would it be possible to make MaybeUninit<T: ?Sized> a valid type but have it be unsafe to produce any values of that type? Relaxing the parameter of the type does not necessarily mean relaxing the parameter of all its current methods, including construction. Then it would fall onto the caller to proof that &MaybeUninit<ThinDst> does not inspect the memory for its size and alignment information. But crucially the cases of MaybeUninit<[u8]> and MaybeUninit<str> might get special treatment that allows their construction/unsizing.

@RalfJung
Copy link
Member

I had read somewhere that size_of_val would not be allowed to access the data pointer even in custom DSTs. Maybe that was outdated information.

I mean, this is all hypothetical since custom DST aren't actually a thing.^^ But determining the size will be crucial at least to support things like Box<ThinDst>. So, Box<MaybeUninit<ThinsDst>> becomes an impossibility because Box needs to know the size (on drop) but a maybe-uninitialized thin DST does not have a well-defined size.

Would it be possible to make MaybeUninit<T: ?Sized> a valid type but have it be unsafe to produce any values of that type?

I don't see how that would help with the size issue. It would have to be unsafe to ask for the value's size.

@HeroicKatora
Copy link
Contributor

HeroicKatora commented Dec 20, 2020

I don't see how that would help with the size issue. It would have to be unsafe to ask for the value's size.

Crucially in the same step as that relaxation it could be defined that producing a value MaybeUninit<ThinDst>, i.e. a safety (edit: or even validity, but safety might just be enough) invariant of that type, to be safe only if it is safe to ask for its size/layout. That is, it is only safe for DSTs that do not actually inspect the memory representation. Then the value itself is proof that this invariant holds and that the size can be computed.

@RalfJung
Copy link
Member

Crucially in the same step as that relaxation it could be defined that producing a value MaybeUninit, i.e. a safety (edit: or even validity, but safety might just be enough) invariant of that type

Having any safety (or validity!) invariant associated with MaybeUninit<T> entirely subverts the point of that type, doesn't it? The entire point is to be able to handle potentially uninitialized data. Even if we could somehow consistently define all this, now the rules for MaybeUninit would be very murky since it has to be (patially) initialized in some cases.

@HeroicKatora
Copy link
Contributor

The notion that MaybeUninit's type is characterized by always being valid under all of its layouts seems useful, yes. The most immediately useful dynamically sized types—[T], str, dyn Trait—would also have no invariants. What about the possibility of relaxing the type parameter's bound to an unsafe trait capturing this condition (where such any impl would require compiler support for dyn traits)? For example:

The layout of the pointee *const T must not depend on the contents of the memory pointed to.

@RalfJung
Copy link
Member

So you are basically saying, MaybeUninit can be used with some unsized types but not all of them?

Something like that could work in principle. How practical and ergonomic it would be... that's hard to tell.

@cramertj
Copy link
Member

cramertj commented Jan 5, 2021

We discussed this in the lang team meeting today, and the general consensus was that we'd like to make sure we understand the path forward for using this with custom DSTs prior to expanding the abilities of MaybeUninit.

It'd be helpful to see an example using one of the previous custom DST designs combined with a hypothetically-expanded MaybeUninit API to see how they interact. The most recent such proposal I'm aware of is rust-lang/rfcs#2594 (Rendered link).

@PlasmaPower
Copy link
Contributor

I ran into this when experimenting with creating an aligned slice type:

#[repr(align(32))]
#[derive(Debug)]
struct AlignedSlice([u8]);

impl AlignedSlice {
    pub unsafe fn new_uninit(size: usize) -> Box<MaybeUninit<AlignedSlice>> {
        unsafe {
            let layout = Layout::from_size_align_unchecked(size, 32);
            let ptr = std::alloc::System.alloc(layout);
            Box::from_raw(std::slice::from_raw_parts_mut(ptr, size) as *mut [MaybeUninit<u8>] as *mut MaybeUninit<AlignedSlice>)
        }
    }

    // TODO use MaybeUninit::assume_init when stable for this
    pub unsafe fn assume_init(this: Box<MaybeUninit<AlignedSlice>>) -> Box<AlignedSlice> {
        unsafe {
            Box::from_raw(this.into_raw() as *mut AlignedSlice)
        }
    }

    pub fn new(data: &[u8]) -> Box<AlignedSlice> {
        let mut this = Self::new_uninit(data.len());
        std::ptr::copy_nonoverlapping(data.as_ptr(), this.as_ptr_mut(), data.len());
        Self::assume_init(this)
    }
}

MaybeUninit would have been a great abstraction for this, but requiring T: Sized makes it inviable

@Rua
Copy link
Contributor

Rua commented May 18, 2024

If it's undesirable to make MaybeUninit unsized because of its useful guarantees, what about a separate MaybeUnitUnsized type or similar, with methods to convert between the two where appropriate? Would that be more useful?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-raw-pointers Area: raw pointers, MaybeUninit, NonNull T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants