Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework std::sys::windows::alloc #83065

Merged
merged 5 commits into from
Apr 2, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 54 additions & 16 deletions library/std/src/sys/windows/alloc.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
#![deny(unsafe_op_in_unsafe_fn)]

use crate::alloc::{GlobalAlloc, Layout, System};
use crate::ffi::c_void;
use crate::ptr;
use crate::sync::atomic::{AtomicPtr, Ordering};
use crate::sys::c;
use crate::sys_common::alloc::{realloc_fallback, MIN_ALIGN};

Expand All @@ -17,6 +19,9 @@ const HEAP_ZERO_MEMORY: c::DWORD = 0x00000008;
extern "system" {
// Get a handle to the default heap of the current process, or null if the operation fails.
//
// SAFETY: Successful calls to this function within the same process are assumed to
// always return the same handle, which remains valid for the entire lifetime of the process.
//
// See https://docs.microsoft.com/windows/win32/api/heapapi/nf-heapapi-getprocessheap
fn GetProcessHeap() -> c::HANDLE;

Expand Down Expand Up @@ -66,29 +71,59 @@ extern "system" {
// Returns a nonzero value if the operation is successful, and zero if the operation fails.
//
// SAFETY:
// - `hHeap` must be a non-null handle returned by `GetProcessHeap`.
// - `dwFlags` must be set to zero.
// - `lpMem` must be a pointer to an allocated block returned by `HeapAlloc` or `HeapReAlloc`,
// that has not already been freed.
// If the block was successfully freed, pointers pointing to the freed memory, such as `lpMem`,
// must not be dereferenced ever again.
//
// Note that both `hHeap` is allowed to be any value, and `lpMem` is allowed to be null,
// both of which will not cause the operation to fail.
// Note that `lpMem` is allowed to be null, which will not cause the operation to fail.
//
// See https://docs.microsoft.com/windows/win32/api/heapapi/nf-heapapi-heapfree
fn HeapFree(hHeap: c::HANDLE, dwFlags: c::DWORD, lpMem: c::LPVOID) -> c::BOOL;
}

// Cached handle to the default heap of the current process.
// Either a non-null handle returned by `GetProcessHeap`, or null when not yet initialized or `GetProcessHeap` failed.
static HEAP: AtomicPtr<c_void> = AtomicPtr::new(ptr::null_mut());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe std is already guaranteed to have AtomicPtr available.


// Get a handle to the default heap of the current process, or null if the operation fails.
// SAFETY: If this operation is successful, `HEAP` will be successfully initialized and contain
// a non-null handle returned by `GetProcessHeap`.
#[inline]
unsafe fn init_or_get_process_heap() -> c::HANDLE {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear why this is unsafe to call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the unsafety of GetProcessHeap and the role it plays in correctly initializing HEAP which the realloc and dealloc code relies on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That isn't what unsafe to call means; unsafe to call means the caller is responsible for establishing some precondition which is not enforced by the type system. What is the precondition the caller needs in order for a call to init_or_get_process_heap to be correct? That would need to be documented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added get_process_heap and updated some safety comments: get_process_heap is unsafe because it assumes HEAP has already been initialized, init_or_get_process_heap is not unsafe as there are no preconditions.

let heap = HEAP.load(Ordering::Relaxed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the ordering here be Acquire?

if heap.is_null() {
// `HEAP` has not yet been successfully initialized
let heap = unsafe { GetProcessHeap() };
if !heap.is_null() {
// SAFETY: No locking is needed because within the same process,
// successful calls to `GetProcessHeap` will always return the same value, even on different threads.
HEAP.store(heap, Ordering::Relaxed);

// SAFETY: `HEAP` contains a non-null handle returned by `GetProcessHeap`
heap
} else {
// Could not get the current process heap.
ptr::null_mut()
}
} else {
// SAFETY: `HEAP` contains a non-null handle returned by `GetProcessHeap`
heap
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this implementation to be sound, even when called simultaneously by different threads, but would like a double check.

Copy link
Contributor Author

@CDirkx CDirkx Mar 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just OnceCell is not enough to implement this because it isn't Sync. I could use Once, SyncOnceCell or SyncLazy, but those are all provided by std, and in general it is preferred to not use std types in sys. Is there any other way? @dtolnay what are your thoughts on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into the source of OnceCell and found that it is only a UnsafeCell<Option<T>>. Therefore I think it is OK to use AtomicPtr here. As for the possible multiple assignment, I suggest using fetch_update.

Copy link
Contributor

@Berrysoft Berrysoft Mar 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example,

#[inline]
unsafe fn init_or_get_process_heap() -> c::HANDLE {
    HEAP.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |heap| {
        if heap.is_null() {
            let heap = unsafe { GetProcessHeap() };
            if heap.is_null() {
                None
            } else {
                Some(heap)
            }
        } else {
            Some(heap)
        }
    }).unwrap_or(ptr::null_mut())
}

GetProcessHeap may be called multiple times, but assignment will be only once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'm concerned about is the performance. At least from the benchmarks I did, using a mutex or fetch_update is slower than just calling GetProcessHeap every time:

Benchmarks
#![feature(test)]
#![feature(atomic_fetch_update)]

extern crate test;

use std::sync::atomic::{AtomicPtr, Ordering};
use std::ffi::c_void;
use core::ptr;
use std::sync::{Mutex, Once};

pub type HANDLE = LPVOID;
pub type LPVOID = *mut std::ffi::c_void;

extern "system" {
    pub fn GetProcessHeap() -> HANDLE;
}

#[inline]
fn bench<F: Fn() -> HANDLE>(bencher: &mut test::bench::Bencher, f: F) {
    bencher.iter(|| {
        for _ in 0..1000 {
            let heap = f();
            assert!(!heap.is_null());
            test::black_box(heap);
        }
    })
}

#[bench]
fn syscall(bencher: &mut test::bench::Bencher) {
    bench(bencher, || {
        unsafe { GetProcessHeap() }
    })
}

#[bench]
fn once(bencher: &mut test::bench::Bencher) {
    static INIT: Once = Once::new();
    static HEAP: AtomicPtr<c_void> = AtomicPtr::new(ptr::null_mut());

    bench(bencher, || {
        INIT.call_once(|| { HEAP.store(unsafe { GetProcessHeap() }, Ordering::Relaxed) });
        HEAP.load(Ordering::Relaxed)
    })
}

#[bench]
fn load_store(bencher: &mut test::bench::Bencher) {
    static HEAP: AtomicPtr<c_void> = AtomicPtr::new(ptr::null_mut());

    bench(bencher, || {
        let heap = HEAP.load(Ordering::Relaxed);
        if heap.is_null() {
            let heap = unsafe { GetProcessHeap() };
            if !heap.is_null() {
                HEAP.store(heap, Ordering::Relaxed);
                heap
            } else {
                ptr::null_mut()
            }
        } else {
            heap
        }
    })
}

#[bench]
fn fetch_update(bencher: &mut test::bench::Bencher) {
    static HEAP: AtomicPtr<c_void> = AtomicPtr::new(ptr::null_mut());

    bench(bencher, || {
        let _old = HEAP.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |heap| {
            if heap.is_null() {
                let heap = unsafe { GetProcessHeap() };
                if heap.is_null() {
                    None
                } else {
                    Some(heap)
                }
            } else {
                Some(heap)
            }
        });

        HEAP.load(Ordering::Acquire)
    })
}

#[bench]
fn mutex(bencher: &mut test::bench::Bencher) {
    let mutex: Mutex<()> = Mutex::new(()); // would use a static sys_common::mutex::StaticMutex
    static mut HEAP: HANDLE = ptr::null_mut();

    bench(bencher, || {
        unsafe {
            let _lock = mutex.lock();
            if HEAP.is_null() {
                HEAP = GetProcessHeap();
            }

            HEAP
        }
    })
}

Results:

test fetch_update ... bench:      10,205 ns/iter (+/- 930)
test load_store   ... bench:         270 ns/iter (+/- 27)
test mutex        ... bench:      12,145 ns/iter (+/- 788)
test once         ... bench:         495 ns/iter (+/- 24)
test syscall      ... bench:       2,018 ns/iter (+/- 221)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that's fair.


// Header containing a pointer to the start of an allocated block.
// SAFETY: size and alignment must be <= `MIN_ALIGN`.
// SAFETY: Size and alignment must be <= `MIN_ALIGN`.
#[repr(C)]
struct Header(*mut u8);

// Allocates a block of optionally zeroed memory for a given `layout`.
// Returns a pointer satisfying the guarantees of `System` about allocated pointers.
// Allocate a block of optionally zeroed memory for a given `layout`.
// SAFETY: Returns a pointer satisfying the guarantees of `System` about allocated pointers.
#[inline]
unsafe fn allocate(layout: Layout, zeroed: bool) -> *mut u8 {
let heap = unsafe { GetProcessHeap() };
let heap = unsafe { init_or_get_process_heap() };
if heap.is_null() {
// Allocation has failed, could not get the current process heap.
return ptr::null_mut();
Expand Down Expand Up @@ -147,14 +182,14 @@ unsafe fn allocate(layout: Layout, zeroed: bool) -> *mut u8 {
unsafe impl GlobalAlloc for System {
#[inline]
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
// SAFETY: pointers returned by `allocate` satisfy the guarantees of `System`
// SAFETY: Pointers returned by `allocate` satisfy the guarantees of `System`
let zeroed = false;
unsafe { allocate(layout, zeroed) }
}

#[inline]
unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
// SAFETY: pointers returned by `allocate` satisfy the guarantees of `System`
// SAFETY: Pointers returned by `allocate` satisfy the guarantees of `System`
let zeroed = true;
unsafe { allocate(layout, zeroed) }
}
Expand All @@ -173,24 +208,27 @@ unsafe impl GlobalAlloc for System {
}
};

// SAFETY: because `ptr` has been successfully allocated with this allocator,
// `HEAP` must have been successfully initialized and contain a non-null handle
// returned by `GetProcessHeap`.
let heap = HEAP.load(Ordering::Relaxed);

// SAFETY: `block` is a pointer to the start of an allocated block.
unsafe {
let err = HeapFree(GetProcessHeap(), 0, block as c::LPVOID);
let err = HeapFree(heap, 0, block as c::LPVOID);
debug_assert!(err != 0, "Failed to free heap memory: {}", c::GetLastError());
Copy link
Contributor Author

@CDirkx CDirkx Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation of GlobalAlloc has the point:

  • It's undefined behavior if global allocators unwind. This restriction may be lifted in the future, but currently a panic from any of these functions may lead to memory unsafety.

So I removed the debug_assert

}
}

#[inline]
unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
if layout.align() <= MIN_ALIGN {
let heap = unsafe { GetProcessHeap() };
if heap.is_null() {
// Reallocation has failed, could not get the current process heap.
return ptr::null_mut();
}
// SAFETY: because `ptr` has been successfully allocated with this allocator,
// `HEAP` must have been successfully initialized and contain a non-null handle
// returned by `GetProcessHeap`.
let heap = HEAP.load(Ordering::Relaxed);

// SAFETY: `heap` is a non-null handle returned by `GetProcessHeap`,
// `ptr` is a pointer to the start of an allocated block.
// SAFETY: `ptr` is a pointer to the start of an allocated block.
// The returned pointer points to the start of an allocated block.
unsafe { HeapReAlloc(heap, 0, ptr as c::LPVOID, new_size) as *mut u8 }
} else {
Expand Down