Unsafe precondition violated in the x86_64 SIMD implementation of str.contains #104726

Closed
pietroalbini opened this issue Nov 22, 2022 · 10 comments · Fixed by #104735
Labels
A-str Area: str and String C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-x86_64 Target: x86-64 processors (like x86_64-*) regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@pietroalbini
Member

PR #103779 added an x86_64 SIMD-based implementation of str.contains(&needle) to optimize searching in strings when the needle is at most 32 bytes long. Unfortunately, that implementation doesn't seem to be sound.

When compiling the standard library with debug assertions on, running UI tests excluding src/test/ui/process/no-stdio.rs results in compiletest itself panicking before executing any test due to a debug_assert in libcore:

thread 'main' panicked at 'unsafe precondition(s) violated: slice::get_unchecked requires that the range is within the slice', library/core/src/panicking.rs:89:58
stack backtrace:
   0: rust_begin_unwind
             at .../library/std/src/panicking.rs:575:5
   1: core::panicking::panic_str_nounwind
             at .../library/core/src/panicking.rs:92:14
   2: <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::get_unchecked::runtime
   3: core::str::pattern::simd_contains::{{closure}}
             at .../library/core/src/str/pattern.rs:1787:27
   4: core::str::pattern::simd_contains
             at .../library/core/src/str/pattern.rs:1846:19
   5: <&str as core::str::pattern::Pattern>::is_contained_in
             at .../library/core/src/str/pattern.rs:965:43
   6: core::str::<impl str>::contains
             at .../library/core/src/str/mod.rs:1057:9
   7: test::filter_tests::{{closure}}
             at .../library/test/src/lib.rs:464:22
   8: test::filter_tests::{{closure}}::{{closure}}
             at .../library/test/src/lib.rs:475:59
   9: <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::any
             at .../library/core/src/slice/iter/macros.rs:242:24
  10: test::filter_tests::{{closure}}
             at .../library/test/src/lib.rs:475:33
  11: alloc::vec::Vec<T,A>::retain::{{closure}}
             at .../library/alloc/src/vec/mod.rs:1561:32
  12: alloc::vec::Vec<T,A>::retain_mut::process_loop
             at .../library/alloc/src/vec/mod.rs:1641:21
  13: alloc::vec::Vec<T,A>::retain_mut
             at .../library/alloc/src/vec/mod.rs:1670:9
  14: alloc::vec::Vec<T,A>::retain
             at .../library/alloc/src/vec/mod.rs:1561:9
  15: test::filter_tests
             at .../library/test/src/lib.rs:475:9
  16: test::run_tests
             at .../library/test/src/lib.rs:297:17
  17: test::console::run_tests_console
             at /rustc/47395e0061d50df550cbd8b46dd46c132ea0c95a/library/test/src/console.rs:293:5
  18: compiletest::run_tests
             at /rustc/47395e0061d50df550cbd8b46dd46c132ea0c95a/src/tools/compiletest/src/main.rs:406:15
  19: compiletest::main
             at /rustc/47395e0061d50df550cbd8b46dd46c132ea0c95a/src/tools/compiletest/src/main.rs:57:5
  20: <fn() as core::ops::function::FnOnce<()>>::call_once
             at /rustc/47395e0061d50df550cbd8b46dd46c132ea0c95a/library/core/src/ops/function.rs:422:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread panicked while panicking. aborting.
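For readers unfamiliar with this kind of panic: the message comes from a debug assertion that guards an unchecked slice access and is only compiled in when debug assertions are enabled. The sketch below shows the general idea. It is not the libcore implementation; the helper name `get_range_unchecked` and its signature are made up for illustration.

```rust
use std::ops::Range;

/// Hypothetical wrapper (not libcore's code): checks the precondition of
/// `get_unchecked` with a debug assertion before performing the access.
///
/// # Safety
/// The caller must guarantee that `range` lies within `slice`.
unsafe fn get_range_unchecked(slice: &[u8], range: Range<usize>) -> &[u8] {
    // Compiled only when debug assertions are on; in release builds the
    // violation would go unnoticed and be undefined behavior.
    debug_assert!(
        range.start <= range.end && range.end <= slice.len(),
        "range must be within the slice"
    );
    // SAFETY: the caller upholds the precondition documented above.
    unsafe { slice.get_unchecked(range) }
}
```

With a valid range such as `1..3` on a five-element slice, the call returns the two-element subslice. With a range ending past `slice.len()`, a build with debug assertions panics instead of silently reading out of bounds.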

To reproduce the issue, use the following configuration file:

profile = "compiler"
changelog-seen = 2

[rust]
debug-assertions-std = true

...and run:

./x test --stage 1 src/test/ui --exclude src/test/ui/process/no-stdio.rs

cc @the8472 @thomcc

@pietroalbini pietroalbini added O-x86_64 Target: x86-64 processors (like x86_64-*) regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness C-bug Category: This is a bug. I-prioritize Issue: Indicates that prioritization has been requested for this issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. A-str Area: str and String labels Nov 22, 2022
pietroalbini added a commit to ferrocene/rust that referenced this issue Nov 22, 2022
…=thomcc"

The current implementation seems to be unsound. See rust-lang#104726.
@pietroalbini
Member Author

Tentatively opened a revert PR in #104727.

@the8472
Member

the8472 commented Nov 22, 2022

Ugh. I ran it under miri with quite a combinatorial test that I thought should cover all the edge-cases. 😞
It's not immediately obvious what's wrong, so it's fine to land the revert for now.

@thomcc
Member

thomcc commented Nov 22, 2022

Last I checked (a lot has happened since then, so I could be working from outdated info; @RalfJung might be able to confirm), Miri can't catch get_unchecked failures in many cases, I believe, since it's "just" library UB.

@the8472
Member

the8472 commented Nov 22, 2022

The memory will be accessed after the get_unchecked, so it should still be an invalid memory access when it's out of bounds. Unless I'm just doing some shortening slicing and then extending it again, so that the access is valid within the same allocation but the slicing itself is not correct.

@RalfJung
Member

RalfJung commented Nov 22, 2022

What @thomcc says is correct. The error here is in a situation like

let arr = [0; 32];
let slice = &arr[0..8];
// Library UB: index 16 is outside `slice`, although it is still inside `arr`.
unsafe { slice.get_unchecked(16) };

Miri (without Stacked Borrows) will not complain about that since the index is in-bounds of the allocation. But the debug assertion will complain since you are causing library UB by going out-of-bounds of the slice.
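The distinction between "in-bounds of the allocation" and "in-bounds of the slice" can be observed safely with the checked accessor `get`. This is a minimal sketch; the function name `bounds_views` is invented for the example, and it deliberately avoids `get_unchecked` so that it contains no undefined behavior.

```rust
/// Hypothetical helper: for a given index, reports whether it is in bounds of
/// the 32-byte backing array and whether it is in bounds of the 8-byte
/// subslice taken from the start of that array.
fn bounds_views(index: usize) -> (bool, bool) {
    let arr = [0u8; 32];
    let slice = &arr[0..8];
    // `get` performs the bounds check that `get_unchecked` skips.
    (arr.get(index).is_some(), slice.get(index).is_some())
}
```

`bounds_views(16)` returns `(true, false)`: the index is valid for the array, which is why a check that only looks at the allocation sees nothing wrong, but it is invalid for the slice, which is what the debug assertion checks.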

I don't quite understand your last comment, @the8472.

@the8472
Member

the8472 commented Nov 22, 2022

It seems like the test is just insufficient. Even with debug asserts and overflow checks enabled it doesn't catch it. I'll have to reproduce it first to figure out what's missing.

@the8472
Member

the8472 commented Nov 22, 2022

src/test/ui/process/no-stdio.rs

Ah, probably the probing for alternative tail bytes when the first and last byte in the needle are the same.
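For illustration, probing of this kind could look roughly like the sketch below. This is not the code from library/core/src/str/pattern.rs; the function name `second_probe_offset` and the four-byte window are assumptions made for the example. The idea is that a search using two probe bytes wants them to differ, so when the last byte of the needle equals the first, it looks a few bytes back for a different one.

```rust
/// Hypothetical helper (an assumption, not the library's implementation):
/// picks the offset of a second probe byte that differs from the first byte
/// of the needle, searching backwards through the last four bytes.
/// Returns `None` for an empty needle or when no differing byte is found.
fn second_probe_offset(needle: &[u8]) -> Option<usize> {
    let first = *needle.first()?;
    // Only the tail of the needle is considered; `saturating_sub` keeps the
    // window start at 0 for needles shorter than four bytes.
    let start = needle.len().saturating_sub(4);
    (start..needle.len()).rfind(|&i| needle[i] != first)
}
```

For `b"abca"` this returns `Some(2)` rather than the last index 3, and for `b"aaaa"` it returns `None`. Under this sketch, any later indexing that still assumed the probe sat at `needle.len() - 1` could compute offsets from the wrong position, which is one way an out-of-range slice access might arise.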

@matthiaskrgr
Member

Ooh, I was getting absolutely unexplainable compiler errors when building stage 1 libstd with a stage 0 rustc that was built with -Ctarget-cpu=native. Could this be the cause? 😅

@the8472
Member

the8472 commented Nov 22, 2022

Possible, but it would also do incorrect things on the x86-64 baseline.

@the8472
Member

the8472 commented Nov 22, 2022

#104735 has a proper fix

@bors bors closed this as completed in ff8c8df Nov 22, 2022
@apiraino apiraino removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Nov 23, 2022
6 participants