2.pow(n) does not optimize well #47234

hanna-kruppe · 2018-01-06T18:31:37Z

A mibbit user noted in #rust-beginners that the integer pow method with a base of two generates far worse code than a shift: https://godbolt.org/g/mE1wjH (the shift is not equivalent for n >= 64, but you can see that it would still be better code if you accounted for that)

Perhaps we can special case the base 2 and use a shift in that case? One complication is that pow is subject to overflow checks: Whether it panics or returns 0 for excessive exponents depends on whether overflow checks are enabled, and I don't know of a way to branch depending on whether they are enabled (which is required for this optimization to not change behavior).

Alternatively, better LLVM optimizations could help, but this might be a bit too much to ask.
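To make the comparison concrete, here is a minimal sketch of how a shift can reproduce the checked and wrapping results of a base-two power. The helper names `pow2_checked` and `pow2_wrapping` are hypothetical and are not part of the standard library; the sketch only assumes the stable `checked_shl`, `checked_pow`, and `wrapping_pow` integer methods.

```rust
/// Hypothetical helper: 2^n as a shift, `None` when the result does not fit in a u64.
/// `checked_shl` returns `None` when the shift amount is >= 64, which is exactly
/// when 2^n overflows a u64.
pub fn pow2_checked(n: u32) -> Option<u64> {
    1u64.checked_shl(n)
}

/// Hypothetical helper: 2^n modulo 2^64, i.e. 0 once n >= 64.
pub fn pow2_wrapping(n: u32) -> u64 {
    1u64.checked_shl(n).unwrap_or(0)
}
```

For every `n`, `pow2_checked(n)` should agree with `2u64.checked_pow(n)` and `pow2_wrapping(n)` with `2u64.wrapping_pow(n)`. The plain `pow` case is harder because its overflow behavior depends on the build configuration, as described above.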


nagisa · 2018-01-06T20:19:23Z

LLVM doesn’t optimise even a

pub fn dopow(mut exp: u32) -> u64 {
    let mut acc = 1;
    while exp > 0 {
        acc = acc * 2;
        exp -= 1;
    }
    acc
}

so I don’t see it being able to optimise the more complicated algorithm used currently.

est31 · 2018-01-06T20:24:06Z

Sooo... that means that there should be a MIR optimisation that checks whether the function is libcore::num::<sth>::pow, then checks whether the base is a constant expression and a power of two, and, if it is, replaces the expression with a shift instead?

Or is this rather solved on the llvm side by having a pow intrinsic that works on integers?

hanna-kruppe · 2018-01-06T20:44:48Z

A MIR pass would be a pretty big hammer. We'd have to make this random library function specially recognized by the compiler. Also the implementation effort wouldn't be justified for a one-off optimization.

I think the closest thing to a practical solution would be changing the library code that implements pow. The problem with that is that it can't match the current algorithm wrt overflow checks. But that doesn't seem like it's unique to this function; surely other code runs into the same problem? Might be worth adding a way to query the status of overflow checks. Historically cfg!(debug_assertions) was that way (if you ignored -Z flags), but now we also have -C overflow-checks.
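A rough sketch of what such a library-side special case might look like, using `cfg!(debug_assertions)` as a stand-in for the missing overflow-check query. The function name `pow2` is hypothetical, and the proxy is deliberately imperfect: as noted above, `-C overflow-checks` can be set independently of debug assertions, so this does not reproduce the behavior of `pow` in every configuration.

```rust
/// Hypothetical base-two power that approximates the overflow behavior of `pow`.
/// ASSUMPTION: `debug_assertions` is used as a proxy for "overflow checks enabled",
/// which is wrong whenever `-C overflow-checks` is set differently.
pub fn pow2(n: u32) -> u64 {
    match 1u64.checked_shl(n) {
        // n < 64: the shift is exact.
        Some(v) => v,
        // n >= 64 with checks (approximately) enabled: panic like an overflowing multiply.
        None if cfg!(debug_assertions) => panic!("attempt to multiply with overflow"),
        // n >= 64 without checks: the wrapped result of 2^n is 0.
        None => 0,
    }
}
```

For exponents below 64, `pow2(n)` equals `2u64.pow(n)` regardless of build settings; only the out-of-range case depends on the proxy.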

e00E · 2021-07-27T19:41:55Z

As of 1.53.0 and today's nightly the bug still occurs: https://godbolt.org/z/Kjn4odTdh

NCGThompson · 2023-12-15T04:25:58Z

As of 1.74.0, Rust still isn't able to optimize 2u32.wrapping_pow(u). However, since 1.60.0, it is able to optimize 1u32.wrapping_pow(u).

If the trailing zeroes of the base are removed before exponentiation with a right shift and then added back with a left shift after exponentiation, then the 2u32.wrapping_pow(u) becomes a 1u32.wrapping_pow(u) and can therefore be optimized. In fact, it gives assembly equivalent to 1u32.checked_shl(u).unwrap_or(0). However, removing the zeroes only significantly helps in special cases like a constant power of two, so it isn't worth implementing.

Here is an example. This wasn't thoroughly checked for logic errors: Compiler Explorer link
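The factoring idea in this comment can be sketched as follows. This is an independent illustration, not the code behind the Compiler Explorer link; the function name `wrapping_pow_factored` is hypothetical, and it only relies on the stable `trailing_zeros`, `wrapping_pow`, `checked_mul`, and `checked_shl` methods.

```rust
/// Hypothetical sketch: compute `base.wrapping_pow(exp)` by splitting `base`
/// into an odd part and a power of two, so that base == odd << tz.
pub fn wrapping_pow_factored(base: u32, exp: u32) -> u32 {
    if base == 0 {
        // trailing_zeros(0) is 32, which is not a valid shift amount for u32,
        // so zero is handled separately.
        return 0u32.wrapping_pow(exp);
    }
    let tz = base.trailing_zeros(); // number of factors of two in `base`
    let odd = base >> tz; // for base == 2 this is 1
    let odd_pow = odd.wrapping_pow(exp);
    // base^exp == odd^exp << (tz * exp). If the total shift overflows u32 or
    // is >= 32, every bit is shifted out and the wrapped result is 0.
    match tz.checked_mul(exp) {
        Some(shift) => odd_pow.checked_shl(shift).unwrap_or(0),
        None => 0,
    }
}
```

With a constant base of 2, `odd` is 1 and `tz` is 1, so the whole function reduces to `1u32.checked_shl(exp).unwrap_or(0)`, which is the shape the comment says the optimizer already handles.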

nikic · 2023-12-15T08:31:59Z

#114390 is the fix for this issue, but it doesn't have recent activity.

NCGThompson · 2023-12-16T19:02:01Z

> #114390 is the fix for this issue, but it doesn't have recent activity.

@nikic Should I take it over or should I give the pull requester more time?

NCGThompson · 2024-01-18T01:57:22Z

> One complication is that pow is subject to overflow checks: Whether it panics or returns 0 for excessive exponents depends on whether overflow checks are enabled,

We can fix that now that we have #[track_caller]. We will be able to improve this further once we have #[cfg(overflow_checks)].
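As a small illustration of what `#[track_caller]` contributes here: a hand-written panic inside a shift-based fast path is then reported at the caller's location, much like a built-in overflow panic would be. This is one possible reading of the remark above, not the code that was merged, and the name `pow2_strict` is hypothetical.

```rust
/// Hypothetical shift-based power of two that always panics on overflow.
/// With `#[track_caller]`, the panic location points at the call site
/// rather than at the `panic!` inside this function.
#[track_caller]
pub fn pow2_strict(n: u32) -> u64 {
    match 1u64.checked_shl(n) {
        Some(v) => v,
        None => panic!("attempt to multiply with overflow"),
    }
}
```

Calling `pow2_strict(5)` returns 32, while `pow2_strict(64)` panics with the location of the calling line. Choosing between panicking and returning 0 would still need a way to query the overflow-check setting.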

…li-obk

Replacement of rust-lang#114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two

This adds a new intrinsic `is_val_statically_known` that lowers to [`@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic). It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See rust-lang#114390 for more discussion.

While I have extended the scope of the power of two optimization from rust-lang#114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.

Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).

Fixes rust-lang#47234
Resolves rust-lang#114390

`@Centri3`

Mark-Simulacrum added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Jan 10, 2018

XAMPPRocky added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 10, 2018

wdanilo mentioned this issue Dec 22, 2019

60 steps per second Physics Simulator (Kinematics, Spring, Air Dragging) enso-org/ide#91

Merged

3 tasks

kennytm mentioned this issue Feb 2, 2020

2^i should be optimized to 1 << i #68773

Closed

kennytm added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Feb 2, 2020

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

NCGThompson mentioned this issue Jan 12, 2024

Replacement of #114390: Add new intrinsic is_var_statically_known and optimize pow for powers of two #119911

Merged

bors closed this as completed in 039d887 Jan 25, 2024
