Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.pow(n) does not optimize well #47234

Closed
hanna-kruppe opened this issue Jan 6, 2018 · 8 comments · Fixed by #119911
Closed

2.pow(n) does not optimize well #47234

hanna-kruppe opened this issue Jan 6, 2018 · 8 comments · Fixed by #119911
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@hanna-kruppe
Copy link
Contributor

A mibbit user noted in #rust-beginners that the integer pow method with a base of two generates far worse code than a shift: https://godbolt.org/g/mE1wjH (the shift is not equivalent for n >= 64, but you can see that it would still be better code if you accounted for that)

Perhaps we can special case the base 2 and use a shift in that case? One complication is that pow is subject to overflow checks: Whether it panics or returns 0 for excessive exponents depends on whether overflow checks are enabled, and I don't know of a way to branch depending on whether they are enabled (which is required for this optimization to not change behavior).

Alternatively, better LLVM optimizations could help, but this might be a bit too much to ask.

@nagisa
Copy link
Member

nagisa commented Jan 6, 2018

LLVM doesn’t optimise even a

pub fn dopow(mut exp: u32) -> u64 {
            let mut acc = 1;
            while exp > 0 {
                acc = acc * 2;
                exp -= 1;
            }
            acc
}

so I don’t see it being able to optimise the more complicated algorithm used currently.

@est31
Copy link
Member

est31 commented Jan 6, 2018

Sooo... that means that there should be a MIR optimisation that checks whether the function is libcore::num::<sth>::pow and then checks whether the base is a constant expression and a power of two and if is, it should replace the expression by a shift instead?

Or is this rather solved on the llvm side by having a pow intrinsic that works on integers?

@hanna-kruppe
Copy link
Contributor Author

A MIR pass would be a pretty big hammer. We'd have to make this random library function specially recognized by the compiler. Also the implementation effort wouldn't be justified for a one-off optimization.

I think the closest thing to a practical solution would be changing the library code that implements pow. The problem with that is that it can't match the current algorithm wrt overflow checks. But that doesn't seem like it's unique to this function, surely other code runs into the same problem? Might be worth adding a way to query the status of overflow checks. Historically cfg!(debug_assertion) was that way (if you ignored -Z flags), but now we also have -C overflow-checks.

@Mark-Simulacrum Mark-Simulacrum added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Jan 10, 2018
@XAMPPRocky XAMPPRocky added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 10, 2018
@kennytm kennytm added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Feb 2, 2020
@e00E
Copy link
Contributor

e00E commented Jul 27, 2021

As of 1.53.0 and today's nightly the bug still occurs: https://godbolt.org/z/Kjn4odTdh

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@NCGThompson
Copy link
Contributor

As of 1.74.0, Rust still isn't able to optimize 2u32.wrapping_pow(u). However, since 1.60.0, it is able to optimize 1u32.wrapping_pow(u).

If the leading zeroes of the base are removed before exponentiation with a right shift and then added back with a left shift after exponentiation, then the 2u32.wrapping_pow(u) becomes a 1u32.wrapping_pow(u) and therefor can be optimized. In fact, it gives equivalent assembly to 1u32.checked_shl(u).unwrap_or_(0) However, removing the zeroes only significantly helps in with special cases like a constant power of two, so it isn't worth implementing.

Here is an example. This wasn't thoroughly checked for logic errors: Compiler Explorer link

@nikic
Copy link
Contributor

nikic commented Dec 15, 2023

#114390 is the fix for this issue, but it doesn't have recent activity.

@NCGThompson
Copy link
Contributor

#114390 is the fix for this issue, but it doesn't have recent activity.

@nikic Should I take it over or should I give the pull requester more time?

@NCGThompson
Copy link
Contributor

One complication is that pow is subject to overflow checks: Whether it panics or returns 0 for excessive exponents depends on whether overflow checks are enabled,

We can fix that now that we have #[track_caller]. We will be further able to improve this further once we have #[cfg(overflow_checks)].

bors added a commit to rust-lang-ci/rust that referenced this issue Jan 20, 2024
…li-obk

Replacement of rust-lang#114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two

This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See rust-lang#114390 for more discussion.

While I have extended the scope of the power of two optimization from rust-lang#114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.

Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).

Fixes rust-lang#47234
Resolves rust-lang#114390
`@Centri3`
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 23, 2024
…li-obk

Replacement of rust-lang#114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two

This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See rust-lang#114390 for more discussion.

While I have extended the scope of the power of two optimization from rust-lang#114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.

Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).

Fixes rust-lang#47234
Resolves rust-lang#114390
`@Centri3`
@bors bors closed this as completed in 039d887 Jan 25, 2024
bjorn3 pushed a commit to bjorn3/rust that referenced this issue Jan 26, 2024
…li-obk

Replacement of rust-lang#114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two

This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See rust-lang#114390 for more discussion.

While I have extended the scope of the power of two optimization from rust-lang#114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.

Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).

Fixes rust-lang#47234
Resolves rust-lang#114390
`@Centri3`
GuillaumeGomez pushed a commit to GuillaumeGomez/rust that referenced this issue Feb 21, 2024
…li-obk

Replacement of rust-lang#114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two

This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See rust-lang#114390 for more discussion.

While I have extended the scope of the power of two optimization from rust-lang#114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.

Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).

Fixes rust-lang#47234
Resolves rust-lang#114390
`@Centri3`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
10 participants