Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double.CompareTo() performance hurt by not being inlined #56493

Closed
rickbrew opened this issue Jul 28, 2021 · 18 comments
Closed

Double.CompareTo() performance hurt by not being inlined #56493

rickbrew opened this issue Jul 28, 2021 · 18 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@rickbrew
Copy link
Contributor

rickbrew commented Jul 28, 2021

I was profiling some Paint.NET code in WPA the other day and noticed that Double.CompareTo() was taking a non-trivial amount of the execution time:

image

The scenario here in the app is to select an area of the canvas (Rectangle Select, Ellipse Select, Magic Wand, whatever), switch to the Move Selected Pixels too, and do some transforms on it (rotate, scale, etc.).

The selection is represented by a poly-polygon (IReadOnlyList<Point2Double[]>) which is then scan-converted into an alpha mask (bitmap). Scan-conversion is the standard algorithm for doing this and involves lots of sorting, at one point by Y values and then repeatedly by X values. Double.CompareTo() is used quite a bit here, in other words. In the screenshot, you can see that the most time is spent in 1) filling an alpha mask bitmap from the list of scans/rectangles from the scan-conversion code (MaskFromScansRenderer) (this code is vectorized quite a bit now, lots of buffer filling), 2) rendering the transformed pixels onto the current layer's bitmap (TransformedNearestNeighborContentRenderer), 3) some other rendering/compositing and hit-testing (MoveToolContentRenderer), then 4) System.Double::CompareTo() (!!!), and then most of the rest is the actual scan-conversion, sorting, and some more buffer copying.

To investigate further, I crafted up a toy benchmark in LINQPad where I timed how long it took to sort an array of 10 million randomly generated doubles (10 iterations per benchmark). All the usual perf tricks were employed, including pointers to elide bounds checks, and an optimized copy of Sort<T, TList, TComparer>() where everything is structs, interfaces, and tagged with AggressiveInlining. For the first benchmark, the stock Double.CompareTo() is used, and for the second benchmark a copy of it with [MethodImpl(MethodImplOptions.AggressiveInlining)] is employed:

My results:

image

(you can ignore the CompareNoNaN result, it was just for fun to see what happens if I remove the branches that test for NaN)

More than twice as fast -- indicating that the overhead of not inlining Double.CompareTo() is substantial.

My hypothesis here is that adding [MethodImpl(MethodImplOptions.AggressiveInlining)] to Double.CompareTo() (and probably Single.CompareTo()) would be an easy and impactful win. I suspect that these methods aren't used extensively throughout the runtime or other codebases, so the codegen size impact should be small, and where they are used the performance gain would be worth it.

I'm happy to craft up a proper benchmark with real data that others can inspect. I'm submitting this issue so I stop procrastinating on it :)

cc @EgorBo, @tannergooding , who were part of the conversation about this on Discord #lowlevel

@rickbrew rickbrew added the tenet-performance Performance related issue label Jul 28, 2021
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jul 28, 2021
@AraHaan
Copy link
Member

AraHaan commented Jul 28, 2021

@rickbrew what is the perf of adding inlining to Single.CompareTo?

@rickbrew
Copy link
Contributor Author

@rickbrew what is the perf of adding inlining to Single.CompareTo?

For the sorting benchmark the results were near identical. Same amount of time, same amount of % improvement.

@EgorBo
Copy link
Member

EgorBo commented Jul 28, 2021

A few notes:

  1. StandardOptimizationData.mibc profile that we ship seems to be missing a lot of FP-related methods, e.g. I couldn't find info for any of the Double API, we need some training scenario for those (e.g. WPF?)
  2. Double.CompareTo is inlineable in DynamicPGO mode in hot call-sites:
    image

I personally don't mind just adding AggressiveInlining for it, Double is often involved in intensive calculations. However, again, we can do it via StandardOptimizationData.mibc

@danmoseley
Copy link
Member

Maybe Paint.NET should be a training scenario somehow?

@rickbrew
Copy link
Contributor Author

We can certainly arrange that, @richlander and I already have something in place to enable that sort of thing

@danmoseley
Copy link
Member

In the short term, since that can't happen immediately, would it make sense to mark aggressive inlining (temporarily?) on Double and Short?

@rickbrew
Copy link
Contributor Author

In the short term, since that can't happen immediately, would it make sense to mark aggressive inlining (temporarily?) on Double and Short?

Sure, here's a PR for review etc. #56501

@AraHaan
Copy link
Member

AraHaan commented Jul 28, 2021

I feel like the benchmarks you made should be added to it as well (if possible) so that way they can be used in the future for future comparison.

@danmoseley
Copy link
Member

If you mean microbenchmarks (like the ones we have in dotnet/performance) it's hard to use them to measure any "benefit" from aggressive inlining. It almost inevitably makes microbenchmarks faster, although in real apps it is not necessarily a benefit.

It would be worth checking that we have good enough coverage of Compare() there, but only to protect the implementation itself.
https:/dotnet/performance/tree/main/src/benchmarks/micro

@richlander
Copy link
Member

We rebuilt the whole PGO system in .NET 6. I have been thinking about how we can take advantage of real apps going forward. Clearly, Paint.NET being on .NET 5+ is a big win for that.

In the future, I'd like to establish a model where we can change PGO data as part of a servicing update. We'd need a compelling and coherent plan for that (which I don't have), but that's the direction I have in mind.

@davidwrighton

@jeffschwMSFT jeffschwMSFT added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 29, 2021
@davidwrighton
Copy link
Member

I believe we actually already have the means for updating PGO data in a servicing update, but we don't actually have any plans to do so.

@richlander
Copy link
Member

Cool. I assumed we had no plans for that. It's something that we should consider and discuss, particularly for a release that is supported for three years.

@JulieLeeMSFT
Copy link
Member

@rickbrew I am trying to understand when the milestone should be for this issue. How critical is it to include this in .NET6? Are you planning to complete this in .NET6?

@danmoseley
Copy link
Member

@JulieLeeMSFT Seems to me we just need to merge #56501 for .NET 6.0. Longer term we should increase our PGO test app matrix.

@rickbrew
Copy link
Contributor Author

rickbrew commented Aug 6, 2021

@rickbrew I am trying to understand when the milestone should be for this issue. How critical is it to include this in .NET6? Are you planning to complete this in .NET6?

This isn't critical for me -- I made my own InlinedCompareTo method that I'm using for now inside my app.

@ghost ghost added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs author feedback labels Aug 6, 2021
@JulieLeeMSFT JulieLeeMSFT removed untriaged New issue has not been triaged by the area owner needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Aug 6, 2021
@JulieLeeMSFT JulieLeeMSFT added this to the 6.0.0 milestone Aug 6, 2021
@JulieLeeMSFT JulieLeeMSFT modified the milestones: 6.0.0, 7.0.0 Aug 6, 2021
@JulieLeeMSFT
Copy link
Member

This isn't critical for me -- I made my own InlinedCompareTo method that I'm using for now inside my app.

Thanks @rickbrew. I set the milestone to .NET 7.
@tannergooding I assigned this issue to you to review the PR and merge it.

tannergooding pushed a commit that referenced this issue Aug 20, 2021
…56501)

* Add AggressiveInlining to Double.CompareTo() and Single.CompareTo()

As per discussion at #56493

* Use split inlining

* Revert split inlining
@EgorBo
Copy link
Member

EgorBo commented Nov 3, 2021

Closed by #56501

@EgorBo EgorBo closed this as completed Nov 3, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

9 participants