-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove usage of [AggressiveOptimization] #58209
Conversation
Tagging subscribers to this area: @eiriktsarpalis, @layomia Issue DetailsRevert the 4 usages of [AggressiveOptimization]. The 2 in The other 2 in Removing [AO] also has the benefit of supporting a "DynamicPGO" explicit mode that offers "best perf but slow start". See #57327 (comment) I don't believe the changes to remove [AO] meet the bar to port to V6 at this time, since R2R performance should be the same in the vast majority because a "DynamicPGO" mode would need to be turned on explicitly.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to backport to 6.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some notes on your comments above:
- AO prevents methods from being compiled in R2R. It also causes methods to bypass tiering and always be jitted with optimization. We generally advocate using it only if running an unoptimized or R2R version of the method is unacceptable. Because AO methods bypass tiering, we cannot gather any PGO data for these methods (neither static or dynamic).
- By default, methods with loops will also bypass tiering and are always jitted with optimization. But this default behavior is overridden when gathering static PGO data. So it's possible once you remove AO from these loop methods and we get static PGO data for them, the non-AO versions will be faster than the AO versions. You should also see slightly faster startup behavior because of less jitting.
- Removing AO from the non-loop methods should likewise enable static PGO data collection and potentially better perf once the static PGO collection process catches up. And also will provide a small startup benefit.
- Generally speaking it is rare to be benchmarking R2R code; typically this will only happen if you use some custom benchmark harness or disable tiered compilation. If you benchmark the R2R code, it is typically 20% or so slower than the jitted counterpart.
/backport to release/6.0 |
Started backporting to release/6.0: https:/dotnet/runtime/actions/runs/1174017757 |
@AndyAyersMS, do you know why all of these are AggressiveOptimization? |
My 5 cents: From my understanding there two major excuses for
|
I don't, no. Probably worth reviewing. IIRC we mostly wanted to use AO to eliminate box allocation from the async machinery, since we otherwise could see 100MB's of boxing from Tier0 code before things got rejitted. |
Revert the 4 usages of [AggressiveOptimization].
The 2 in
ObjectDefaultConverter
that were reverted have loops which cause the jitter to re-jit them anyways, so the use of [AggressiveOptimization] really shouldn't do anything.The other 2 in
JsonConverter<T>
that remain do seem get up to a 3% benefit but only some of the time, and only measured against a local release build that is a non-R2R (non-crossgen2'd) STJ.dll image. In theory there shouldn't be a perf improvement since a R2R image automatically did the equivalent of [AggressiveOptimization] for all methods. When obtaining the fastest time from several runs, the benchmark (which measures serializer overhead) is faster with [AO], however the variability of these benchmarks overlap and are difficult to measure consistently. Perhaps with [AO] there is sometimes closer\faster memory locality with other methods.Removing [AO] also has the benefit of supporting a "DynamicPGO" explicit mode that offers "best perf but slow start".
See #57327 (comment)
I don't believe the changes to remove [AO] meet the bar to port to V6 at this time, since R2R performance should be the same in the vast majority because a "DynamicPGO" mode would need to be turned on explicitly.