Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non linear runtime increasment for expand_crate pass while compiling diesel with 128-column-tables #81262

Open
weiznich opened this issue Jan 22, 2021 · 8 comments
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@weiznich
Copy link
Contributor

weiznich commented Jan 22, 2021

This is a followup issue to rust-lang/rustc-perf#807, where I've suggested to add diesel with the 128-column-tables enabled to the rustc-perf test-suite as this is taking quite a lot of time to compile. As part of that request we discovered that the expand_crate pass scales non-linear with the number of codelines expanded through a macro. (For diesel that would likely be this macro call which expands the macro above for all tuples sizes between 1 and the maximal number specified (16, 32, 64 or 128) via this macro

@rustbot modify labels: +I-slow

@weiznich weiznich added the C-bug Category: This is a bug. label Jan 22, 2021
@rustbot rustbot added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Jan 22, 2021
@tgnottingham
Copy link
Contributor

tgnottingham commented Jan 22, 2021

Are you able to reproduce this on the nightly compiler? I'm wondering if #78317 addressed this problem.

Edit: nevermind. It seems that only addressed the crate_inherent_impls_overlap_check pass according to this comment.

@estebank estebank added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 25, 2021
@est31
Copy link
Member

est31 commented Jan 28, 2021

It might be helpful to have a minimal example to reproduce the issue.

@weiznich
Copy link
Contributor Author

@est31 That would be something like https:/weiznich/minimal_example_for_rust_81262

@est31
Copy link
Member

est31 commented Jan 28, 2021

@weiznich thanks, that looks great already!

@est31
Copy link
Member

est31 commented Apr 28, 2022

There have been a bunch of performance improvements for macro expansion, so I've decided to check whether the issue still exists, first with a nightly adjacent to the filing of this bug, and then a recent one.

I used commands like time rustc +nightly-2022-01-28 --cfg 'feature="32-column-tables"' --cfg 'feature="64-column-tables"' --cfg 'feature="128-column-tables"' src/main.rs and took the "real" value.

Run nightly-2022-01-28 nightly-2022-04-28
default 0.27s 0.26s
32 0.59s 0.49s
32,64 4.44s 3.05s
32,64,128 59.17s 36.26s

Apparently, speed did improve! But the strong nonlinearity still remains. Note that I'm testing end-to-end compilation speeds here, I don't know how much of a component expanding is.

fundamentally macro matching is quadratic with the number of rules, but an increase of macro rules seems not to be happening here. The macro is only used to invoke something different a lot of times, so all that increases is the number of times that a macro gets executed.

cc @nnethercote who is way more familiar with macro expansion code than me.

@est31
Copy link
Member

est31 commented Apr 28, 2022

FTR it's been replaced by a proc macro but apparently speed wise it's not any better.

@nnethercote
Copy link
Contributor

fundamentally macro matching is quadratic with the number of rules, but an increase of macro rules seems not to be happening here. The macro is only used to invoke something different a lot of times, so all that increases is the number of times that a macro gets executed.

Ignoring the internals of macro matching for a moment: __diesel_for_each_tuple is quadratic in the number of columns. Look at tuples.rs. The following list shows how many lines of the file account for the different column configurations:

  • 16 columns: 174 lines
  • 16,32 columns: 602 lines
  • 16,32,64 columns: 2222 lines
  • 16,32,64,128 columns: 8529 lines

The growth in lines is slower than the growth in compile times, so there may be non-linear stuff happening internally. This might be in expand_crate, or might be somewhere else, such as trait handling. It would be useful to know which pass is looking more non-linear than the line growth.

@weiznich
Copy link
Contributor Author

My guess is that some non-linearity is caused by this macro. That one is internally called by __diesel_for_each_tuple! and has some non-linear behaviour by itself. Simplifying this macro would be possible by restructuring the where clauses there, but that leads to a non-linearity in the trait solver (see #75542 for details).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants