Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up the Tap#formula_files and Tap#cask_files methods #18165

Merged

Conversation

apainintheneck
Copy link
Contributor

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew style with your changes locally?
  • Have you successfully run brew typecheck with your changes locally?
  • Have you successfully run brew tests with your changes locally?

These are slow because of some Pathname related reasons so it's been changed up to use Dir.glob instead and that makes a large difference. This is an alternative approach to #18163 that is a little bit faster.

$ hyperfine --parameter-list branch master,speed-up-loading-cask-and-formula-file-names,speed-up-loading-cask-and-formula-file-names_v2 --warmup 5 --setup 'git switch {branch}' 'brew cat gimp'
Benchmark 1: brew cat gimp (branch = master)
  Time (mean ± σ):      2.150 s ±  0.014 s    [User: 1.369 s, System: 0.745 s]
  Range (min … max):    2.126 s …  2.174 s    10 runs

Benchmark 2: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
  Time (mean ± σ):      1.814 s ±  0.012 s    [User: 1.068 s, System: 0.711 s]
  Range (min … max):    1.797 s …  1.840 s    10 runs

Benchmark 3: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names_v2)
  Time (mean ± σ):      1.489 s ±  0.009 s    [User: 0.905 s, System: 0.550 s]
  Range (min … max):    1.478 s …  1.503 s    10 runs

Summary
  brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names_v2) ran
    1.22 ± 0.01 times faster than brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
    1.44 ± 0.01 times faster than brew cat gimp (branch = master)

These are slow because of some Pathname related reasons so it's been
changed up to use `Dir.glob` instead and that makes a large difference.

```
$ hyperfine --parameter-list branch master,speed-up-loading-cask-and-formula-file-names,speed-up-loading-cask-and-formula-file-names_v2 --warmup 5 --setup 'git switch {branch}' 'brew cat gimp'
Benchmark 1: brew cat gimp (branch = master)
  Time (mean ± σ):      2.150 s ±  0.014 s    [User: 1.369 s, System: 0.745 s]
  Range (min … max):    2.126 s …  2.174 s    10 runs

Benchmark 2: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
  Time (mean ± σ):      1.814 s ±  0.012 s    [User: 1.068 s, System: 0.711 s]
  Range (min … max):    1.797 s …  1.840 s    10 runs

Benchmark 3: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names_v2)
  Time (mean ± σ):      1.489 s ±  0.009 s    [User: 0.905 s, System: 0.550 s]
  Range (min … max):    1.478 s …  1.503 s    10 runs

Summary
  brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names_v2) ran
    1.22 ± 0.01 times faster than brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
    1.44 ± 0.01 times faster than brew cat gimp (branch = master)
```
Copy link
Member

@Bo98 Bo98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, didn't expect glob to be that much faster. Maybe we need to come up with some alternative to Pathname at this point.

@apainintheneck
Copy link
Contributor Author

apainintheneck commented Aug 26, 2024

Wow, didn't expect glob to be that much faster. Maybe we need to come up with some alternative to Pathname at this point.

As you already know, doing a bunch of Pathname operations in a loop can be slow. We don't need to do that as often anymore because of the API which is good so I'm not sure it'd really worth it to create our own wrapper class. This is just a really bad example. It does make me wonder how some of the methods on the Pathname class are implemented though since it they seem to be so much slower than the equivalent File methods at times.

Edit: It looks like most of Pathname is implemented in C which makes me wonder even more. Could it just be that we have a lot of Pathname extensions that are slowing things down a lot?

@apainintheneck apainintheneck merged commit 0c3104a into master Aug 26, 2024
24 checks passed
@apainintheneck apainintheneck deleted the speed-up-loading-cask-and-formula-file-names_v2 branch August 26, 2024 00:53
@Bo98
Copy link
Member

Bo98 commented Aug 26, 2024

Edit: It looks like most of Pathname is implemented in C which makes me wonder even more. Could it just be that we have a lot of Pathname extensions that are slowing things down a lot?

It's not entirely C - most/all of the slow bits are in Ruby, in part from the many regexes used. Our extensions have an impact on extname but unlikely to affect much else. You'll see similar performance in vanilla Ruby.

@apainintheneck
Copy link
Contributor Author

You're probably right. I did see our old friend Pathname#relative_path_from, which is notoriously slow, is implemented in Ruby. Interestingly enough Pathname#glob is implemented in C but I think it might call some methods implemented in Ruby.

@Bo98
Copy link
Member

Bo98 commented Aug 26, 2024

Yeah Pathname#glob (instance method) calls Pathname#+ which is slow.

Pathname.glob (class method) is different and may be roughtly equivelent to what we have now.

Copy link
Member

@MikeMcQuaid MikeMcQuaid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @apainintheneck!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants