Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up the Tap#formula_files method #18163

Closed

Conversation

apainintheneck
Copy link
Contributor

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew style with your changes locally?
  • Have you successfully run brew typecheck with your changes locally?
  • Have you successfully run brew tests with your changes locally?

We already know that files are going to be formula files if they are ruby files since we pull them out of the formula directory so that's all we really need to check for here. I was surprised that Pathname#extname was noticeably slower than turning it into a string and checking the end of it.

$ hyperfine --parameter-list branch master,speed-up-loading-cask-and-formula-file-names --warmup 5 --setup 'git switch {branch}' 'brew cat gimp'
Benchmark 1: brew cat gimp (branch = master)
  Time (mean ± σ):      2.195 s ±  0.031 s    [User: 1.395 s, System: 0.748 s]
  Range (min … max):    2.163 s …  2.273 s    10 runs

Benchmark 2: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
  Time (mean ± σ):      1.845 s ±  0.034 s    [User: 1.073 s, System: 0.718 s]
  Range (min … max):    1.814 s …  1.920 s    10 runs

Summary
  brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names) ran
    1.19 ± 0.03 times faster than brew cat gimp (branch = master)

This also makes it a bit simpler for me to resurrect #15489 since it means there is now one less reference to the Tap#formula_file? method.

@Bo98
Copy link
Member

Bo98 commented Aug 25, 2024

I was surprised that Pathname#extname was noticeably slower than turning it into a string and checking the end of it.

Pathname is pretty bad a lot of the time but in this particular case it may be from

bottle_ext, = HOMEBREW_BOTTLES_EXTNAME_REGEX.match(basename).to_a
return bottle_ext if bottle_ext
archive_ext = basename[/(\.(tar|cpio|pax)\.(gz|bz2|lz|xz|zst|Z))\Z/, 1]
return archive_ext if archive_ext
# Don't treat version numbers as extname.
return "" if basename.match?(/\b\d+\.\d+[^.]*\Z/) && !basename.end_with?(".7z")

You can test by comparing it to File.extname.

.to_s.ends_with? is technically incorrect if you had a dotfile named .rb, though maybe we don't care too much about that.

We already know that the directory and files are going to be formula files if they are ruby files
so that's all we really need to check for here. The other change is to use `File.extname` instead
of `Pathname#extname` which we assume is faster because it doesn't use the monkey-patched version
that handles some additional extension types with multiple periods.

```
$ hyperfine --parameter-list branch master,speed-up-loading-cask-and-formula-file-names --warmup 5 --setup 'git switch {branch}' 'brew cat gimp'
Benchmark 1: brew cat gimp (branch = master)
  Time (mean ± σ):      2.109 s ±  0.010 s    [User: 1.349 s, System: 0.726 s]
  Range (min … max):    2.097 s …  2.128 s    10 runs

Benchmark 2: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
  Time (mean ± σ):      1.795 s ±  0.011 s    [User: 1.061 s, System: 0.702 s]
  Range (min … max):    1.777 s …  1.813 s    10 runs

Summary
  brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names) ran
    1.17 ± 0.01 times faster than brew cat gimp (branch = master)
```
@apainintheneck apainintheneck force-pushed the speed-up-loading-cask-and-formula-file-names branch from b520eb9 to 8664972 Compare August 25, 2024 19:50
@apainintheneck
Copy link
Contributor Author

Good points all around. I'd forgotten that it'd potentially match a dotfile here when using String#ends_with? and I didn't know that we had a monkey-patched version of Pathname#extname. Switching to using File.extname provides a similar speed up though so this should be fine now.

$ hyperfine --parameter-list branch master,speed-up-loading-cask-and-formula-file-names --warmup 5 --setup 'git switch {branch}' 'brew cat gimp'
Benchmark 1: brew cat gimp (branch = master)
  Time (mean ± σ):      2.109 s ±  0.010 s    [User: 1.349 s, System: 0.726 s]
  Range (min … max):    2.097 s …  2.128 s    10 runs

Benchmark 2: brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names)
  Time (mean ± σ):      1.795 s ±  0.011 s    [User: 1.061 s, System: 0.702 s]
  Range (min … max):    1.777 s …  1.813 s    10 runs

Summary
  brew cat gimp (branch = speed-up-loading-cask-and-formula-file-names) ran
    1.17 ± 0.01 times faster than brew cat gimp (branch = master)

@apainintheneck apainintheneck marked this pull request as draft August 25, 2024 22:15
@apainintheneck
Copy link
Contributor Author

Converting to draft because I think I found an even better approach.

@apainintheneck
Copy link
Contributor Author

Closing in favor of #18165

@Bo98 Bo98 deleted the speed-up-loading-cask-and-formula-file-names branch August 26, 2024 00:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants