Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add WarpBatch - array process warp with single callback #651

Merged
merged 1 commit into from
Dec 17, 2023

Conversation

wrongbad
Copy link
Contributor

@wrongbad wrongbad commented Dec 14, 2023

I introduce WarpBatch for both Manifold and CrossSection.

The new callback signature is void(manifold::VecView<glm::vec3>) to allow for arbitrary contiguous storage. This moves the per-vertex tight-loop fully into the callback, where math vectorizations can now apply.

This is especially helpful for python users where scalar math is vastly slower, but array math can use compiled libs like numpy.

Copy link

codecov bot commented Dec 14, 2023

Codecov Report

Attention: 8 lines in your changes are missing coverage. Please review.

Comparison is base (eba866d) 91.61% compared to head (fa6d6d7) 91.66%.
Report is 2 commits behind head on master.

Files Patch % Lines
src/utilities/include/vec_view.h 71.42% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #651      +/-   ##
==========================================
+ Coverage   91.61%   91.66%   +0.04%     
==========================================
  Files          36       37       +1     
  Lines        4724     4737      +13     
==========================================
+ Hits         4328     4342      +14     
+ Misses        396      395       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wrongbad wrongbad force-pushed the warp_batch branch 3 times, most recently from 77b5cd8 to 36636c1 Compare December 14, 2023 01:11
@elalish
Copy link
Owner

elalish commented Dec 14, 2023

This looks good to me - would you mind adding a python example for this? Perhaps convert our Heart or Torus Knot examples? It would also be a good excuse to see a performance comparison between the two warp approaches, which it would be nice to record here.

I'm also curious if we should do the same thing for our JS bindings. Though perhaps WASM has less overhead in that tight loop, but I'm not sure. @pca006132 this also seems related to some discussions we've had around SDF. Does this seem like a useful pattern there as well?

@pca006132
Copy link
Collaborator

SDF: #598

Yes this is also a useful pattern for SDF, but I think we have more things to consider for SDF in the previous PR.
This PR looks good to me as is.

@wrongbad wrongbad force-pushed the warp_batch branch 2 times, most recently from 2e07a18 to fb3a13d Compare December 14, 2023 14:21
@wrongbad
Copy link
Contributor Author

I added torus_knot.py showing fully vectorized warp math. We're gonna convert you to a python lover :P

I'm also curious if we should do the same thing for our JS bindings

If you have the time, sketch it up and run some timings on torus knot with high segment numbers. I'm sure the JIT helps a lot but even fully native code is punished by high numbers of function-pointer calls.

@wrongbad
Copy link
Contributor Author

Is there a helper script somewhere to auto-format everything? I've been reverse-engineering the commands to run from the CI outputs, and it's a bit of a pain.

@pca006132
Copy link
Collaborator

Similar thing can help for js, the problem is that JIT is not good at optimizing across language boundaries.

We don't have a script right now, but it is just using the black formatter for python files.

@wrongbad
Copy link
Contributor Author

On my laptop, using the exact same numpy warp code but running 1 point at a time (Manifold.warp) it takes ~800ms. Running warp_batch takes about 40-50ms.

@pca006132
Copy link
Collaborator

I think black is asking you to reformat bindings/python/examples/torus_knot.py

@wrongbad
Copy link
Contributor Author

Yeah, I think I have it figured out now. I just mean having to download black and learn how to use it based on failure reports. It was worse with clang-format - I figured out I needed to downgrade versions to get consistent with CI. A script would save some effort, especially for new people getting set up.

@pca006132
Copy link
Collaborator

Interesting, I thought clang-format should have consistent settings across versions. Agreed, we should pin the detailed format settings and provide a script to format everything. Can be addressed in another PR.

@wrongbad wrongbad marked this pull request as draft December 15, 2023 19:41
@wrongbad
Copy link
Contributor Author

I'd like to change the function signature to void(VecView<glm::vec3>) - it supports range-for-loops, and also plays better with python bindings auto-casting to/from numpy.

@wrongbad wrongbad force-pushed the warp_batch branch 4 times, most recently from 0f219ce to 69a4acf Compare December 15, 2023 20:11
@wrongbad wrongbad marked this pull request as ready for review December 15, 2023 20:11
@wrongbad wrongbad force-pushed the warp_batch branch 2 times, most recently from 514fb72 to a25a28d Compare December 15, 2023 22:12
Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some nits.

src/cross_section/src/cross_section.cpp Outdated Show resolved Hide resolved
CrossSection CrossSection::WarpBatch(
std::function<void(VecView<glm::vec2>)> warpFunc) const {
std::vector<glm::vec2> tmp_verts;
C2::PathsD paths = GetPaths()->paths_; // deep copy
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this line the deep copy, or the for loops following?

Copy link
Contributor Author

@wrongbad wrongbad Dec 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copy it up front, then use it to update in-place after the warp func. Mainly because it allocates all the right vector sizes in one simple line of code. Are you concerned with the wasted data-copy doing it this way? I could optimize it if you think the complexity is worth it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, I guess Emmett is just not sure about where the deep copy comment belongs to (the single line or including the loop after it).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it kind of looked like it was just a one-line deep copy followed by a loop deep copy - I'm still not quite sure why.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I think I understand now. It's all good.

std::vector<glm::vec2> tmp_verts;
C2::PathsD paths = GetPaths()->paths_; // deep copy
for (C2::PathD& path : paths) {
for (C2::PointD& p : path) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our style is to use const anywhere is possibly can be (mostly for readability). Mind taking a pass through the PR to constify it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed these and don't see any more, but let me know if I missed any.

src/utilities/include/vec_view.h Outdated Show resolved Hide resolved
@elalish
Copy link
Owner

elalish commented Dec 16, 2023

Regarding formatting, we generally use VSCode and their related extensions. I think that keeps things fairly synced, probably since MS also owns Github and Actions. What dev environment are you using?

@pca006132
Copy link
Collaborator

No, vscode extension only calls the clang-format in your environment, so microsoft does not control what formatter you use :)

@elalish
Copy link
Owner

elalish commented Dec 16, 2023

I'm on clang-format v17 on my mac and it's working fine. What was the problem version you had, @wrongbad?

@wrongbad
Copy link
Contributor Author

wrongbad commented Dec 16, 2023

I was also using v17 on mac and it was formatting this line in a way that didn't pass the CI, but v11 was working for me.
v11, v12 do this

  nb::capsule mem_mgr(buffer, [](void *p) noexcept { delete[](T *) p; });

v17 does this

nb::capsule mem_mgr(buffer, [](void *p) noexcept { delete[] (T *)p; });

@pca006132
Copy link
Collaborator

Yes I also saw the same problem with clang-format 15. I think it changes when upgrading from 14 to 15.

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Minor nit - for ease of review I'd appreciate it if you refrain from force-pushing. I know lots of people use rebase to avoid merge commits, but we do a squash and merge anyway so it makes no difference. If we stick to regular merging, then we can more easily see what changed since the last review.

@elalish elalish merged commit bc0bded into elalish:master Dec 17, 2023
18 checks passed
@elalish elalish mentioned this pull request Jan 7, 2024
cartesian-theatrics pushed a commit to SovereignShop/manifold that referenced this pull request Mar 11, 2024
Add WarpBatch - array process warp with single callback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants