
SPH projections, slices, gridding: backend and off-axis #4939

Merged: 64 commits merged into yt-project:main on Sep 16, 2024

Conversation

@nastasha-w (Contributor) commented Jul 4, 2024

Fixes issues in the SPH projection backend and adds off-axis projection for SPH particles.

PR Summary

The current version removes the addition of small SPH particles' contributions to column densities along lines of sight that those particles do not intersect. Additionally, this version supports off-axis projections for SPH datasets.
Similar issues in slice plots and in grid value interpolation (the 'scatter' method) for SPH datasets are still TODO. The tests currently live in a separate script, sph_pixelization_tests_2024-07-03.py.zip, so that I can run them against both the main and issue branches.

This addresses some of the points in issue #4788; it should eventually address all of them.
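
To illustrate the idea (a simplified NumPy sketch with a flat kernel and hypothetical names, not the actual Cython backend): a particle should only contribute to a pixel whose line of sight its smoothing kernel actually intersects.

import numpy as np

def column_density_map(pos_xy, hsml, weight, xedges, yedges):
    # Accumulate line-of-sight contributions on a pixel grid. A particle
    # only adds to pixels whose center lies within its smoothing length
    # hsml (distances measured in the image plane).
    xc = 0.5 * (xedges[:-1] + xedges[1:])
    yc = 0.5 * (yedges[:-1] + yedges[1:])
    out = np.zeros((yc.size, xc.size))
    for (px, py), h, w in zip(pos_xy, hsml, weight):
        d2 = (xc[None, :] - px) ** 2 + (yc[:, None] - py) ** 2
        mask = d2 < h * h  # skip lines of sight the kernel does not intersect
        out[mask] += w / (np.pi * h * h)  # flat stand-in for the real SPH kernel
    return out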

PR Checklist

  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Fix issues in axis-aligned slice plots for SPH datasets
  • Add off-axis slice plots for SPH datasets
  • Fix issues in 'scatter'-method gridding of SPH data
  • Merge up-to-date main branch into this one
  • Add tests to the yt codebase.

@nastasha-w
Copy link
Contributor Author

Some nice test images for the off-axis projection: the gizmo_mhd_mwdisk projected along the x- and z-axes with axis-aligned projections:
sph_projection_test_lineofsight_x.pdf
sph_projection_test_lineofsight_z.pdf
and that same galaxy projected along (marginally off-axis) directions going from the x-axis to the z-axis in nine steps. Here, unlike for the axis-aligned plots, there is an option to set the image y-direction (north_vector), which in this case is always the simulation y-axis:
sph_projection_test_lineofsight_step0of9.pdf
sph_projection_test_lineofsight_step1of9.pdf
sph_projection_test_lineofsight_step2of9.pdf
sph_projection_test_lineofsight_step3of9.pdf
sph_projection_test_lineofsight_step4of9.pdf
sph_projection_test_lineofsight_step5of9.pdf
sph_projection_test_lineofsight_step6of9.pdf
sph_projection_test_lineofsight_step7of9.pdf
sph_projection_test_lineofsight_step8of9.pdf

I haven't added these as formal answer tests; they're just a visual verification that the outcomes make sense on real data.
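
For context, images like these can be produced with something along these lines (a sketch only: the dataset path is a placeholder, and the exact marginally off-axis directions used above are not reproduced; the normal is simply rotated from x toward z):

import numpy as np
import yt

ds = yt.load("gizmo_mhd_mwdisk/gizmo_mhd_mwdisk.hdf5")  # placeholder path

for i in range(9):
    theta = 0.5 * np.pi * i / 8  # rotate the line of sight from x toward z
    normal = (np.cos(theta), 0.0, np.sin(theta))
    p = yt.OffAxisProjectionPlot(
        ds,
        normal,
        ("gas", "density"),
        north_vector=(0.0, 1.0, 0.0),  # keep the simulation y-axis as image "up"
    )
    p.save(f"sph_projection_test_lineofsight_step{i}of9.png")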

@neutrinoceros (Member)

@nastasha-w looks like there are a couple of defects at the moment. If you need a hand to get CI green, let me know!

@neutrinoceros added the "enhancement (Making something better)" and "index: particle" labels on Jul 4, 2024
@nastasha-w (Contributor, Author)

Thanks @neutrinoceros! I think the main issue now is that I branched this off main about 6 months ago, so I'll try merging the latest version in first. I do also think one or a few answer tests from the previous version should fail: these use the old SPH axis-aligned projections.

@neutrinoceros (Member)

I think the main issue now is that I branched this off main like 6 months ago

Nothing a rebase can't fix. It shouldn't be risky actually since you don't have any conflicts.

@nastasha-w (Contributor, Author)

pre-commit.ci autofix

(2 similar "pre-commit.ci autofix" comments from @nastasha-w followed.)

@nastasha-w (Contributor, Author)

@neutrinoceros, could you take a look at workflow approval for this PR? I've been submitting a lot of commits recently, and I suspect it stopped running everything automatically for that reason. Basically, I've been wrestling with the type-checker a lot today and yesterday.
I think I've got the main things working now, but I was also getting some failures on new tests I added due to divide-by-zero warnings. (The zero division occurs in results I'm calculating to compare to the main yt calculations, and I explicitly check for those situations before making the comparison.)
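
For reference, the kind of guard meant here is simply masking out the zero denominators before comparing. Below is a minimal sketch with made-up names, not the actual test code:

import numpy as np

def relative_diff_where_defined(expected, actual):
    # Compare only where the reference value is nonzero; where it is zero,
    # require the other result to be exactly zero as well.
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    nonzero = expected != 0.0
    assert np.all(actual[~nonzero] == 0.0)
    return np.abs(actual[nonzero] / expected[nonzero] - 1.0)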

@neutrinoceros (Member)

Plot twist: it never ran automatically; it's always been me manually approving. 😅
You need to have at least one commit already in the history for CI to run automatically. Maybe you could open another, much smaller PR that we can merge quickly?

@neutrinoceros (Member)

I took the liberty of pushing to your branch to resolve 2 small bugs that caused a few hundred tests to fail. I'm hoping to get a much clearer picture of what's still broken after that.

@neutrinoceros force-pushed the sph_proj_backend branch 2 times, most recently from ccd3beb to 9fa337d on July 19, 2024 09:00
@neutrinoceros (Member)

On the GitHub Actions side we're left with 81 failures that all seem to have one root cause: a divide-by-zero error. Let me know if you need a hand there too; otherwise I'll let you fix it.

@nastasha-w (Contributor, Author)

@neutrinoceros Thanks for the help! I think I've got those divide-by-zero errors fixed now. I'll try to follow the instructions on the website for updating the image tests.

@nastasha-w (Contributor, Author)

pre-commit.ci autofix

@jzuhone (Contributor) commented Jul 19, 2024

@nastasha-w just a note that I have been keeping track of this and I plan on reviewing it once this has been sorted out.

@nastasha-w (Contributor, Author)

@jzuhone Thanks! It feels like I'm getting close, but then it's felt like that for a few days already.

@nastasha-w (Contributor, Author)

pre-commit.ci autofix

@nastasha-w (Contributor, Author)

@neutrinoceros could you help me with the image tests? Under the full nose test suite, the only failing tests now seem to be comparisons of previously generated images of SPH projections against the figures currently produced. Changes are expected there because I changed that backend.

I have looked at the instructions in the docs for updating these tests here, but they don't seem to work for me. I downloaded the image overview from the "Build and Tests" job summary page, but it seems like that doesn't cover the "full testsuite (nose)" tests where I'm getting the errors. This section here suggests that updating the answer_version attribute of the relevant test class (in this case PixelizedParticleProjectionValuesTest, if I'm not mistaken) should work, but it isn't clear to me what the current test version is or how to increase it, since that attribute is only mentioned in a few places in the codebase: the docs I mentioned, a method in conftest.py that gets stored and new file names, and one class (TestGDF) that seems to define this attribute explicitly. Could you let me know how to update the baseline for these image tests?
Also, I don't know why the docs test is failing; it passed the last test suite and I didn't change anything about the documentation, or any docstrings, since then. I am seeing a "Jenkins is stupid" file though; is there some issue in the testing process here, or do I need to do some digging?

@neutrinoceros (Member)

I have looked into the instructions in the docs for updating these tests here, but these don't seem to work for me.

Sorry if this is confusing: those are the instructions for updating pytest-based image tests; however, Jenkins is still running nosetests for tests that require real (heavy) datasets.

This section here suggests updating the answer_version attribute of the relevant test class

that's even worse... I believe this part of our documentation was never actually in sync with our practice (it was introduced as part of a substantial but incomplete effort to migrate our test suite to pytest). Disregard it entirely.

So the actual way we update answers with nose is by bumping version numbers in tests/tests.yaml, which triggers a regeneration of baselines on Jenkins.
However, it is customary not to do that until a reviewer explicitly approves of it, so the next logical step is for someone to take a look and check that all changes are both expected and desired. I have only looked at a couple myself, and I'll admit that the new images are not clearly better to my eye; I'm not even sure that I can rule out a bug.
For instance, here's one:

before (as of yt 4.3)

after (as of this branch)

Can you explain what's going on?

@neutrinoceros (Member)

And about docs builds:

Also, I don't know why the docs test is failing; it passed the last test suite and I didn't change anything about the documentation, or any docstrings, since then. I am seeing a "Jenkins is stupid" file though; is there some issue in the testing process here, or do I need to do some digging?

Looking at the console output, it is also not clear to me that the error is actually related to your PR. Something to be aware of is that we only run docs builds on PRs that actively modify or introduce documentation, so it's possible for a bug to slip in via a PR that doesn't touch the docs but still breaks documentation builds. That seems unlikely here, given that the only PR merged between your last successful run and now is #4945, which looks even less related.
I suspected that sphinx's new version (7.4.x) might have caused it, but looking at the logs I see that pip resolved to sphinx 7.3.7 each time, so that's not it either (though I'm not excluding that some other dependency changed in the interval).

@neutrinoceros (Member)

Here's what a systematic diff of the resolved dependencies shows:

35c35
< dask-2024.7.0
---
> dask-2024.7.1
137c137
< pytest-8.2.2
---
> pytest-8.3.1
166c166
< sphinxcontrib-htmlhelp-2.0.5
---
> sphinxcontrib-htmlhelp-2.0.6
169c169
< sphinxcontrib-qthelp-1.0.7
---
> sphinxcontrib-qthelp-1.0.8

maybe we could try pinning

 sphinxcontrib-htmlhelp==2.0.5
 sphinxcontrib-qthelp==1.0.7

and see what happens. In fact, I'll try this now (again, taking the liberty to push to your branch)

@neutrinoceros (Member)

Looks like it indeed stabilized the builds. I'll be working on a finer and more stable solution in #4952

@neutrinoceros (Member)

Also see #4950 about the (unrelated) failure seen on Windows

@neutrinoceros (Member) commented Jul 21, 2024

To my surprise, I couldn't reproduce the failure in #4952. Let me try to remove my last commit here.

@nastasha-w (Contributor, Author)

With the test number bump, many of the previously failing tests now pass. However, the arepo frontend tests specifically fail with the error "There is no old answer available.". Does anyone have an idea what the problem could be, or how to fix it?
To be clear, this is just arepo; the owls, tipsy, gadget, and gizmo frontend tests pass after increasing the test number.

@neutrinoceros (Member)

Oh, could this be showing some form of corruption in the database, created by not letting the first job finish?
@Xarthisius?

@jzuhone (Contributor) commented Aug 21, 2024

@neutrinoceros that's almost certainly what it is. @Xarthisius needs to fix it.

@nastasha-w (Contributor, Author)

The checks passed!!

@jzuhone (Contributor) commented Aug 22, 2024

@chummels said he was going to have a look at this.

@jzuhone self-requested a review on August 22, 2024 13:15
@neutrinoceros (Member)

Let's move it to the 4.4.0 milestone then 🤞🏻

@neutrinoceros added this to the 4.4.0 milestone on Aug 22, 2024
@jzuhone (Contributor) commented Sep 6, 2024

I'm going to merge this by COB Monday unless someone explicitly jumps in. We have two approvals and it's ready to go.

@chummels (Member)

I'm looking at this today and will have comments ASAP.

@chummels (Member)

I'm running tests locally and not seeing any difference in behavior for OffAxisProjections and ProjectionPlots between the dev tip and this PR. I'm using the FIRE m12i z=0 snapshot at a variety of resolutions, for both the ('gas', 'density') and ('gas', 'H_p0_number_density') fields, but getting identical results, so I must be doing something wrong on my side. I would have thought that as I stepped out to something like width=(500, 'kpc') for the projection, some of the smaller particles would get dropped, as per the description in the PR, but the results look identical. I'm just doing a sanity check on all of this, since the PR is so detailed that it is difficult to review thoroughly by code review alone.

@neutrinoceros (Member)

@chummels just to get this out of the way: did you make sure to re-compile after switching branches?

@nastasha-w (Contributor, Author)

@chummels since you mentioned the width, I also wonder whether you're increasing the grid size (number of pixels) in tandem with the image width (i.e., constant width per pixel), or increasing the grid size with constant image width (decreasing width per pixel). The first would not give a resolution effect with either method, and the differences between the methods would depend on which width per pixel you chose (larger -> bigger difference). The second would give the effect I was worried about. (I'm guessing you thought of this, it's just to be sure.)
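
In yt terms, the two scenarios could be set up roughly like this (an illustrative sketch only; the dataset path and numbers are placeholders, and buff_size controls the number of image pixels):

import yt

ds = yt.load("FIRE_M12i_ref11/snapshot_600.hdf5")  # placeholder path
field = ("gas", "density")

# baseline: 250 kpc across 800x800 pixels
base = yt.ProjectionPlot(ds, "z", field, width=(250, "kpc"), buff_size=(800, 800))

# scenario 1: double the width AND the pixel count -> same kpc per pixel,
# so little difference between the old and new backends is expected
wide_same_res = yt.ProjectionPlot(ds, "z", field, width=(500, "kpc"), buff_size=(1600, 1600))

# scenario 2: double the width at a fixed pixel count -> larger kpc per pixel,
# which is where the old and new backends should start to differ
wide_coarse = yt.ProjectionPlot(ds, "z", field, width=(500, "kpc"), buff_size=(800, 800))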

@chummels (Member)

@chummels just to get this out of the way: did you make sure to re-compile after switching branches ?

Yes, I switch between the branches, then run pip install -e . to recompile the code before re-running my script.

@chummels (Member)

@chummels since you mentioned the width, I also wonder whether you're increasing the grid size (number of pixels) in tandem with the image width (i.e., constant width per pixel), or increasing the grid size with constant image width (decreasing width per pixel). The first would not give a resolution effect with either method, and the differences between the methods would depend on which width per pixel you chose (larger -> bigger difference). The second would give the effect I was worried about. (I'm guessing you thought of this, it's just to be sure.)

I was merely changing the width of the ProjectionPlot and not explicitly changing the number of pixels in the resulting image. It seems to me that changing the width of the ProjectionPlot while keeping the number of pixels fixed would change the physical pixel size. My understanding from the PR was that, as the pixel size became increasingly large relative to the size of the fluid elements, small fluid elements would be dropped from each pixel's line integral and the image would change.

@nastasha-w (Contributor, Author)

Yes, changing the width and keeping the number of pixels fixed should lead to resolution-dependent effects. I'm just not sure it would be as easy to see those effects by eye. Are you comparing pixel values directly here?
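
One way to compare pixel values directly rather than by eye is to pull a fixed-resolution buffer from the projection on each branch and diff the arrays afterwards. A minimal sketch, with a placeholder dataset path and arbitrary width/resolution:

import numpy as np
import yt

ds = yt.load("FIRE_M12i_ref11/snapshot_600.hdf5")  # placeholder path

prj = ds.proj(("gas", "density"), "z")
frb = prj.to_frb((500.0, "kpc"), 800)  # 800x800 fixed-resolution buffer
img = np.array(frb[("gas", "density")])

# save this array on each branch, then compare offline, e.g. with
# np.max(np.abs(img_pr / img_main - 1))
np.save("proj_density_z_500kpc.npy", img)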

@nastasha-w (Contributor, Author)

I tried a similar test myself:

import numpy as np

projaxes_cam = ["x", "z", (np.sqrt(0.5), 0., np.sqrt(0.5))]  # two on-axis, one off-axis direction
widths_cam = [62.5, 125., 250., 500.]  # image widths in kpc
qtys_cam = [('gas', 'density'), ('gas', 'H_p0_number_density')]

running on the sph_proj_backend (this PR) and main (as of 2024-09-11) branches, with the gizmo_mhd_mwdisk dataset.

The first difference I found is that the off-axis (np.sqrt(0.5), 0., np.sqrt(0.5)) projection axis only works on the sph_proj_backend branch; this is expected, since off-axis projections were not previously supported for SPH datasets in yt. I therefore limited the comparison to x- and z-axis projections.
There, I did find the expected differences: as I zoom out to wider views, the difference between the images generated with the two branches increases, especially within the galaxy, where the densities are highest and the smoothing lengths are therefore smallest.

imgcomp_gizmo_mwdisk_xax_gasdenssph_proj_backend_20240911_main_20240911
imgcomp_gizmo_mwdisk_xax_h1denssph_proj_backend_20240911_main_20240911
imgcomp_gizmo_mwdisk_zax_gasdenssph_proj_backend_20240911_main_20240911
imgcomp_gizmo_mwdisk_zax_h1denssph_proj_backend_20240911_main_20240911

@nastasha-w (Contributor, Author)

I also downloaded the FIRE m12i data from the yt hub (FIRE_M12i_ref11), and got the following images for x and z projections of gas density and H I number density:
imgcomp_fire_m12i_xax_gasdenssph_proj_backend_20240911_main_20240911
imgcomp_fire_m12i_xax_h1denssph_proj_backend_20240911_main_20240911
imgcomp_fire_m12i_zax_gasdenssph_proj_backend_20240911_main_20240911
imgcomp_fire_m12i_zax_h1denssph_proj_backend_20240911_main_20240911
At first glance, there does not appear to be a difference between the top and middle rows, but at least in the two rightmost (most zoomed-out) images in all four of these comparisons, there are in fact small differences between the branches at the image centers. These are barely, if at all, discernible in the small PNG panes here, but I could see them using the zoom tool in interactive matplotlib plotting.
@chummels Is it possible that those differences in a small area are also present in your tests?

@nastasha-w (Contributor, Author)

Here are those same tests, but showing only the part of the image corresponding to the highest zoom level in each panel. The differences are more apparent here, including that the central 'dot' in the main-branch images seems to get brighter as the resolution decreases (left to right), while it just gets 'fuzzier' on the sph_proj_backend branch.

imgcomp_fire_m12i_xax_gasdenssph_proj_backend_20240911_main_20240911_samezoom
imgcomp_fire_m12i_xax_h1denssph_proj_backend_20240911_main_20240911_samezoom
imgcomp_fire_m12i_zax_gasdenssph_proj_backend_20240911_main_20240911_samezoom
imgcomp_fire_m12i_zax_h1denssph_proj_backend_20240911_main_20240911_samezoom

@JBorrow (Contributor) commented Sep 11, 2024

It's naturally harder to see these kinds of artifacts when looking in logarithmic density space. However, the inaccuracies are always there (as shown by Natasha's plots). I ran into this problem when looking at convergence in this paper: https://ui.adsabs.harvard.edu/abs/2021arXiv210605281B/abstract (referenced in the other thread). Specifically, this figure:

[Figure from the paper linked above]

Looking at the two, you don't see differences immediately, but the numerical discrepancies are there. They depend sensitively on the ratio between the typical smoothing length and the pixel size, as Natasha's plots show.

@chummels (Member)

To follow up on this: I guess my install of yt was quite old indeed. I removed yt, conda, and all dependencies and reinstalled everything from scratch, and now I see that this PR gives the desired behavior. Here are some quick-and-dirty plots showing that this PR does in fact address the resolution issue with consistent behavior.

FIRE_m12i:

FIRE_comp

AREPO Cluster:

AREPO_comp

Nominally, I think everything is good to go but maybe best to chat in a few hours about all of this before merge?

@nastasha-w (Contributor, Author)

@chummels yeah, I think this would be a good discussion to have, about which methods are desirable if nothing else. I'm glad it worked out, but yeah, those installations of different versions are consistently a pain in the heinie.

@jzuhone (Contributor) commented Sep 14, 2024

Where did we land with this at the meeting?

@nastasha-w (Contributor, Author) commented Sep 14, 2024 via email

@chummels merged commit cadddb6 into yt-project:main on Sep 16, 2024
14 checks passed