Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in rayleigh correction related to Dask version >= 2021.5.1 #137

Closed
adybbroe opened this issue Dec 15, 2021 · 6 comments · Fixed by #138
Closed

Bug in rayleigh correction related to Dask version >= 2021.5.1 #137

adybbroe opened this issue Dec 15, 2021 · 6 comments · Fixed by #138
Assignees
Labels

Comments

@adybbroe
Copy link
Collaborator

Code Sample, a minimal, complete, and verifiable piece of code

Since Dask version 2021.5.1 the unittests fails. The searchsorted function is not supported by Dask anymore.

# Your code here
import numpy as np
from dask.array import from_array

wvl = 444.01239999999996
wvl_coord = np.array([440., 445.], dtype=np.float32)
dawvl_coord = from_array(wvl_coord)

idx = np.searchsorted(wvl_coord, wvl)
daidx = np.searchsorted(dawvl_coord, wvl)

This simple code fails on the last line with the error message:

--> 773     meta = np.searchsorted(a._meta, v._meta)
    774     out = blockwise(
    775         _searchsorted_block,

AttributeError: 'float' object has no attribute '_meta'

Problem description

Expected Output

The unittests should pass!

Actual Result, Traceback if applicable

Versions of Python, package at hand and relevant dependencies

dask                      2021.12.0          pyhd8ed1ab_0    conda-forge
h5py                      3.6.0           nompi_py39h7e08c79_100    conda-forge
hdf5                      1.12.1          nompi_h7f166f4_103    conda-forge
python                    3.9.7           hf930737_3_cpython    conda-forge
numpy                     1.21.4           py39hdbf815f_0    conda-forge

Thank you for reporting an issue !

@adybbroe adybbroe self-assigned this Dec 15, 2021
@adybbroe adybbroe added the bug label Dec 15, 2021
@adybbroe
Copy link
Collaborator Author

adybbroe commented Dec 15, 2021

In rayleigh.py in the get_reflectance method I believe this workaround solve the issue:

        if hasattr(wvl_coord, 'chunks'):
            idx = np.searchsorted(wvl_coord.compute(), wvl)
        else:
            idx = np.searchsorted(wvl_coord, wvl)

At least this makes the unittests pass.

@adybbroe
Copy link
Collaborator Author

Do we follow the above, or try to remove some of the Daskifying of the lut?
It is rather complicated the handling of the mutual support of both dask arrays and and pure numpy arrays in get_reflectance...

@djhoese
Copy link
Member

djhoese commented Dec 15, 2021

@adybbroe Remind me, what is the shape of wvl_coord and how is it generated? I think your workaround is a nice quick fix. I think the short term solution (but longer than your workaround) is to remove the dask-ification of the LUT values if possible. That's something I was playing with in #133, but was also more focused on putting everything into a map_blocks call. I think the long term solution may be to find a way to wrap the entire get_reflectance method (or a helper method that get_reflectance uses) in a decorator that we create that magically does the xarray[dask], xarray[numpy], dask, or numpy conversion and wraps things in a map_blocks call when possible. Then the internal code only needs to deal with numpy arrays. This is something Martin has asked for in pyorbital too after some of my optimizations in satpy.

@adybbroe
Copy link
Collaborator Author

@djhoese Shape is (81,) always atm, it is the wavelengths in nanometers read from the LUT table, from 400 to 800 (covering the visible spectrum) in steps of 5nm

@adybbroe
Copy link
Collaborator Author

adybbroe commented Dec 15, 2021

Sounds right, your strategy, though I would not know how to do that on my own. Perhaps I could do the above quick fix now and we could get @pnuu 's PRs merged and make a release, and then try follow your step 1 for a new release, and reserve step 2 for later?

@adybbroe
Copy link
Collaborator Author

Actually, I realized with the help of @mraspaud that the tests are wrong - the wvl_coord is never in reality a Dask array, it is always a numpy array. So, adjusting the tests to reflect real use cases fixes this.
Then I agree that the rayleigh.py and in particular the get_reflectance method needs refactoring. The method is way to long and complext, and the handling of dask versus non-dask use cases is opaque. Maybe better to treat that in a separate issue and PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants