.load() only loading overlap between scenes with HLS data? #177

Open
ithaca-oyler opened this issue Oct 7, 2024 · 4 comments

Comments

@ithaca-oyler

I'm currently using HLS data (both the HLSL30.v2.0 and HLSS30.v2.0 collections) for a short time frame (5-16-2024 through 5-18-2024), and I noticed that only areas where scenes from both collections overlap still have imagery in the final output. I'm at a bit of a loss as to why this is occurring, but I suspect it has to do with the way the lazy xarray dataset is built. I thought setting the nodata value might help (as mentioned in #162), but the output did not change. The HLS config contains information about band names, band values, cloud masking information, etc.

I am currently using odc-geo==0.4.6 and odc-stac==0.3.9.

This image shows what we expect to see:
image

And this image shows what is loaded with:

data = stac.stac_load(item_collection, geopolygon=roi, chunks={"x": "auto", "y": "auto", "time": "auto"}, output_crs=crs, dtype='int16', resolution=30, stac_cfg=hls_cfg)

image

I know there are a lot of moving parts in this issue, but any insights as to why something like this might be happening will be very helpful :) Thank you!

@Kirill888
Member

@ithaca-oyler please provide replication code: what STAC query you are running, how you are calling stac load, etc. This certainly looks like a broken nodata marker. Do the data files have the expected nodata field set? By default we trust the nodata marker in the file over all other places. Do you get the same problem when running without Dask?

@ithaca-oyler
Author

Hi @Kirill888 -- attached is a zip file that contains a sample notebook, the config file, and an example of what one of the metadata files looks like. According to the metadata, the data files are expected to have nodata set to -9999 (listed as FILLVALUE in the metadata). The output images shown in my previous comment are the outputs from odc.algo xr_quantile at the end of the notebook. I have tried several different combinations of setting the nodata value in the config file, in the stac.load() call, and in xr_quantile(), with no good results.
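For readers following along, here is a minimal NumPy sketch of one mechanism that can produce an overlap-only result after a reduction over time. It is not confirmed as the cause in this issue, and it does not use odc.algo or odc-stac at all. The two tiny "scenes" and their values are made up for illustration; only the fill value -9999 comes from the metadata mentioned above. The point is that a reduction that is not NaN-aware returns a value only where every time slice has data, which is exactly the overlap.

```python
import numpy as np

FILL = -9999  # FILLVALUE reported in the HLS metadata

# Two made-up scenes on the same 1x4 grid; they overlap only in columns 1 and 2.
scene_a = np.array([[100, 200, 300, FILL]], dtype="int16")
scene_b = np.array([[FILL, 220, 320, 420]], dtype="int16")

# Stack along a leading "time" axis and turn fill pixels into NaN.
stack = np.stack([scene_a, scene_b]).astype("float32")
stack[stack == FILL] = np.nan

# A plain quantile propagates NaN: a pixel survives only if ALL slices are valid.
strict = np.quantile(stack, 0.5, axis=0)

# A NaN-aware quantile ignores missing slices: a pixel survives if ANY slice is valid.
nan_aware = np.nanquantile(stack, 0.5, axis=0)

print("strict   :", strict)
print("nan-aware:", nan_aware)
```

If the real pipeline behaves like the `strict` case, data outside the overlap would disappear even when each input scene loads correctly, so it may be worth checking the per-scene output before the quantile step.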

replication-code.zip

@Kirill888
Member

@ithaca-oyler the code does way too much and requires a complex environment to run. Please provide a simpler example using a single band only. My guess is that the data bands probably work fine, but the Fmask band does not; you need to specify nodata=255 for it. Please debug more on your end.
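As a rough illustration of that suggestion, a per-asset nodata override in a `stac_cfg`-style dictionary could look like the sketch below. The actual HLS config from the zip file is not shown in this thread, so everything here is an assumption: the collection IDs are taken from the issue text, the `Fmask` asset key and the `uint8` data type are guesses, and `make_hls_cfg` is a hypothetical helper, not part of odc-stac.

```python
# Collection IDs as named in the issue; adjust if your STAC items differ.
HLS_COLLECTIONS = ("HLSL30.v2.0", "HLSS30.v2.0")


def make_hls_cfg(data_nodata=-9999, fmask_nodata=255):
    """Build a per-collection config (hypothetical helper, for illustration)."""
    cfg = {}
    for name in HLS_COLLECTIONS:
        cfg[name] = {
            "assets": {
                # Default for all data bands: int16 with the FILLVALUE from metadata.
                "*": {"data_type": "int16", "nodata": data_nodata},
                # Assumed asset key and dtype for the quality band.
                "Fmask": {"data_type": "uint8", "nodata": fmask_nodata},
            }
        }
    return cfg


hls_cfg = make_hls_cfg()
print(hls_cfg["HLSL30.v2.0"]["assets"]["Fmask"])
```

The resulting dictionary would be passed as `stac_cfg=hls_cfg`, as in the load call from the original post. Note that a `dtype='int16'` argument applied to every band could conflict with a `uint8` quality band, so loading Fmask separately may be simpler when debugging.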

@ithaca-oyler
Author

@Kirill888 Thank you for your patience while I put together a simpler example for you. I'm taking out the cloud masking step, but still making the change you suggested with specifying nodata=255 in the HLS config I was using. I hope to have an updated notebook made in the next day or so. In the meantime, I have done some additional debugging and I think you're on the right track with part of the issue being in the Fmask band. Fixing some of the HLS config helped and brought back some of the pieces of data that are between the HLS tiles.

Here's what specifying nodata=255 and removing the cloud masking step resulted in:
image

I also decided to experiment with the chunks setting in stac.load(), and I was surprised to find that changing this setting produced different results. For the image above, the setting was chunks={"x": "auto", "y": "auto", "time": "auto"}. Here, I changed it to chunks={"x": 512, "y": 512, "time": "auto"}

image

And here I changed it to chunks={"x": 1024, "y": 1024,"time":"auto"}
image

Do you have any thoughts as to why the output might be changing with the chunk sizes?
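Rather than comparing the images by eye, the difference between two chunk settings can be quantified. The sketch below is a generic NumPy helper, not an odc-stac feature; `compare_loads` is a hypothetical name, and the two small arrays stand in for the same band and time slice loaded two different ways. It counts pixels that are valid in only one result and pixels that are valid in both but disagree in value.

```python
import numpy as np


def compare_loads(a, b, nodata):
    """Summarise how two loads of the same band and area differ."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    valid_a = a != nodata
    valid_b = b != nodata
    both = valid_a & valid_b
    return {
        "valid_only_in_a": int((valid_a & ~valid_b).sum()),
        "valid_only_in_b": int((~valid_a & valid_b).sum()),
        "value_mismatch": int((both & (a != b)).sum()),
    }


NODATA = -9999
# Made-up stand-ins for one band loaded eagerly and with a chunked setting.
eager = np.array([[10, 20, NODATA], [30, 40, 50]], dtype="int16")
chunked = np.array([[10, NODATA, NODATA], [30, 41, 50]], dtype="int16")

report = compare_loads(eager, chunked, NODATA)
print(report)
```

With real data, the inputs would be the computed arrays for a single band and time step from each load, ideally with an eager (non-Dask) load as the baseline, as suggested earlier in the thread. A non-zero `valid_only_in_*` count points at missing coverage, while `value_mismatch` points at pixels that were read or fused differently.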


2 participants