Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mhs/das 2108/handle no data #6

Merged
merged 17 commits into from
Apr 17, 2024
Merged

Conversation

flamingbear
Copy link
Member

@flamingbear flamingbear commented Apr 16, 2024

Description

This PR updates HyBIG to appropriately handle masked data. Rioxarray's mask_and_scale masks bad data some details here
This PR ensures that data that comes in as "bad" according to rioxarray is transparent, i.e. not shown, in the final output image.

This corrects a bug where data with nans was normalized incorrectly leading to all black output images.

Code and tests were updated to ensure any image, 1 band raster without associated colortable, 1band with colortable, a 3band raster and 4band raster are all tested with missing/bad data.

Jira Issue ID

DAS-2108

Local Test Steps

build and run the image and tests.

❯ ./bin/build-image && ./bin/build-test && ./bin/run-test

Deploy the new image to your local Harmony-In-A-Box.

This has been associated with HGA again and can't be tested in harmony at the moment.
Ensure you're on the VPN, associate HyBIG with C1260737748-ASDC_DEV2 and Test PREFIRE data with this command:
http://localhost:3000/C1260737748-ASDC_DEV2/ogc-api-coverages/1.0.0/collections/all/coverage/rangeset?forceAsync=true&granuleId=G1262401617-ASDC_DEV2&format=image%2Fpng&outputCrs=EPSG%3A4326&skipPreview=true

Alternatively download a test file here:
https://data.asdc.sit.earthdata.nasa.gov/asdc2-sit-protected/browse/PREFIRE/PREFIRE_SAT2_2B-FLX_R00/2021.07.21/PREFIRE_SAT2_2B-FLX_S07_R00_20210721044449_03042.nc.G00.tif

And run it through this gist.
https://gist.github.com/flamingbear/ce1a41962a65766c5230bab736cc1002

visualize the output in qgis.

PR Acceptance Checklist

  • Jira ticket acceptance criteria met.
  • CHANGELOG.md updated to include high level summary of PR changes.
  • docker/service_version.txt updated if publishing a release.
  • Tests added/updated and passing.
  • [N/A] Documentation updated (if needed).

test_convert_singleband_to_raster_without_colortable is updated to use a data
array directly as well as include a missing data check.

Missing data coded as np.nan is replaced by the TRANSPARENT_RGBA value (0,0,0,0)
@flamingbear flamingbear marked this pull request as ready for review April 16, 2024 15:26
Copy link
Member

@owenlittlejohns owenlittlejohns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I ran the example from the description and instead of getting entirely black tiles, I got mostly transparent tiles with trajectories/swaths of populated data on them.

I had one question (that probably won't lead to any changes) about the range, but that's mainly for my own understanding.

harmony_browse_image_generator/browse.py Show resolved Hide resolved
harmony_browse_image_generator/browse.py Show resolved Hide resolved
harmony_browse_image_generator/browse.py Show resolved Hide resolved
@@ -7,6 +7,8 @@

import requests
from harmony.message import Source as HarmonySource
import numpy as np
from numpy import ndarray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It feels a bit weird importing all of numpy and then also separately np.ndarray. Not a big deal, but just feels a bit odd.

Copy link
Member Author

@flamingbear flamingbear Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. I'm surprised I can't have ndarray in a TYPE_CHECKING guard, because that's all it's used for. I can change them all to np.ndarray, but I remember the format got black wonky.

Ok, I tried again. I can update it by putting my type in single quotes. Is that better?

from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from numpy import ndarray

And then

def remove_alpha(raster: 'ndarray') -> tuple['ndarray', 'ndarray', None]:
    """remove alpha layer when it exists."""
    if raster.shape[0] == 4:
        return raster[0:3, :, :], raster[3, :, :]
    return raster, None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that even right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'm worried this next comment is me missing something, but couldn't you just stick with the single import (import numpy as np) and then:

def remove_alpha(raster: np.ndarray) -> tuple[np.ndarray, np.ndarray, None]:

I tried a toy model locally, and that seemed okay.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could. This was why I hadn't elsewhere: "but I remember the format got black wonky"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I don't have strong opinions I suppose 9a7902f

from osgeo_utils.auxiliary.color_palette import ColorPalette
from PIL import Image
from rasterio.io import DatasetReader
from rasterio.plot import reshape_as_image, reshape_as_raster
from rasterio.warp import Resampling, reproject
from rioxarray import open_rasterio
from xarray import DataArray
from numpy import ndarray
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh - weird that this ordering happened. Feels a bit off, but I guess I trust it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this too*, I think we don't have that set in the pre-commit.

  • and then I forgot to check it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3dbba3c

I had a bad pyproject.toml, not sure why our precommit didn't catch it though. I will look

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should propagate this to the other repos as we touch them.
da4a015

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - I might just put up a couple of PRs now. (Some of those repos don't get much love and we might forget)

@flamingbear
Copy link
Member Author

@owenlittlejohns You want to re-review the latest changes?


expected_raster = np.array(
[
[
[0, 104, 198, 255],
[0, 0, 198, 255],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you change this from 104 to 0 to demonstrate transparent pixel behavior?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yes, this is the expected array, I added a missing value at the [0,1] location, and these changes are the ones reflected in the output from the missing data.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. You changed the code so the expected array changed.

@lyonthefrog
Copy link
Collaborator

lyonthefrog commented Apr 17, 2024

I ran the test using the test-prefire.py script, and it looks like I got the expected output. The left image shows the original problem, and the right image is the corrected output using this branch.

das-2108-results

I added a few questions, but they're mainly just for my own understanding. Cool stuff!

@lyonthefrog lyonthefrog self-requested a review April 17, 2024 21:03
@flamingbear flamingbear merged commit fd0927c into main Apr 17, 2024
4 checks passed
@flamingbear flamingbear deleted the mhs/DAS-2108/handle-no-data branch April 17, 2024 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants