
Add remaining bbox utils #1804

Merged

mattdangerw merged 7 commits into keras-team:keras-hub from sineeli/add-obj-det-utils on Sep 4, 2024

Conversation

sineeli (Collaborator) commented Aug 29, 2024:

Add the remaining utils related to bounding boxes.

sineeli changed the title from "- Add formats, iou, utils for bounding box" to "- Add AnchorGenerator, NonMaxSupression, BoxMatcher and remaining bbox utils" on Aug 29, 2024
sineeli changed the title from "- Add AnchorGenerator, NonMaxSupression, BoxMatcher and remaining bbox utils" to "Add AnchorGenerator, NonMaxSupression, BoxMatcher and remaining bbox utils" on Aug 29, 2024
divyashreepathihalli added the kokoro:force-run (Runs Tests on GPU) label on Sep 3, 2024
kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label on Sep 3, 2024
mattdangerw (Member) left a comment:

Just a lot of questions for now. I suspect we should probably just land the bbox stuff for now (which we won't release here anyway), skip the layers (or keep them in a model dir), and start working through an individual model.

from keras_nlp.src.utils import tensor_utils


@keras_nlp_export("keras_nlp.bounding_box.ensure_tensor")
mattdangerw (Member):

What exactly does this need to do? I definitely wouldn't expose it. And we already have a lot of utils in tensor_utils.py that do stuff similar to this.

What's the expected usage?

sineeli (Collaborator, Author):

Yes, I thought so as well, but this one handles the dictionary format of inputs; we can add that at the model level.

@@ -204,3 +204,12 @@ def any_equal(inputs, values, padding_mask):
output = ops.logical_or(output, value_equality)

return ops.logical_and(output, padding_mask)


def ensure_tensor(inputs, dtype=None):
mattdangerw (Member):

Isn't this exactly what `convert_to_tensor` does already?

sineeli (Collaborator, Author):

Yes, just for the dictionary scenario.
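
For reference, a minimal sketch of what a dict-aware `ensure_tensor` along these lines could look like (illustrative only; not the exact code added in this PR):

```python
from keras import ops

def ensure_tensor(inputs, dtype=None):
    # Recurse into dicts (e.g. {"boxes": ..., "classes": ...}) so every leaf
    # becomes a backend tensor; plain inputs go through `ops.convert_to_tensor`
    # with an optional cast.
    if isinstance(inputs, dict):
        return {key: ensure_tensor(value, dtype) for key, value in inputs.items()}
    tensor = ops.convert_to_tensor(inputs)
    if dtype is not None:
        tensor = ops.cast(tensor, dtype)
    return tensor
```

Apart from the dict recursion, this is essentially `convert_to_tensor`, which is the point being made above.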



@keras_nlp_export("keras_nlp.layers.AnchorGenerator")
class AnchorGenerator(keras.layers.Layer):
mattdangerw (Member):

I think for all of these layers, it might make sense to add these to individual models for now? Is there a reason they need to be exposed like this?

Overall our plan is to move bounding box formats and utilities to core Keras. We should make sure we aren't making our lives more difficult by adding layers here that won't live here long term. And if we really did want to add these as exposed layers, there's likely a lot of work they would need.

Where are these layers used? For what models?

Let's start with one model, plan a high-level API first, and only bring in what we need, with much of the implementation details living in the model dir.

sineeli (Collaborator, Author):

The AnchorGenerator, NonMaxSupression, and BoxMatcher layers are used in all object detection models:

  • YOLO
  • Faster R-CNN
  • RetinaNet
  • Mask R-CNN

If any additional object detection models are added in the future, these layers will be mandatory across all of them, so they were added as shared layers.

If these layers become available from core Keras directly, we can isolate them to a particular model dir (RetinaNet for now).
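
For context, a rough sketch of the anchor enumeration such a layer performs (hypothetical function and parameter names, not the `AnchorGenerator` API):

```python
import numpy as np

def generate_anchors(image_size, stride, scales, aspect_ratios):
    # Walk the centers of a stride-spaced grid over the image and emit one
    # (x1, y1, x2, y2) box per (scale, aspect_ratio) pair at each center.
    # `aspect_ratio` is taken here as height / width.
    height, width = image_size
    anchors = []
    for cy in np.arange(stride / 2, height, stride):
        for cx in np.arange(stride / 2, width, stride):
            for scale in scales:
                for ratio in aspect_ratios:
                    h = scale * np.sqrt(ratio)
                    w = scale / np.sqrt(ratio)
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors, dtype="float32")

# Anchors for a stride-8 feature level over a 64x64 image: 8 * 8 * 3 = 192 boxes.
anchors = generate_anchors((64, 64), stride=8, scales=[32.0], aspect_ratios=[0.5, 1.0, 2.0])
```

Detection heads then predict offsets relative to these fixed boxes, which is why the layer is shared across the models listed above.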

mattdangerw (Member) commented Sep 3, 2024:

SG! Well, I think adding these in a separate PR will definitely be easier, and along with a model it should help us suss out the task setup. I think we probably need some more substantial rewrites before we expose these. In particular:

  • I doubt we want to rely on raggeds as much. Let's write this the "jax and torch" way. So if we have ragged inputs, we probably want to convert them to padded dense inputs as a first step.
  • We should use common test cases in the base test case, like our other layers.
  • We should rewrite docstrings, etc to match local style.

The first of these is probably the trickiest, but it will be easiest to do when we have an actual model to work with.
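
To make the first bullet concrete, a minimal NumPy sketch (illustrative, with hypothetical names) of padding variable-length box lists to dense inputs instead of carrying ragged tensors:

```python
import numpy as np

def to_dense_boxes(box_lists, max_boxes, pad_value=-1.0):
    # Turn a list of per-image box arrays (shape [num_boxes_i, 4]) into one
    # dense [batch, max_boxes, 4] array, padding short images with `pad_value`
    # so nothing downstream needs ragged tensors.
    batch = np.full((len(box_lists), max_boxes, 4), pad_value, dtype="float32")
    for i, boxes in enumerate(box_lists):
        boxes = np.asarray(boxes, dtype="float32")[:max_boxes]
        if len(boxes):
            batch[i, : len(boxes)] = boxes
    return batch

# Two images with 1 and 2 boxes respectively, padded to 3 slots each.
dense = to_dense_boxes([[[0, 0, 10, 10]], [[5, 5, 20, 20], [1, 1, 4, 4]]], max_boxes=3)
```

The same padded-dense representation works across JAX, PyTorch, and TensorFlow, which is what writing it "the jax and torch way" asks for.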

sineeli (Collaborator, Author):

@mattdangerw,

So in the next PR I will make the changes to the layers the "jax and torch" way, and keep these layers in the model directory for now.

I will try to write one layer at a time per PR and follow the points you have mentioned.

Thanks!

"to be used with `bounding_box.mask_invalid_detections()`."
)

boxes = bounding_boxes.get("boxes")
mattdangerw (Member):

Where is this expected to come from? Is a dict with this structure standard for certain models? What generates this?

Probably we should add this along with the model code that needs it. And possibly unexposed.

sineeli (Collaborator, Author):

Object detection models have a standard dictionary format for both inputs and outputs: the images, and bounding_boxes, which is itself another dictionary with boxes and classes as keys.

So for any model that returns bounding box outputs, this util is used to mask invalid outputs and mark them as -1.
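
As an illustration of that masking step, a sketch using the dict layout just described (hypothetical helper, not the util's actual code):

```python
from keras import ops

def mask_invalid_detections(bounding_boxes, num_detections):
    # `bounding_boxes` is {"boxes": [batch, max_boxes, 4], "classes": [batch, max_boxes]};
    # `num_detections` is a [batch] vector of valid-box counts per image.
    # Every slot past the valid count is overwritten with -1.
    boxes = bounding_boxes["boxes"]
    classes = bounding_boxes["classes"]
    max_boxes = ops.shape(boxes)[1]
    counts = ops.cast(ops.convert_to_tensor(num_detections), "int32")
    slots = ops.cast(ops.arange(max_boxes), "int32")
    valid = ops.expand_dims(slots, 0) < ops.expand_dims(counts, -1)  # [batch, max_boxes]
    boxes = ops.where(ops.expand_dims(valid, -1), boxes, -ops.ones_like(boxes))
    classes = ops.where(valid, classes, -ops.ones_like(classes))
    return {"boxes": boxes, "classes": classes}
```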

sineeli changed the title from "Add AnchorGenerator, NonMaxSupression, BoxMatcher and remaining bbox utils" to "Add remaining bbox utils" on Sep 3, 2024
mattdangerw (Member) left a comment:

lgtm! not looking at the code too closely as I believe we are still going to lower this to keras, so this is just a temp unblocker

mattdangerw merged commit 339669f into keras-team:keras-hub on Sep 4, 2024
8 of 9 checks passed
sineeli (Collaborator, Author) commented Sep 4, 2024:

lgtm! not looking at the code too closely as I believe we are still going to lower this to keras, so this is just a temp unblocker

Sure

mattdangerw pushed a commit to mattdangerw/keras-hub that referenced this pull request Sep 10, 2024
* - Add formats, iou, utils for bounding box

* - Add `AnchorGenerator`, `BoxMatcher` and `NonMaxSupression` layers

* - Remove scope_name  not required.

* use default keras name scope

* - Correct format error

* - Remove layers as of now and keep them at model level till keras core supports them

* - Correct api_gen
mattdangerw pushed a commit that referenced this pull request Sep 10, 2024
mattdangerw pushed a commit that referenced this pull request Sep 11, 2024
mattdangerw pushed a commit that referenced this pull request Sep 13, 2024
sineeli deleted the sineeli/add-obj-det-utils branch on September 17, 2024 at 21:28
mattdangerw pushed a commit that referenced this pull request Sep 17, 2024
divyashreepathihalli added a commit that referenced this pull request Sep 25, 2024
* Add VGG16 backbone (#1737)

* Agg Vgg16 backbone

* update names

* update tests

* update test

* add image classifier

* incorporate review comments

* Update test case

* update backbone test

* add image classifier

* classifier cleanup

* code reformat

* add vgg16 image classifier

* make vgg generic

* update doc string

* update docstring

* add classifier test

* update tests

* update docstring

* address review comments

* code reformat

* update the configs

* address review comments

* fix task saved model test

* update init

* code reformatted

* Add `ResNetBackbone` and `ResNetImageClassifier` (#1765)

* Add ResNetV1 and ResNetV2

* Address comments

* Add CSP DarkNet backbone and classifier (#1774)

* Add CSP DarkNet

* Add CSP DarkNet

* snake_case function names

* change use_depthwise to block_type

* Add `FeaturePyramidBackbone` and port weights from `timm` for `ResNetBackbone` (#1769)

* Add FeaturePyramidBackbone and update ResNetBackbone

* Simplify the implementation

* Fix CI

* Make ResNetBackbone compatible with timm and add FeaturePyramidBackbone

* Add conversion implementation

* Update docstrings

* Address comments

* Add DenseNet (#1775)

* Add DenseNet

* fix testcase

* address comments

* nit

* fix lint errors

* move description

* Add ViTDetBackbone (#1776)

* add vit det vit_det_backbone

* update docstring

* code reformat

* fix tests

* address review comments

* bump year on all files

* address review comments

* rename backbone

* fix tests

* change back to ViT

* address review comments

* update image shape

* Add Mix transformer (#1780)

* Add MixTransformer

* fix testcase

* test changes and comments

* lint fix

* update config list

* modify testcase for 2 layers

* update input_image_shape -> image_shape (#1785)

* update input_image_shape -> image_shape

* update docstring example

* code reformat

* update tests

* Create __init__.py (#1788)

add missing __init__ file to vit_det

* Hack package build script to rename to keras-hub (#1793)

This is a temporary way to test out the keras-hub branch.
- Does a global rename of all symbols during package build.
- Registers the "old" name on symbol export for saving compat.
- Adds a github action to publish every commit to keras-hub as
  a new package.
- Removes our descriptions on PyPI temporarily, until we want
  to message this more broadly.

* Add CLIP and T5XXL for StableDiffusionV3 (#1790)

* Add `CLIPTokenizer`, `T5XXLTokenizer`, `CLIPTextEncoder` and `T5XXLTextEncoder`.

* Make CLIPTextEncoder as Backbone

* Add `T5XXLPreprocessor` and remove `T5XXLTokenizer`

Add `CLIPPreprocessor`

* Use `tf = None` at the top

* Replace manual implementation of `CLIPAttention` with `MultiHeadAttention`

* Add Bounding Box Utils (#1791)

* Bounding box utils

* - Correct test cases

* - Remove hard tensorflow dtype

* - fix api gen

* - Fix import for test cases
- Use setup for converters test case

* - fix api_gen issue

* - FIx api gen

* - Fix api gen error

* - Correct test cases as per new api changes

* mobilenet_v3 added in keras-nlp (#1782)

* mobilenet_v3 added in keras-nlp

* minor bug fixed in mobilenet_v3_backbone

* formatting corrected

* refactoring backbone

* correct_pad_downsample method added

* refactoring backbone

* parameters updated

* Testcaseupdated, expected output shape corrected

* code formatted with black

* testcase updated

* refactoring and description added

* comments updated

* added mobilenet v1 and v2

* merge conflict resolved

* version arg removed, and config options added

* input_shape changed to image_shape in arg

* config updated

* input shape corrected

* comments resolved

* activation function format changed

* minor bug fixed

* minor bug fixed

* added vision_backbone_test

* channel_first bug resolved

* channel_first cases working

* comments  resolved

* formatting fixed

* refactoring

---------

Co-authored-by: ushareng <[email protected]>

* Pkgoogle/efficient net migration (#1778)

* migrating efficientnet models to keras-hub

* merging changes from other sources

* autoformatting pass

* initial consolidation of efficientnet_backbone

* most updates and removing separate implementation

* cleanup, autoformatting, keras generalization

* removed layer examples outside of effiicient net

* many, mainly documentation changes, small test fixes

* Add the ResNet_vd backbone (#1766)

* Add ResNet_vd to ResNet backbone

* Addressed requested parameter changes

* Fixed tests and updated comments

* Added new parameters to docstring

* Add `VAEImageDecoder` for StableDiffusionV3 (#1796)

* Add `VAEImageDecoder` for StableDiffusionV3

* Use `keras.Model` for `VAEImageDecoder` and follows the coding style in `VAEAttention`

* Replace `Backbone` with `keras.Model` in `CLIPTextEncoder` and `T5XXLTextEncoder` (#1802)

* Add pyramid output for densenet, cspDarknet (#1801)

* add pyramid outputs

* fix testcase

* format fix

* make common testcase for pyramid outputs

* change default shape

* simplify testcase

* test case change and add channel axis

* Add `MMDiT` for StableDiffusionV3 (#1806)

* Add `MMDiT`

* Update

* Update

* Update implementation

* Add remaining bbox utils (#1804)

* - Add formats, iou, utils for bounding box

* - Add `AnchorGenerator`, `BoxMatcher` and `NonMaxSupression` layers

* - Remove scope_name  not required.

* use default keras name scope

* - Correct format error

* - Remove layers as of now and keep them at model level till keras core supports them

* - Correct api_gen

* Fix timm conversion for rersnet (#1814)

* Add `StableDiffusion3`

* Fix `_normalize_inputs`

* Separate CLIP encoders from SD3 backbone.

* Simplify `text_to_image` function.

* Address comments

* Minor update and add docstrings.

* Fix

* Update

* Rename to diffuser and decoder

* Define functional model

* Merge from upstream/master

* Delete old SD3

* Fix copyright

* Rename to keras_hub

* Address comments

* Update

* Fix CI

* Fix bugs occurred in keras3.1

---------

Co-authored-by: Divyashree Sreepathihalli <[email protected]>
Co-authored-by: Sachin Prasad <[email protected]>
Co-authored-by: Matt Watson <[email protected]>
Co-authored-by: Siva Sravana Kumar Neeli <[email protected]>
Co-authored-by: Usha Rengaraju <[email protected]>
Co-authored-by: ushareng <[email protected]>
Co-authored-by: pkgoogle <[email protected]>
Co-authored-by: gowthamkpr <[email protected]>