---
library_name: keras-hub
---
## Model Overview
A Keras model implementing the RetinaNet meta-architecture for object
detection. The constructor requires `num_classes`, `bounding_box_format`,
and a backbone. Optionally, a custom label encoder and prediction decoder
may be provided.
__Arguments__
- __num_classes__: The number of classes in your dataset, excluding the
background class. Classes should be represented by integers in the
range [0, num_classes).
- __bounding_box_format__: The format of the bounding boxes in the input
dataset. Refer to
[the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/)
for more details on supported bounding box formats.
- __backbone__: A `keras.Model`. If the default `feature_pyramid` is used,
the backbone must implement the `pyramid_level_inputs` property with keys
"P3", "P4", and "P5" and layer names as values. A sensible default
for many cases is
`keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet")`.
- __anchor_generator__: (Optional) A `keras_cv.layers.AnchorGenerator`. If
provided, the anchor generator will be passed to both the
`label_encoder` and the `prediction_decoder`. Only to be used when
`label_encoder` and `prediction_decoder` are both `None`.
Defaults to an anchor generator with the parameterization:
`strides=[2**i for i in range(3, 8)]`,
`scales=[2**x for x in [0, 1 / 3, 2 / 3]]`,
`sizes=[32.0, 64.0, 128.0, 256.0, 512.0]`,
and `aspect_ratios=[0.5, 1.0, 2.0]`.
- __label_encoder__: (Optional) A `keras.Layer` that accepts an image Tensor, a
bounding box Tensor, and a bounding box class Tensor in its `call()`
method, and returns RetinaNet training targets. By default, a
KerasCV standard `RetinaNetLabelEncoder` is created and used.
Results of this object's `call()` method are passed to the `loss`
object for `box_loss` and `classification_loss` as the `y_true`
argument.
- __prediction_decoder__: (Optional) A `keras.layers.Layer` that is
responsible for transforming RetinaNet predictions into usable
bounding box Tensors. If not provided, the default is a
`keras_cv.layers.MultiClassNonMaxSuppression` layer, which uses
Non-Max Suppression for box pruning.
- __feature_pyramid__: (Optional) A `keras.layers.Layer` that produces
a list of 4D feature maps (batch dimension included)
when called on the pyramid-level outputs of the `backbone`.
If not provided, the reference implementation from the paper will be used.
- __classification_head__: (Optional) A `keras.Layer` that performs
classification of the bounding boxes. If not provided, a simple
ConvNet with 3 layers will be used.
- __box_head__: (Optional) A `keras.Layer` that performs regression of the
bounding boxes. If not provided, a simple ConvNet with 3 layers
will be used.
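
To make the default anchor parameterization listed under `anchor_generator` concrete, the sketch below recomputes it in plain Python and counts the resulting anchors. It is illustrative only and does not call `keras_cv.layers.AnchorGenerator`. The helper names `count_anchors` and `anchor_shapes` are hypothetical, and the sketch assumes that each level's grid is the image size divided by the stride (rounded up) and that an aspect ratio means width divided by height.

```python
import math

# Default parameterization quoted in the argument list above.
strides = [2**i for i in range(3, 8)]  # [8, 16, 32, 64, 128]
scales = [2**x for x in [0, 1 / 3, 2 / 3]]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]
aspect_ratios = [0.5, 1.0, 2.0]

# Every feature-map location gets one anchor per (scale, aspect ratio) pair.
anchors_per_location = len(scales) * len(aspect_ratios)


def count_anchors(image_height, image_width):
    """Total anchors over all levels (assumes ceil-divided grids)."""
    total = 0
    for stride in strides:
        rows = math.ceil(image_height / stride)
        cols = math.ceil(image_width / stride)
        total += rows * cols * anchors_per_location
    return total


def anchor_shapes(size):
    """(height, width) of each anchor at one level.

    Assumption: aspect ratio = width / height, with the area fixed by
    (size * scale) ** 2.
    """
    shapes = []
    for scale in scales:
        area = (size * scale) ** 2
        for ratio in aspect_ratios:
            height = math.sqrt(area / ratio)
            width = area / height
            shapes.append((height, width))
    return shapes


print(anchors_per_location)
print(count_anchors(512, 512))
print(anchor_shapes(sizes[0]))
```

Under these assumptions, each location carries 3 scales x 3 aspect ratios = 9 anchors, and a 512x512 image yields 49,104 anchors across the five levels.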
## Example Usage
### Pretrained RetinaNet model
```python3
import keras_hub
import numpy as np

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "retinanet_resnet50_fpn_coco"
)
# Run the detector on a random batch of two 224x224 RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
```
### Fine-tune the pre-trained model
```python3
import keras_hub

# Placeholder class names; replace with the classes in your dataset.
CLASSES = ["cat", "dog"]

backbone = keras_hub.models.Backbone.from_preset(
    "retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
```
### Custom training the model
```python3
import keras_hub

# Placeholder class names; replace with the classes in your dataset.
CLASSES = ["cat", "dog"]

image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1 / 255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)
# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False,
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
```
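
The `min_level` and `max_level` arguments select which pyramid levels the backbone exposes. As a rough illustration, assuming the usual FPN convention that level `l` has stride `2**l` relative to the input (consistent with the default anchor strides `2**3` to `2**7` quoted earlier), the hypothetical helper below lists the feature-map sizes for a given input. It is plain Python and does not call keras-hub.

```python
import math


def pyramid_feature_shapes(image_height, image_width, min_level, max_level):
    """Map each pyramid level name to its (rows, cols) grid size.

    Assumption: level l downsamples the input by 2**l, rounding up.
    """
    shapes = {}
    for level in range(min_level, max_level + 1):
        stride = 2**level
        shapes[f"P{level}"] = (
            math.ceil(image_height / stride),
            math.ceil(image_width / stride),
        )
    return shapes


# Levels 3 to 5, as in the backbone configuration above.
print(pyramid_feature_shapes(512, 512, min_level=3, max_level=5))
```

For a 512x512 input this gives grids of 64x64, 32x32, and 16x16 for P3, P4, and P5, so finer levels handle small objects and coarser levels handle large ones.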
## Example Usage with Hugging Face URI
### Pretrained RetinaNet model
```python3
import keras_hub
import numpy as np

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
# Run the detector on a random batch of two 224x224 RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
```
### Fine-tune the pre-trained model
```python3
import keras_hub

# Placeholder class names; replace with the classes in your dataset.
CLASSES = ["cat", "dog"]

backbone = keras_hub.models.Backbone.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
```
### Custom training the model
```python3
import keras_hub

# Placeholder class names; replace with the classes in your dataset.
CLASSES = ["cat", "dog"]

image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1 / 255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)
# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False,
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
```