	---
	library_name: keras-hub
	---
	### Model Overview
	A Keras model implementing the RetinaNet meta-architecture.

	Implements the RetinaNet architecture for object detection. The constructor
	requires `num_classes`, `bounding_box_format`, and a backbone. A custom
	label encoder and prediction decoder may optionally be provided.


	__Arguments__


	- __num_classes__: the number of classes in your dataset excluding the
	background class. Classes should be represented by integers in the
	range [0, num_classes).
	- __bounding_box_format__: The format of the bounding boxes in the input
	dataset. Refer to
	[the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/)
	for more details on supported bounding box formats.
	- __backbone__: `keras.Model`. If the default `feature_pyramid` is used,
	the backbone must implement the `pyramid_level_inputs` property with keys
	"P3", "P4", and "P5" and layer names as values. A sensible default in
	many cases is
	`keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet")`.
	- __anchor_generator__: (Optional) a `keras_cv.layers.AnchorGenerator`. If
	provided, the anchor generator will be passed to both the
	`label_encoder` and the `prediction_decoder`. Only to be used when
	`label_encoder` and `prediction_decoder` are both `None`.
	Defaults to an anchor generator with the parameterization:
	`strides=[2**i for i in range(3, 8)]`,
	`scales=[2**x for x in [0, 1 / 3, 2 / 3]]`,
	`sizes=[32.0, 64.0, 128.0, 256.0, 512.0]`,
	and `aspect_ratios=[0.5, 1.0, 2.0]`.
	- __label_encoder__: (Optional) A `keras.Layer` that accepts an image Tensor, a
	bounding box Tensor, and a bounding box class Tensor in its `call()`
	method, and returns RetinaNet training targets. By default, a
	KerasCV standard `RetinaNetLabelEncoder` is created and used.
	Results of this object's `call()` method are passed to the `loss`
	object for `box_loss` and `classification_loss` as the `y_true`
	argument.
	- __prediction_decoder__: (Optional) A `keras.layers.Layer` that is
	responsible for transforming RetinaNet predictions into usable
	bounding box Tensors. If not provided, the default is a
	`keras_cv.layers.MultiClassNonMaxSuppression` layer, which uses
	non-max suppression to prune boxes.
	- __feature_pyramid__: (Optional) A `keras.layers.Layer` that produces
	a list of 4D feature maps (batch dimension included)
	when called on the pyramid-level outputs of the `backbone`.
	If not provided, the reference implementation from the paper will be used.
	- __classification_head__: (Optional) A `keras.Layer` that performs
	classification of the bounding boxes. If not provided, a simple
	ConvNet with 3 layers will be used.
	- __box_head__: (Optional) A `keras.Layer` that performs regression of the
	bounding boxes. If not provided, a simple ConvNet with 3 layers
	will be used.
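
The default anchor parameterization listed for `anchor_generator` can be unpacked with a few lines of plain Python. The sketch below is not the KerasCV implementation; it only reproduces the arithmetic. It assumes that an aspect ratio means width divided by height, that anchor area is preserved across ratios, and that each pyramid level places one set of anchors per feature-map cell. The helper names `anchor_shapes` and `total_anchors` are hypothetical.

```python
import math

# Default parameterization quoted in the `anchor_generator` description.
strides = [2**i for i in range(3, 8)]
scales = [2**x for x in [0, 1 / 3, 2 / 3]]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]
aspect_ratios = [0.5, 1.0, 2.0]


def anchor_shapes(size):
    """Return (width, height) for every scale / aspect-ratio pair at one level."""
    shapes = []
    for scale in scales:
        side = size * scale
        for ratio in aspect_ratios:
            # Assumption: ratio = width / height, with the area kept at side**2.
            width = side * math.sqrt(ratio)
            height = side / math.sqrt(ratio)
            shapes.append((width, height))
    return shapes


def total_anchors(image_size):
    """Count anchors over all levels for a square image (illustrative only)."""
    per_location = len(scales) * len(aspect_ratios)
    return sum((image_size // stride) ** 2 * per_location for stride in strides)


for stride, size in zip(strides, sizes):
    print(f"stride={stride:3d} base size={size:5.0f} anchors per cell={len(anchor_shapes(size))}")
print("anchors for a 512x512 image:", total_anchors(512))
```

With three scales and three aspect ratios, every location gets nine anchors. Under the assumptions above, a 512x512 image yields 49,104 anchors across the five levels, which is why a pruning step such as non-max suppression is needed after prediction.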

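The default `prediction_decoder` prunes overlapping boxes with non-max suppression. As a minimal sketch of the idea, and not the `MultiClassNonMaxSuppression` layer itself, the NumPy code below runs greedy NMS for a single class. It assumes boxes in `(x1, y1, x2, y2)` order and an IoU threshold of 0.5; both the function names and the threshold are illustrative choices.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    intersection = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return intersection / (area + areas - intersection)


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap it, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep


boxes = np.array([
    [10.0, 10.0, 50.0, 50.0],
    [12.0, 12.0, 52.0, 52.0],      # overlaps heavily with the first box
    [100.0, 100.0, 140.0, 140.0],  # far away from both
])
scores = np.array([0.9, 0.8, 0.7])
kept = non_max_suppression(boxes, scores)
print(kept)
```

Here the second box is suppressed because its IoU with the higher-scoring first box is about 0.82, so indices 0 and 2 are kept. A multi-class decoder would typically repeat this per class.
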
	## Example Usage
	## Pretrained RetinaNet model
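	```python3
	import keras_hub
	import numpy as np

	# Load the RetinaNet detector with COCO-pretrained weights.
	object_detector = keras_hub.models.ImageObjectDetector.from_preset(
	    "retinanet_resnet50_fpn_coco"
	)

	# A random batch of two 224x224 RGB images with values in [0, 1).
	input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
	object_detector(input_data)
	```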

	## Fine-tune the pre-trained model
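	```python3
	import keras_hub

	# CLASSES is assumed to be the list of class names in your dataset.
	backbone = keras_hub.models.Backbone.from_preset(
	    "retinanet_resnet50_fpn_coco"
	)
	preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
	    "retinanet_resnet50_fpn_coco"
	)
	# Build a detector on the pretrained backbone, sized for the custom classes.
	model = keras_hub.models.RetinaNetObjectDetector(
	    backbone=backbone,
	    num_classes=len(CLASSES),
	    preprocessor=preprocessor,
	)
	```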

	## Custom training the model
	```python3
	import keras_hub

	# Rescale pixel values from [0, 255] to [0, 1].
	image_converter = keras_hub.layers.RetinaNetImageConverter(
	    scale=1 / 255
	)

	preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
	    image_converter=image_converter
	)
	# Load a pre-trained ResNet50 model.
	# This will serve as the base for extracting image features.
	image_encoder = keras_hub.models.Backbone.from_preset(
	    "resnet_50_imagenet"
	)

	# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
	# backbone. The FPN creates multi-scale feature maps for better object detection
	# at different sizes.
	backbone = keras_hub.models.RetinaNetBackbone(
	    image_encoder=image_encoder,
	    min_level=3,
	    max_level=5,
	    use_p5=False,
	)
	# CLASSES is assumed to be the list of class names in your dataset.
	model = keras_hub.models.RetinaNetObjectDetector(
	    backbone=backbone,
	    num_classes=len(CLASSES),
	    preprocessor=preprocessor,
	)
	```
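
The `min_level` and `max_level` arguments select which pyramid levels the backbone emits. As a rough guide, and assuming the common convention that level `l` has stride `2**l` (consistent with the `strides` listed for the default anchor generator), the feature-map sizes can be estimated as below. The helper `pyramid_feature_shapes` is hypothetical and not part of KerasHub, and the ceiling rounding is an assumption about how odd sizes are handled.

```python
def pyramid_feature_shapes(image_height, image_width, min_level=3, max_level=5):
    """Estimate (height, width) of each pyramid level, assuming stride 2**level."""
    shapes = {}
    for level in range(min_level, max_level + 1):
        stride = 2**level
        # Ceiling division, as produced by strided convolutions with "same" padding.
        height = -(-image_height // stride)
        width = -(-image_width // stride)
        shapes[f"P{level}"] = (height, width)
    return shapes


shapes = pyramid_feature_shapes(512, 512)
for name, (height, width) in shapes.items():
    print(name, height, width)
```

For a 512x512 input with `min_level=3` and `max_level=5`, this gives 64x64, 32x32, and 16x16 maps for P3, P4, and P5. Raising `max_level` adds coarser maps that tend to help with large objects.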

	## Example Usage with Hugging Face URI

	## Pretrained RetinaNet model
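	```python3
	import keras_hub
	import numpy as np

	# Load the RetinaNet detector with COCO-pretrained weights from Hugging Face.
	object_detector = keras_hub.models.ImageObjectDetector.from_preset(
	    "hf://keras/retinanet_resnet50_fpn_coco"
	)

	# A random batch of two 224x224 RGB images with values in [0, 1).
	input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
	object_detector(input_data)
	```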

	## Fine-tune the pre-trained model
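	```python3
	import keras_hub

	# CLASSES is assumed to be the list of class names in your dataset.
	backbone = keras_hub.models.Backbone.from_preset(
	    "hf://keras/retinanet_resnet50_fpn_coco"
	)
	preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
	    "hf://keras/retinanet_resnet50_fpn_coco"
	)
	# Build a detector on the pretrained backbone, sized for the custom classes.
	model = keras_hub.models.RetinaNetObjectDetector(
	    backbone=backbone,
	    num_classes=len(CLASSES),
	    preprocessor=preprocessor,
	)
	```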

	## Custom training the model
	```python3
	import keras_hub

	# Rescale pixel values from [0, 255] to [0, 1].
	image_converter = keras_hub.layers.RetinaNetImageConverter(
	    scale=1 / 255
	)

	preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
	    image_converter=image_converter
	)
	# Load a pre-trained ResNet50 model.
	# This will serve as the base for extracting image features.
	image_encoder = keras_hub.models.Backbone.from_preset(
	    "resnet_50_imagenet"
	)

	# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
	# backbone. The FPN creates multi-scale feature maps for better object detection
	# at different sizes.
	backbone = keras_hub.models.RetinaNetBackbone(
	    image_encoder=image_encoder,
	    min_level=3,
	    max_level=5,
	    use_p5=False,
	)
	# CLASSES is assumed to be the list of class names in your dataset.
	model = keras_hub.models.RetinaNetObjectDetector(
	    backbone=backbone,
	    num_classes=len(CLASSES),
	    preprocessor=preprocessor,
	)
	```