	---
	license: apache-2.0
	inference: false
	pipeline_tag: image-classification
	datasets:
	- imagenet-1k
	---

	# Perceiver IO image classifier

	This model is a Perceiver IO model pretrained on ImageNet-1k (about 1.28 million training images, 1,000 classes). It is weight-equivalent
	to the [deepmind/vision-perceiver-fourier](https://huggingface.co/deepmind/vision-perceiver-fourier) model but based on
	implementation classes of the [perceiver-io](https://github.com/krasserm/perceiver-io) library. It can be created from
	the `deepmind/vision-perceiver-fourier` model with a library-specific [conversion utility](#model-conversion). Both
	models generate equal output for the same input.
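
The equal-output claim can be checked empirically. The following is a minimal sketch, not the conversion utility's own test: `compare_models` and `max_abs_diff` are hypothetical helper names, and the tolerance of `1e-4` is an assumed value. It loads both checkpoints with their own image processors and compares the logits for a single image. The source model is called with `inputs=...`, which is the argument the `transformers` Perceiver classes accept for pixel data.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_models(image, atol=1e-4):
    """Return (max logit difference, within tolerance?) for one PIL image.

    Heavy imports are local so that max_abs_diff stays usable without them.
    The tolerance `atol` is an assumption, not a documented guarantee.
    """
    import torch
    from transformers import AutoImageProcessor, AutoModelForImageClassification
    from transformers import PerceiverForImageClassificationFourier
    from perceiver.model.vision import image_classifier  # noqa: F401 (auto-class registration)

    src_id = "deepmind/vision-perceiver-fourier"
    tgt_id = "krasserm/perceiver-io-img-clf"

    src_model = PerceiverForImageClassificationFourier.from_pretrained(src_id).eval()
    tgt_model = AutoModelForImageClassification.from_pretrained(tgt_id).eval()

    # Each checkpoint is paired with its own image processor.
    src_batch = AutoImageProcessor.from_pretrained(src_id)(image, return_tensors="pt")
    tgt_batch = AutoImageProcessor.from_pretrained(tgt_id)(image, return_tensors="pt")

    with torch.no_grad():
        src_logits = src_model(inputs=src_batch["pixel_values"]).logits[0]
        tgt_logits = tgt_model(**tgt_batch).logits[0]

    diff = max_abs_diff(src_logits.tolist(), tgt_logits.tolist())
    return diff, diff <= atol
```

Calling `compare_models(image)` with any RGB `PIL.Image` returns the largest logit difference and whether it falls within the assumed tolerance.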

	The content of the `deepmind/vision-perceiver-fourier` [model card](https://huggingface.co/deepmind/vision-perceiver-fourier)
	also applies to this model, except for the [usage examples](#usage-examples). Refer to the linked card for further model and
	training details.

	## Model description

	The model is specified in Appendix A of the [Perceiver IO paper](https://arxiv.org/abs/2107.14795) (2D Fourier features).

	## Intended use and limitations

	The model can be used for image classification.

	## Usage examples

	To use this model you first need to [install](https://github.com/krasserm/perceiver-io/blob/main/README.md#installation)
	the `perceiver-io` library with extension `vision`.

	```shell
	pip install "perceiver-io[vision]"
	```

	Then the model can be used with PyTorch. Either use the model and image processor directly

	```python
	import requests
	from PIL import Image
	from transformers import AutoModelForImageClassification, AutoImageProcessor
	from perceiver.model.vision import image_classifier # auto-class registration

	repo_id = "krasserm/perceiver-io-img-clf"

	# An image of a baseball player from MS-COCO validation set
	url = "http://images.cocodataset.org/val2017/000000507223.jpg"
	image = Image.open(requests.get(url, stream=True).raw)

	model = AutoModelForImageClassification.from_pretrained(repo_id)
	processor = AutoImageProcessor.from_pretrained(repo_id)

	processed = processor(image, return_tensors="pt")
	prediction = model(**processed).logits.argmax(dim=-1)

	print(f"Predicted class = {model.config.id2label[prediction.item()]}")
	```
	```
	Predicted class = ballplayer, baseball player
	```

	or use an `image-classification` pipeline:

	```python
	import requests
	from PIL import Image
	from transformers import pipeline
	from perceiver.model.vision import image_classifier # auto-class registration

	repo_id = "krasserm/perceiver-io-img-clf"

	# An image of a baseball player from MS-COCO validation set
	url = "http://images.cocodataset.org/val2017/000000507223.jpg"
	image = Image.open(requests.get(url, stream=True).raw)

	classifier = pipeline("image-classification", model=repo_id)
	prediction = classifier(image)

	print(f"Predicted class = {prediction[0]['label']}")
	```
	```
	Predicted class = ballplayer, baseball player
	```
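
Both snippets above print only the best class. To look at several candidates with their probabilities, the logits can be passed through a softmax and sorted. The following is a minimal sketch in plain Python; `top_k_labels` is a hypothetical helper, not part of `perceiver-io` or `transformers`.

```python
import math


def top_k_labels(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs for one logit vector."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    # Indices sorted by descending (unnormalized) probability.
    ranked = sorted(range(len(logits)), key=lambda i: exps[i], reverse=True)
    return [(id2label[i], exps[i] / total) for i in ranked[:k]]
```

With the first example, a call could look like `top_k_labels(model(**processed).logits[0].tolist(), model.config.id2label, k=3)`. In the pipeline variant, each entry of the returned list already carries a `score` next to its `label`.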

	## Model conversion

	The `krasserm/perceiver-io-img-clf` model has been created from the source `deepmind/vision-perceiver-fourier` model
	with:

	```python
	from perceiver.model.vision.image_classifier import convert_model

	convert_model(
	    save_dir="krasserm/perceiver-io-img-clf",
	    source_repo_id="deepmind/vision-perceiver-fourier",
	    push_to_hub=True,
	)
	```

	## Citation

	```bibtex
	@article{jaegle2021perceiver,
	  title={Perceiver IO: A General Architecture for Structured Inputs \& Outputs},
	  author={Jaegle, Andrew and Borgeaud, Sebastian and Alayrac, Jean-Baptiste and Doersch, Carl and Ionescu, Catalin and Ding, David and Koppula, Skanda and Zoran, Daniel and Brock, Andrew and Shelhamer, Evan and others},
	  journal={arXiv preprint arXiv:2107.14795},
	  year={2021}
	}
	```