hassonofer committed on
Commit
9e9813f
·
verified ·
1 Parent(s): 380e396

Update README.md

  1. README.md +103 -3
README.md CHANGED
@@ -1,3 +1,103 @@
- ---
- license: apache-2.0
- ---
+ ---
+ tags:
+ - image-classification
+ - birder
+ library_name: birder
+ license: apache-2.0
+ ---
+
+ # Model Card for xcit_nano12_p16_il-common
+
+ XCiT image classification model. This model was trained on the `il-common` dataset (common bird species found in Israel).
+
+ The species list is derived from data available at <https://www.israbirding.com/checklist/>.
+
+ ## Model Details
+
+ - **Model Type:** Image classification and detection backbone
+ - **Model Stats:**
+     - Params (M): 3.0
+     - Input image size: 256 x 256
+ - **Dataset:** il-common (371 classes)
+
+ - **Papers:**
+     - XCiT: Cross-Covariance Image Transformers: <https://arxiv.org/abs/2106.09681>
+
+ ## Model Usage
+
+ ### Image Classification
+
+ ```python
+ import birder
+ from birder.inference.classification import infer_image
+
+ (net, class_to_idx, signature, rgb_stats) = birder.load_pretrained_model("xcit_nano12_p16_il-common", inference=True)
+
+ # Get the image size the model was trained on
+ size = birder.get_size_from_signature(signature)
+
+ # Create an inference transform
+ transform = birder.classification_transform(size, rgb_stats)
+
+ image = "path/to/image.jpeg" # or a PIL image
+ (out, _) = infer_image(net, image, transform)
+ # out is a NumPy array with shape of (1, num_classes)
+ ```
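
To read the classification output as species names, the scores can be ranked and mapped back through `class_to_idx`. The following is a minimal sketch, assuming `class_to_idx` maps class names to integer indices and that a higher score means a more likely class. The helper `top_k_predictions` and the toy values are hypothetical and not part of birder.

```python
def top_k_predictions(scores, class_to_idx, k=5):
    """Return the k highest-scoring (class_name, score) pairs."""
    # Invert the assumed name -> index mapping so indices can be reported as names
    idx_to_class = {idx: name for name, idx in class_to_idx.items()}
    # Rank class indices by score, highest first
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(idx_to_class[i], float(scores[i])) for i in ranked[:k]]


# Toy stand-ins for `out[0]` and `class_to_idx` (hypothetical values)
toy_scores = [0.1, 0.7, 0.2]
toy_class_to_idx = {"Barn swallow": 0, "Hoopoe": 1, "House sparrow": 2}
print(top_k_predictions(toy_scores, toy_class_to_idx, k=2))
```

With the loaded model, a call such as `top_k_predictions(out[0], class_to_idx)` should work the same way, since the helper only indexes and iterates over one row of scores.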
+
+ ### Image Embeddings
+
+ ```python
+ import birder
+ from birder.inference.classification import infer_image
+
+ (net, class_to_idx, signature, rgb_stats) = birder.load_pretrained_model("xcit_nano12_p16_il-common", inference=True)
+
+ # Get the image size the model was trained on
+ size = birder.get_size_from_signature(signature)
+
+ # Create an inference transform
+ transform = birder.classification_transform(size, rgb_stats)
+
+ image = "path/to/image.jpeg" # or a PIL image
+ (out, embedding) = infer_image(net, image, transform, return_embedding=True)
+ # embedding is a NumPy array with shape of (1, embedding_size)
+ ```
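
A common use of such embeddings is comparing two images by cosine similarity. The following is a minimal sketch under that assumption. The helper `cosine_similarity` and the toy vectors are hypothetical and not part of birder, and it operates on one embedding row at a time.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length 1-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The similarity is undefined for a zero vector
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-ins for two embedding rows (hypothetical values)
emb_a = [3.0, 4.0]
emb_b = [4.0, 3.0]
print(cosine_similarity(emb_a, emb_b))
```

For two embeddings produced as above, the call would look like `cosine_similarity(embedding_a[0], embedding_b[0])`, where indexing with `[0]` drops the leading batch dimension.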
+
+ ### Detection Feature Map
+
+ ```python
+ from PIL import Image
+ import birder
+
+ (net, class_to_idx, signature, rgb_stats) = birder.load_pretrained_model("xcit_nano12_p16_il-common", inference=True)
+
+ # Get the image size the model was trained on
+ size = birder.get_size_from_signature(signature)
+
+ # Create an inference transform
+ transform = birder.classification_transform(size, rgb_stats)
+
+ image = Image.open("path/to/image.jpeg")
+ features = net.detection_features(transform(image).unsqueeze(0))
+ # features is a dict (stage name -> torch.Tensor)
+ print([(k, v.size()) for k, v in features.items()])
+ # Output example:
+ # [('stage1', torch.Size([1, 96, 96, 96])),
+ #  ('stage2', torch.Size([1, 192, 48, 48])),
+ #  ('stage3', torch.Size([1, 384, 24, 24])),
+ #  ('stage4', torch.Size([1, 768, 12, 12]))]
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @misc{elnouby2021xcitcrosscovarianceimagetransformers,
+       title={XCiT: Cross-Covariance Image Transformers},
+       author={Alaaeldin El-Nouby and Hugo Touvron and Mathilde Caron and Piotr Bojanowski and Matthijs Douze and Armand Joulin and Ivan Laptev and Natalia Neverova and Gabriel Synnaeve and Jakob Verbeek and Hervé Jegou},
+       year={2021},
+       eprint={2106.09681},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/2106.09681},
+ }
+ ```