saninmohammedn committed on
Commit 35d3561 · verified · 1 Parent(s): c92382c

Update README.md

Files changed (1)
  1. README.md +65 -67
README.md CHANGED
@@ -1,67 +1,65 @@
- ---
- license: other
- tags:
- - vision
- - image-segmentation
- datasets:
- - coco
- widget:
- - src: http://images.cocodataset.org/val2017/000000039769.jpg
-   example_title: Cats
- - src: http://images.cocodataset.org/val2017/000000039770.jpg
-   example_title: Castle
- ---
-
- # Mask2Former
-
- Mask2Former model trained on Cityscapes semantic segmentation (large-sized version, Swin backbone). It was introduced in the paper [Masked-attention Mask Transformer for Universal Image Segmentation
- ](https://arxiv.org/abs/2112.01527) and first released in [this repository](https://github.com/facebookresearch/Mask2Former/).
-
- Disclaimer: The team releasing Mask2Former did not write a model card for this model so this model card has been written by the Hugging Face team.
-
- ## Model description
-
- Mask2Former addresses instance, semantic and panoptic segmentation with the same paradigm: by predicting a set of masks and corresponding labels. Hence, all 3 tasks are treated as if they were instance segmentation. Mask2Former outperforms the previous SOTA,
- [MaskFormer](https://arxiv.org/abs/2107.06278) both in terms of performance and efficiency by (i) replacing the pixel decoder with a more advanced multi-scale deformable attention Transformer, (ii) adopting a Transformer decoder with masked attention to boost performance
- without introducing additional computation and (iii) improving training efficiency by calculating the loss on subsampled points instead of whole masks.
-
- ![model image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/mask2former_architecture.png)
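
Because the model always emits the same two tensors per image (one class prediction and one mask per query), switching between the three tasks is only a matter of which post-processing method you call on the processor. Below is a minimal sketch of that idea, reusing the Cityscapes-semantic checkpoint from the example further down; the instance/panoptic calls only yield meaningful results for checkpoints fine-tuned with the matching annotations.

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

checkpoint = "facebook/mask2former-swin-large-cityscapes-semantic"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)  # class_queries_logits + masks_queries_logits

# The same set of per-query predictions can be post-processed for any of the three tasks
target_sizes = [image.size[::-1]]  # (height, width)
semantic_map = processor.post_process_semantic_segmentation(outputs, target_sizes=target_sizes)[0]
instance_result = processor.post_process_instance_segmentation(outputs, target_sizes=target_sizes)[0]
panoptic_result = processor.post_process_panoptic_segmentation(outputs, target_sizes=target_sizes)[0]
```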
-
- ## Intended uses & limitations
-
- You can use this particular checkpoint for semantic segmentation. See the [model hub](https://huggingface.co/models?search=mask2former) to look for other
- fine-tuned versions on a task that interests you.
-
- ### How to use
-
- Here is how to use this model:
-
- ```python
- import requests
- import torch
- from PIL import Image
- from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation
-
-
- # load Mask2Former fine-tuned on Cityscapes semantic segmentation
- processor = AutoImageProcessor.from_pretrained("facebook/mask2former-swin-large-cityscapes-semantic")
- model = Mask2FormerForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-large-cityscapes-semantic")
-
- url = "http://images.cocodataset.org/val2017/000000039769.jpg"
- image = Image.open(requests.get(url, stream=True).raw)
- inputs = processor(images=image, return_tensors="pt")
-
- with torch.no_grad():
-     outputs = model(**inputs)
-
- # model predicts class_queries_logits of shape `(batch_size, num_queries, num_labels + 1)`
- # and masks_queries_logits of shape `(batch_size, num_queries, height, width)`
- class_queries_logits = outputs.class_queries_logits
- masks_queries_logits = outputs.masks_queries_logits
-
- # you can pass them to processor for postprocessing
- predicted_semantic_map = processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
- # we refer to the demo notebooks for visualization (see "Resources" section in the Mask2Former docs)
- ```
-
- For more code examples, we refer to the [documentation](https://huggingface.co/docs/transformers/master/en/model_doc/mask2former).
 
+ ---
+ license: other
+ tags:
+ - vision
+ - image-segmentation
+ datasets:
+ - coco
+ widget:
+ - src: http://images.cocodataset.org/val2017/000000039769.jpg
+   example_title: Cats
+ - src: http://images.cocodataset.org/val2017/000000039770.jpg
+   example_title: Castle
+ ---
+
+
+
+ ### How to use
+
+ Here is how to use this model:
+
+ ```python
+ import torch
+ from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation
+ from PIL import Image
+ import matplotlib.pyplot as plt
+
+ # Load the processor and model
+ model_name = "saninmohammedn/mask2former-deployment"
+ processor = AutoImageProcessor.from_pretrained(model_name)
+ model = Mask2FormerForUniversalSegmentation.from_pretrained(model_name)
+
+ # Load an input image
+ image_path = "your_image.jpg"  # Replace with your image path
+ image = Image.open(image_path).convert("RGB")
+
+ # Prepare the image for the model
+ inputs = processor(images=image, return_tensors="pt")
+
+ # Perform inference
+ with torch.no_grad():
+     outputs = model(**inputs)
+
+ # Post-process the predicted segmentation map
+ predicted_map = processor.post_process_semantic_segmentation(
+     outputs, target_sizes=[image.size[::-1]]
+ )[0].cpu().numpy()
+
+ # Visualize the input and predicted segmentation map
+ plt.figure(figsize=(10, 5))
+
+ # Display original image
+ plt.subplot(1, 2, 1)
+ plt.imshow(image)
+ plt.title("Original Image")
+ plt.axis("off")
+
+ # Display predicted segmentation map
+ plt.subplot(1, 2, 2)
+ plt.imshow(predicted_map, cmap="jet")
+ plt.title("Predicted Segmentation Map")
+ plt.axis("off")
+
+ plt.tight_layout()
+ plt.show()
+
+ ```
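
To turn the integer class IDs in `predicted_map` into readable names, you can look them up in the model config. Here is a minimal sketch continuing from the snippet above, assuming this checkpoint's config has a populated `id2label` mapping (as the base Mask2Former checkpoints do):

```python
import numpy as np

# Print the label name for every class that appears in the prediction.
# Assumes `model` and `predicted_map` from the example above.
id2label = model.config.id2label
for class_id in np.unique(predicted_map):
    print(int(class_id), id2label.get(int(class_id), "unknown"))
```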