Better documentation
README.md
CHANGED
@@ -1,9 +1,93 @@
---
license: cc0-1.0
tags:
- art
---

# DeepLabV3+ ResNet50 for human body parts segmentation

This is a very simple ONNX model that can segment human body parts.

## Why this model

This model is an ONNX conversion of [keras-io/deeplabv3p-resnet50](https://huggingface.co/keras-io/deeplabv3p-resnet50),
whose provided model can segment human body parts. All the other models that I found were trained on
city scenes.

The original model was built for an old version of Keras and cannot be used with recent versions of TensorFlow,
so I converted it to the ONNX format.

## Usage

Get the `deeplabv3p-resnet50-human.onnx` file and run it with the ONNX Runtime (`onnxruntime`) package.

`model.run` returns a list with a single array. Converted to a NumPy array, the result is a `(1, 1, 512, 512, 21)` tensor:

- 1: number of outputs (you can squeeze it)
- 1: batch size (you can squeeze it)
- 512, 512: the size of the image (fixed)
- 21: number of classes, so you can take the `argmax` of the tensor along this axis to get the class of each pixel
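
The shape handling described in this list can be tried without the model. The sketch below is only an illustration: it uses a random array as a stand-in for the real output (an assumption, not actual predictions) and shows how the two leading axes are squeezed and how `argmax` reduces the class axis.

```python
import numpy as np

# Synthetic stand-in for the list returned by `model.run`: a single output
# array of shape (1, 512, 512, 21). Random values, not real predictions.
rng = np.random.default_rng(0)
outputs = [rng.random((1, 512, 512, 21), dtype=np.float32)]

stacked = np.array(outputs)             # (1, 1, 512, 512, 21)
scores = stacked.squeeze(axis=(0, 1))   # drop "outputs" and "batch": (512, 512, 21)
labels = scores.argmax(axis=-1)         # best class per pixel: (512, 512)

print(stacked.shape, scores.shape, labels.shape)
```

Each value in `labels` is a class index between 0 and 20.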

```python
import sys

import numpy as np
import onnxruntime
from PIL import Image

# load the ONNX model
model = onnxruntime.InferenceSession("deeplabv3p-resnet50-human.onnx")

# load the image as RGB, resize it to the fixed input size, scale to [-1, 1]
img = Image.open(sys.argv[1] if len(sys.argv) > 1 else "image.jpg").convert("RGB")
img = img.resize((512, 512))
img = np.array(img).astype(np.float32) / 127.5 - 1
# add the batch dimension, matching the batched output: (1, 512, 512, 3)
img = np.expand_dims(img, axis=0)

input_name = model.get_inputs()[0].name
output_name = model.get_outputs()[0].name
result = model.run([output_name], {input_name: img})
result = np.array(result[0])  # (1, 512, 512, 21)
# argmax the classes, remove the batch size
result = result.argmax(axis=3).squeeze(0)

# get the masks
for i in range(21):
    detected = result == i  # get the detected pixels for the class i
    # detected is a 512, 512 boolean array
    mask = np.zeros((512, 512), dtype=np.uint8)  # 8-bit grayscale mask
    mask[detected] = 255
    Image.fromarray(mask).show()  # or save, or return the mask...
```
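
Opening one window per class is handy for debugging, but a single color-coded image is often easier to read. The following is a minimal sketch of that idea, not part of the original model card: `make_palette` and `colorize` are hypothetical helper names, the colors are arbitrary, and the label map is synthetic so the snippet runs without the model.

```python
import numpy as np

NUM_CLASSES = 21  # from the output shape described above


def make_palette(num_classes: int = NUM_CLASSES, seed: int = 0) -> np.ndarray:
    """Return a (num_classes, 3) uint8 array of arbitrary colors; class 0 is black."""
    rng = np.random.default_rng(seed)
    palette = rng.integers(0, 256, size=(num_classes, 3), dtype=np.uint8)
    palette[0] = 0  # keep the background black
    return palette


def colorize(labels: np.ndarray, palette: np.ndarray) -> np.ndarray:
    """Map a (H, W) array of class indices to a (H, W, 3) uint8 RGB image."""
    return palette[labels]


# Synthetic label map standing in for the (512, 512) `result` array.
labels = np.zeros((512, 512), dtype=np.int64)
labels[100:200, 150:250] = 13  # pretend a face was detected here
overlay = colorize(labels, make_palette())
print(overlay.shape, overlay.dtype)  # (512, 512, 3) uint8
```

With a real prediction, `Image.fromarray(colorize(result, make_palette()))` should give one RGB image in which every class has its own color.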

## Class indices

This is the list of classes that the model can detect (some classes are not identified yet, see the known limitations below):

- 0: "background"
- 1: "unknown"
- 2: "hair"
- 3: "unknown"
- 4: "glasses"
- 5: "top-clothes"
- 6: "unknown"
- 7: "unknown"
- 8: "unknown"
- 9: "bottom-clothes"
- 10: "torso-skin"
- 11: "unknown"
- 12: "unknown"
- 13: "face"
- 14: "left-arm"
- 15: "right-arm"
- 16: "left-leg"
- 17: "right-leg"
- 18: "left-foot"
- 19: "right-foot"
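
To turn a predicted label map into readable names, the list above can be stored as a Python list indexed by class. This is a hedged sketch: `CLASS_NAMES` and `summarize` are hypothetical names, the label map is synthetic, and index 20 (which the model outputs but the list does not mention) is marked "unknown" as an assumption.

```python
import numpy as np

# Names copied from the list above. Index 20 is not listed there, so it is
# labeled "unknown" here as an assumption.
CLASS_NAMES = [
    "background", "unknown", "hair", "unknown", "glasses", "top-clothes",
    "unknown", "unknown", "unknown", "bottom-clothes", "torso-skin",
    "unknown", "unknown", "face", "left-arm", "right-arm", "left-leg",
    "right-leg", "left-foot", "right-foot", "unknown",
]


def summarize(labels: np.ndarray) -> dict[str, int]:
    """Return {"index:name": pixel_count} for every class present in `labels`."""
    counts = np.bincount(labels.ravel(), minlength=len(CLASS_NAMES))
    return {f"{i}:{CLASS_NAMES[i]}": int(n) for i, n in enumerate(counts) if n > 0}


# Synthetic label map standing in for a real prediction.
labels = np.zeros((512, 512), dtype=np.int64)
labels[0:10, 0:10] = 2    # 100 "hair" pixels
labels[20:30, 0:5] = 13   # 50 "face" pixels
print(summarize(labels))
```

The index is kept in each key because several classes share the name "unknown".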

## Known limitations

- The model can fail on portrait images, because it was trained on "full body" images.
- I don't know what some of the classes are, and I can't find the list of classes (help is welcome!).
- The model is not perfect and can fail on some images. I'm not the author of the model, so I can't fix it.

## License

The [original model card](https://huggingface.co/keras-io/deeplabv3p-resnet50/blob/main/README.md) proposes the "CC0-1.0"
license. I don't know if it's the right license for the model, but I keep it.

> Anyway, thanks to the authors of the model for sharing it and leaving it open to use.

This means that you may use, share, modify, and distribute the model without any restriction.