leo-q8 committed on
Commit b472451 · verified · 1 Parent(s): 54ab122

Create README.md (#1)

- Create README.md (96d84077226f8ce0673a226df8baf53f89f90abd)

Files changed (1)
  1. README.md +64 -0
README.md ADDED
@@ -0,0 +1,64 @@
+
+ ## Introduction
+
+ **PP-DocLayoutV2** is a dedicated lightweight model for layout analysis, focusing specifically on element detection, classification, and reading order
+ prediction.
+
+
+ ## Model Architecture
+
+ PP-DocLayoutV2 is composed of two sequentially connected networks. The first is an RT-DETR-based detection model that performs layout element detection and classification. The detected bounding boxes and class labels are then passed to a subsequent pointer network, which is responsible for ordering these layout elements.
+
+ <div align="center">
+ <img src="https://huggingface.co/datasets/PaddlePaddle/PaddleOCR-VL_demo/resolve/main/imgs/PP-DocLayoutV2.png" width="800"/>
+ </div>
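
The hand-off between the two stages can be pictured with a small sketch: stage 1 yields labelled boxes, and stage 2 returns an ordering over them. The code below is only an illustration of that interface. It swaps the learned pointer network for a naive top-to-bottom, left-to-right sort, and the labels and coordinates are hypothetical values, not real model output.

```python
from typing import List, Tuple

# (label, (x1, y1, x2, y2)) pairs: the kind of output stage 1 passes to stage 2.
Detection = Tuple[str, Tuple[float, float, float, float]]


def toy_reading_order(detections: List[Detection]) -> List[int]:
    """Return detection indices sorted by top edge, then by left edge.

    This is a naive geometric stand-in, NOT the learned pointer network.
    """
    return sorted(
        range(len(detections)),
        key=lambda i: (detections[i][1][1], detections[i][1][0]),
    )


# Hypothetical stage-1 output for a simple single-column page.
detections = [
    ("text", (50.0, 200.0, 550.0, 400.0)),
    ("doc_title", (50.0, 40.0, 550.0, 90.0)),
    ("image", (50.0, 420.0, 550.0, 700.0)),
]

order = toy_reading_order(detections)
print([detections[i][0] for i in order])  # ['doc_title', 'text', 'image']
```

A purely geometric sort like this breaks down on multi-column or irregular pages, which is the kind of case a learned ordering stage is meant to handle.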
+
+
+ ## Usage
+
+ ### Install Dependencies
+
+ Install [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick) and [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR):
+
+ ```bash
+ python -m pip install paddlepaddle-gpu==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
+ python -m pip install -U "paddleocr[doc-parser]"
+ python -m pip install https://paddle-whl.bj.bcebos.com/nightly/cu126/safetensors/safetensors-0.6.2.dev0-cp38-abi3-linux_x86_64.whl
+ ```
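
After running the commands above, a quick way to confirm that the packages landed in the active interpreter is to check whether their modules can be located. The sketch below uses only the standard library; the helper name `missing_modules` is made up for this example, and the import names (`paddle`, `paddleocr`, `safetensors`) are the ones these pip packages are expected to expose.

```python
import importlib.util

# Import names assumed to be provided by the pip packages above:
# paddlepaddle-gpu -> "paddle", paddleocr -> "paddleocr", safetensors -> "safetensors".
REQUIRED_MODULES = ["paddle", "paddleocr", "safetensors"]


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not importable yet:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that the modules are discoverable; it does not verify GPU or CUDA availability.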
+
+ > For Windows users, please use WSL or a Docker container.
+
+
+ ### Basic Usage
+
+ Python API usage:
+
+ ```python
+ from paddleocr import LayoutDetection
+
+ model = LayoutDetection(model_name="PP-DocLayoutV2")
+ output = model.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/layout.jpg", batch_size=1, layout_nms=True)
+ for res in output:
+     res.print()
+     res.save_to_img(save_path="./output/")
+     res.save_to_json(save_path="./output/res.json")
+ ```
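
Once results are saved, a typical next step is to filter and group the detections. The sketch below works on an in-memory sample rather than on real output. The schema it uses (a `boxes` list whose entries carry `label`, `score`, and `coordinate`) is an assumption made for illustration, so inspect the actual `res.json` and adjust the keys before relying on it. The helper name `group_confident_boxes` and the threshold are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical detections; the key names are an ASSUMED schema, not a documented one.
sample_result = {
    "boxes": [
        {"label": "text", "score": 0.98, "coordinate": [34.0, 65.0, 330.0, 420.0]},
        {"label": "table", "score": 0.42, "coordinate": [40.0, 450.0, 560.0, 700.0]},
        {"label": "text", "score": 0.91, "coordinate": [350.0, 65.0, 640.0, 420.0]},
    ]
}


def group_confident_boxes(result, min_score=0.5):
    """Group box coordinates by label, keeping detections with score >= min_score."""
    grouped = defaultdict(list)
    for box in result.get("boxes", []):
        if box["score"] >= min_score:
            grouped[box["label"]].append(box["coordinate"])
    return dict(grouped)


grouped = group_confident_boxes(sample_result)
for label, coords in grouped.items():
    print(label, len(coords))  # text 2
```

With the default threshold the low-scoring `table` entry is dropped; lowering `min_score` to 0.4 would keep it.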
+
+ **For more usage details and parameter explanations, see the [documentation](https://www.paddleocr.ai/latest/en/version3.x/module_usage/layout_analysis.html).**
+
+
+ ## Citation
+
+ If you find PaddleOCR-VL helpful, please consider giving us a star and citing our work.
+
+ ```bibtex
+ @misc{cui2025paddleocrvlboostingmultilingualdocument,
+       title={PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model},
+       author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Handong Zheng and Jing Zhang and Jun Zhang and Yi Liu and Dianhai Yu and Yanjun Ma},
+       year={2025},
+       eprint={2510.14528},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/2510.14528},
+ }
+ ```