Tingquan committed on
Commit
5d02f88
·
verified ·
1 Parent(s): 082de5b

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +39 -27
README.md CHANGED
@@ -1,5 +1,14 @@
1
  ---
2
  license: apache-2.0
 
 
 
 
 
 
 
 
 
3
  ---
4
 
5
  # PP-DocBee2-3B
@@ -53,16 +62,19 @@ You can quickly experience the functionality with a single command:
53
  ```bash
54
  paddleocr doc_vlm \
55
  --model_name PP-DocBee2-3B \
56
- -i "{'image': 'https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png', 'query': '识别这份表格的内容, markdown格式输出'}"
57
  ```
58
 
59
- You can also integrate the model inference of the text recognition module into your project. Before running the following code, please download the sample image to your local machine.
60
 
61
  ```python
62
  from paddleocr import DocVLM
63
  model = DocVLM(model_name="PP-DocBee2-3B")
64
  results = model.predict(
65
- input={"image": "medal_table.png", "query": "识别这份表格的内容, 以markdown格式输出"},
 
 
 
66
  batch_size=1
67
  )
68
  for res in results:
@@ -73,29 +85,29 @@ for res in results:
73
  After running, the obtained result is as follows:
74
 
75
  ```bash
76
- {'res': {'image': 'medal_table.png', 'query': '识别这份表格的内容, 以markdown格式输出', 'result': '| 名次 | 国家/地区 | 金牌 | 银牌 | 铜牌 | 奖牌总数 |\n| --- | --- | --- | --- | --- | --- |\n| 1 | 中国(CHN | 48 | 22 | 30 | 100 |\n| 2 | 美国(USA | 36 | 39 | 37 | 112 |\n| 3 | 俄罗斯(RUS | 24 | 13 | 23 | 60 |\n| 4 | 英国(GBR | 19 | 13 | 19 | 51 |\n| 5 | 德国(GER | 16 | 11 | 14 | 41 |\n| 6 | 澳大利亚(AUS | 14 | 15 | 17 | 46 |\n| 7 | 韩国(KOR | 13 | 11 | 8 | 32 |\n| 8 | 日本(JPN | 9 | 8 | 8 | 25 |\n| 9 | 意大利(ITA | 8 | 9 | 10 | 27 |\n| 10 | 法国(FRA | 7 | 16 | 20 | 43 |\n| 11 | 荷兰(NED | 7 | 5 | 4 | 16 |\n| 12 | 乌克兰(UKR | 7 | 4 | 11 | 22 |\n| 13 | 肯尼亚(KEN | 6 | 4 | 6 | 16 |\n| 14 | 西班牙(ESP | 5 | 11 | 3 | 19 |\n| 15 | 牙买加(JAM | 5 | 4 | 2 | 11 |\n'}}
77
  ```
78
 
79
  The visualized result is as follows:
80
 
81
  ```bash
82
- | 名次 | 国家/地区 | 金牌 | 银牌 | 铜牌 | 奖牌总数 |
83
- | --- | --- | --- | --- | --- | --- |
84
- | 1 | 中国(CHN | 48 | 22 | 30 | 100 |
85
- | 2 | 美国(USA | 36 | 39 | 37 | 112 |
86
- | 3 | 俄罗斯(RUS | 24 | 13 | 23 | 60 |
87
- | 4 | 英国(GBR | 19 | 13 | 19 | 51 |
88
- | 5 | 德国(GER | 16 | 11 | 14 | 41 |
89
- | 6 | 澳大利亚(AUS | 14 | 15 | 17 | 46 |
90
- | 7 | 韩国(KOR | 13 | 11 | 8 | 32 |
91
- | 8 | 日本(JPN | 9 | 8 | 8 | 25 |
92
- | 9 | 意大利(ITA | 8 | 9 | 10 | 27 |
93
- | 10 | 法国(FRA | 7 | 16 | 20 | 43 |
94
- | 11 | 荷兰(NED | 7 | 5 | 4 | 16 |
95
- | 12 | 乌克兰(UKR | 7 | 4 | 11 | 22 |
96
- | 13 | 肯尼亚(KEN | 6 | 4 | 6 | 16 |
97
- | 14 | 西班牙(ESP | 5 | 11 | 3 | 19 |
98
- | 15 | 牙买加(JAM | 5 | 4 | 2 | 11 |
99
  ```
100
 
101
  For details about usage command and descriptions of parameters, please refer to the [Document](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/module_usage/doc_vlm.html#iii-quick-start).
@@ -112,18 +124,18 @@ The document understanding pipeline is an advanced document processing technolog
112
  Run a single command to quickly experience the OCR pipeline:
113
 
114
  ```bash
115
- paddleocr doc_understanding -i "{'image': 'https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png', 'query': '识别这份表格的内容, markdown格式输出'}"
116
  ```
117
 
118
  Results are printed to the terminal:
119
 
120
  ```json
121
- {'res': {'image': 'https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png', 'query': '识别这份表格的内容, 以markdown格式输出', 'result': '| 名次 | 国家/地区 | 金牌 | 银牌 | 铜牌 | 奖牌总数 |\n| --- | --- | --- | --- | --- | --- |\n| 1 | 中国(CHN | 48 | 22 | 30 | 100 |\n| 2 | 美国(USA | 36 | 39 | 37 | 112 |\n| 3 | 俄罗斯(RUS | 24 | 13 | 23 | 60 |\n| 4 | 英国(GBR | 19 | 13 | 19 | 51 |\n| 5 | 德国(GER | 16 | 11 | 14 | 41 |\n| 6 | 澳大利亚(AUS | 14 | 15 | 17 | 46 |\n| 7 | 韩国(KOR | 13 | 11 | 8 | 32 |\n| 8 | 日本(JPN | 9 | 8 | 8 | 25 |\n| 9 | 意大利(ITA | 8 | 9 | 10 | 27 |\n| 10 | 法国(FRA | 7 | 16 | 20 | 43 |\n| 11 | 荷兰(NED | 7 | 5 | 4 | 16 |\n| 12 | 乌克兰(UKR | 7 | 4 | 11 | 22 |\n| 13 | 肯尼亚(KEN | 6 | 4 | 6 | 16 |\n| 14 | 西班牙(ESP | 5 | 11 | 3 | 19 |\n| 15 | 牙买加(JAM | 5 | 4 | 2 | 11 |\n'}}
122
  ```
123
 
124
  If save_path is specified, the visualization results will be saved under `save_path`. The visualization output is shown below:
125
 
126
- ![image/png](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/doc_understanding/doc_understanding.png)
127
 
128
  The command-line method is for quick experience. For project integration, only a few lines of code are needed:
129
 
@@ -135,8 +147,8 @@ pipeline = DocUnderstanding(
135
  )
136
  output = pipeline.predict(
137
  {
138
- "image": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png",
139
- "query": "识别这份表格的内容, markdown格式输出"
140
  }
141
  )
142
  for res in output:
@@ -144,7 +156,7 @@ for res in output:
144
  res.save_to_json("./output/")
145
  ```
146
 
147
- The default model used in pipeline is `PP-DocBee2-3B`, so it is not necessary that specifing to `PP-DocBee2-3B` by argument `doc_understanding_model_name`. But you can use the local model file by argument `doc_understanding_model_dir`. For details about usage command and descriptions of parameters, please refer to the [Document](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/pipeline_usage/doc_understanding.html#2-quick-start).
148
 
149
  ## Links
150
 
 
1
  ---
2
  license: apache-2.0
3
+ library_name: PaddleOCR
4
+ language:
5
+ - en
6
+ - zh
7
+ pipeline_tag: image-to-text
8
+ tags:
9
+ - OCR
10
+ - PaddlePaddle
11
+ - PaddleOCR
12
  ---
13
 
14
  # PP-DocBee2-3B
 
62
  ```bash
63
  paddleocr doc_vlm \
64
  --model_name PP-DocBee2-3B \
65
+ -i "{'image': 'https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png', 'query': 'Recognize the content of this table and output it in markdown format.'}"
66
  ```
67
 
68
+ You can also integrate the model inference of the document visual-language module into your project. Before running the following code, please download the sample image to your local machine.
69
 
70
  ```python
71
  from paddleocr import DocVLM
72
  model = DocVLM(model_name="PP-DocBee2-3B")
73
  results = model.predict(
74
+ input={
75
+ "image": "https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png",
76
+ "query": "Recognize the content of this table and output it in markdown format."
77
+ },
78
  batch_size=1
79
  )
80
  for res in results:
 
85
  After running, the obtained result is as follows:
86
 
87
  ```bash
88
+ {'res': {'image': 'medal_table_en.png', 'query': 'Recognize the content of this table and output it in markdown format', 'result': '| Rank | Country/Region | Gold | Silver | Bronze | Total Medals |\n|---|---|---|---|---|---|\n| 1 | China (CHN) | 48 | 22 | 30 | 100 |\n| 2 | United States (USA) | 36 | 39 | 37 | 112 |\n| 3 | Russia (RUS) | 24 | 13 | 23 | 60 |\n| 4 | Great Britain (GBR) | 19 | 13 | 19 | 51 |\n| 5 | Germany (GER) | 16 | 11 | 14 | 41 |\n| 6 | Australia (AUS) | 14 | 15 | 17 | 46 |\n| 7 | South Korea (KOR) | 13 | 11 | 8 | 32 |\n| 8 | Japan (JPN) | 9 | 8 | 8 | 25 |\n| 9 | Italy (ITA) | 8 | 9 | 10 | 27 |\n| 10 | France (FRA) | 7 | 16 | 20 | 43 |\n| 11 | Netherlands (NED) | 7 | 5 | 4 | 16 |\n| 12 | Ukraine (UKR) | 7 | 4 | 11 | 22 |\n| 13 | Kenya (KEN) | 6 | 4 | 6 | 16 |\n| 14 | Spain (ESP) | 5 | 11 | 3 | 19 |\n| 15 | Jamaica (JAM) | 5 | 4 | 2 | 11 |\n'}}
89
  ```
90
 
91
  The visualized result is as follows:
92
 
93
  ```bash
94
+ | Rank | Country/Region | Gold | Silver | Bronze | Total Medals |
95
+ |---|---|---|---|---|---|
96
+ | 1 | China (CHN) | 48 | 22 | 30 | 100 |
97
+ | 2 | United States (USA) | 36 | 39 | 37 | 112 |
98
+ | 3 | Russia (RUS) | 24 | 13 | 23 | 60 |
99
+ | 4 | Great Britain (GBR) | 19 | 13 | 19 | 51 |
100
+ | 5 | Germany (GER) | 16 | 11 | 14 | 41 |
101
+ | 6 | Australia (AUS) | 14 | 15 | 17 | 46 |
102
+ | 7 | South Korea (KOR) | 13 | 11 | 8 | 32 |
103
+ | 8 | Japan (JPN) | 9 | 8 | 8 | 25 |
104
+ | 9 | Italy (ITA) | 8 | 9 | 10 | 27 |
105
+ | 10 | France (FRA) | 7 | 16 | 20 | 43 |
106
+ | 11 | Netherlands (NED) | 7 | 5 | 4 | 16 |
107
+ | 12 | Ukraine (UKR) | 7 | 4 | 11 | 22 |
108
+ | 13 | Kenya (KEN) | 6 | 4 | 6 | 16 |
109
+ | 14 | Spain (ESP) | 5 | 11 | 3 | 19 |
110
+ | 15 | Jamaica (JAM) | 5 | 4 | 2 | 11 |
111
  ```
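
The `result` field shown above is a plain markdown table string, so downstream code usually needs to turn it into structured rows. A minimal sketch follows, assuming cells contain no escaped `|` characters. `parse_markdown_table` is a hypothetical helper written for illustration and is not part of PaddleOCR.

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Convert a pipe-delimited markdown table into a list of row dicts."""
    lines = [line.strip() for line in markdown.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        # A table needs at least a header row and a separator row.
        return []

    def split_row(line: str) -> list[str]:
        # Drop the outer pipes, then split on the inner ones.
        return [cell.strip() for cell in line.strip("|").split("|")]

    header = split_row(lines[0])
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| separator
        rows.append(dict(zip(header, split_row(line))))
    return rows


# First rows of the 'result' string shown above, used as sample input.
sample = (
    "| Rank | Country/Region | Gold | Silver | Bronze | Total Medals |\n"
    "|---|---|---|---|---|---|\n"
    "| 1 | China (CHN) | 48 | 22 | 30 | 100 |\n"
    "| 2 | United States (USA) | 36 | 39 | 37 | 112 |\n"
)

rows = parse_markdown_table(sample)
print(rows[0]["Country/Region"], rows[0]["Gold"])
```

All values stay as strings; convert the medal counts with `int()` if you need arithmetic on them.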
112
 
113
  For details about usage command and descriptions of parameters, please refer to the [Document](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/module_usage/doc_vlm.html#iii-quick-start).
 
124
  Run a single command to quickly experience the OCR pipeline:
125
 
126
  ```bash
127
+ paddleocr doc_understanding -i "{'image': 'https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png', 'query': 'Recognize the content of this table and output it in markdown format.'}"
128
  ```
129
 
130
  Results are printed to the terminal:
131
 
132
  ```json
133
+ {'res': {'image': 'medal_table_en.png', 'query': 'Recognize the content of this table and output it in markdown format', 'result': '| Rank | Country/Region | Gold | Silver | Bronze | Total Medals |\n|---|---|---|---|---|---|\n| 1 | China (CHN) | 48 | 22 | 30 | 100 |\n| 2 | United States (USA) | 36 | 39 | 37 | 112 |\n| 3 | Russia (RUS) | 24 | 13 | 23 | 60 |\n| 4 | Great Britain (GBR) | 19 | 13 | 19 | 51 |\n| 5 | Germany (GER) | 16 | 11 | 14 | 41 |\n| 6 | Australia (AUS) | 14 | 15 | 17 | 46 |\n| 7 | South Korea (KOR) | 13 | 11 | 8 | 32 |\n| 8 | Japan (JPN) | 9 | 8 | 8 | 25 |\n| 9 | Italy (ITA) | 8 | 9 | 10 | 27 |\n| 10 | France (FRA) | 7 | 16 | 20 | 43 |\n| 11 | Netherlands (NED) | 7 | 5 | 4 | 16 |\n| 12 | Ukraine (UKR) | 7 | 4 | 11 | 22 |\n| 13 | Kenya (KEN) | 6 | 4 | 6 | 16 |\n| 14 | Spain (ESP) | 5 | 11 | 3 | 19 |\n| 15 | Jamaica (JAM) | 5 | 4 | 2 | 11 |\n'}}
134
  ```
135
 
136
  If save_path is specified, the visualization results will be saved under `save_path`. The visualization output is shown below:
137
 
138
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/kFGo9nlHuHs2uyN1voSTg.png)
139
 
140
  The command-line method is for quick experience. For project integration, only a few lines of code are needed:
141
 
 
147
  )
148
  output = pipeline.predict(
149
  {
150
+ "image": "https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png",
151
+ "query": "Recognize the content of this table and output it in markdown format."
152
  }
153
  )
154
  for res in output:
 
156
  res.save_to_json("./output/")
157
  ```
158
 
159
+ The default model used in the pipeline is `PP-DocBee2-3B`, so you don't have to pass `PP-DocBee2-3B` to the `doc_understanding_model_name` argument. To use a local model instead, pass its directory through the `doc_understanding_model_dir` argument. For details about the usage commands and parameter descriptions, please refer to the [Document](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/pipeline_usage/doc_understanding.html#2-quick-start).
160
 
161
  ## Links
162