.. _algorithm_table_recognition:

===========================
Table Recognition Algorithm
===========================


Introduction
============

Table recognition takes a table image as input, identifies the table's structure and content, and converts the result into a format such as ``LaTeX`` or ``HTML``.


Model Usage
===========

Once the environment is configured, run the table recognition algorithm by executing ``scripts/table_parsing.py`` with a configuration file.

.. code:: shell

   $ python scripts/table_parsing.py --config configs/table_parsing.yaml


Model Configuration
-------------------

.. code:: yaml

   inputs: assets/demo/table_parsing
   outputs: outputs/table_parsing
   tasks:
     table_parsing:
       model: table_parsing_struct_eqtable
       model_config:
         model_path: models/TabRec/StructEqTable
         max_new_tokens: 1024
         max_time: 30
         output_format: latex
         lmdeploy: False
         flash_attn: True


- ``inputs`` / ``outputs``: The input file path and the directory where table recognition results are written, respectively.
- ``tasks``: The task type. Currently there is only one task, table recognition.
- ``model``: The specific model type. Currently this is the `StructEqTable <https://github.com/UniModal4Reasoning/StructEqTable-Deploy>`_ table recognition model.
- ``model_config``: The model configuration, with the following keys:

  - ``model_path``: Path to the model weights.
  - ``max_new_tokens``: Maximum number of tokens to generate. The default is 1024 and the maximum supported value is 4096.
  - ``max_time``: Maximum runtime for the model, in seconds.
  - ``output_format``: Output format. The default is ``latex``; ``html`` and ``markdown`` are also available.
  - ``lmdeploy``: Whether to use LMDeploy for deployment. Currently set to ``False``.
  - ``flash_attn``: Whether to use flash attention. Only available on Ampere GPUs.

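
The limits described above can be checked before starting a run. The following is a minimal sketch and not part of PDF-Extract-Kit: ``validate_model_config`` is a hypothetical helper, and the rules it enforces are taken only from the descriptions above (a 4096-token ceiling, a positive ``max_time``, and the three output formats).

.. code:: python

   # Hypothetical pre-flight check; the names below are illustrative assumptions.
   ALLOWED_FORMATS = {"latex", "html", "markdown"}
   MAX_SUPPORTED_TOKENS = 4096


   def validate_model_config(model_config):
       """Return a list of human-readable problems found in ``model_config``."""
       problems = []

       # Fall back to the documented defaults when a key is missing.
       max_new_tokens = model_config.get("max_new_tokens", 1024)
       if not 0 < max_new_tokens <= MAX_SUPPORTED_TOKENS:
           problems.append(
               f"max_new_tokens must be in 1..{MAX_SUPPORTED_TOKENS}, got {max_new_tokens}"
           )

       max_time = model_config.get("max_time", 30)
       if max_time <= 0:
           problems.append(f"max_time must be positive, got {max_time}")

       output_format = model_config.get("output_format", "latex")
       if output_format not in ALLOWED_FORMATS:
           problems.append(f"unsupported output_format: {output_format!r}")

       return problems


   # Mirrors the model_config section of the YAML shown earlier.
   example_config = {
       "model_path": "models/TabRec/StructEqTable",
       "max_new_tokens": 1024,
       "max_time": 30,
       "output_format": "latex",
       "lmdeploy": False,
       "flash_attn": True,
   }
   print(validate_model_config(example_config))

An empty list means no problem was found. Returning every problem at once, rather than raising on the first one, lets a user fix the whole configuration in a single pass.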

Diverse Input Support
---------------------

The table recognition script in PDF-Extract-Kit accepts either a single table image or multiple table images as input.

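
One way to handle both input forms is to normalize them into a list of image paths. The sketch below is illustrative only and does not reproduce how ``scripts/table_parsing.py`` resolves ``inputs``: ``collect_table_images`` is a hypothetical helper, and it assumes that multiple images are supplied as a directory and that images use the suffixes listed in ``IMAGE_SUFFIXES``.

.. code:: python

   from pathlib import Path

   # Assumed set of image suffixes; adjust to the formats you actually use.
   IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


   def collect_table_images(inputs):
       """Return a list of image paths for a single file or a directory."""
       path = Path(inputs)
       if path.is_file():
           # A single image: returned as a one-element list, without a suffix check.
           return [path]
       if path.is_dir():
           # A directory: keep only files with a known image suffix, in sorted order.
           return sorted(
               p for p in path.iterdir()
               if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
           )
       raise FileNotFoundError(f"Input path does not exist: {inputs}")

With the configuration shown earlier, ``collect_table_images("assets/demo/table_parsing")`` would return the image files found directly in that directory, provided it exists.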

.. note::

   The StructEqTable model only runs on GPU devices.

.. note::

   Adjust ``max_new_tokens`` and ``max_time`` to suit the table content. The defaults are 1024 and 30 (seconds), respectively.

.. note::

   ``lmdeploy`` is an option for accelerated inference. If it is set to ``True``, LMDeploy is used for accelerated inference deployment.
   To use LMDeploy deployment, you need to install LMDeploy first. For installation instructions, refer to `LMDeploy <https://github.com/InternLM/lmdeploy>`_.
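
Because ``lmdeploy: True`` depends on a separate installation, it can help to confirm the package is importable before enabling the option. This is a minimal sketch under that assumption: ``lmdeploy_installed`` and ``resolve_lmdeploy`` are hypothetical helpers, not functions provided by PDF-Extract-Kit or LMDeploy, and the check only verifies that a top-level ``lmdeploy`` module can be found.

.. code:: python

   import importlib.util


   def lmdeploy_installed():
       """Return True if a top-level ``lmdeploy`` module can be located."""
       return importlib.util.find_spec("lmdeploy") is not None


   def resolve_lmdeploy(requested, installed):
       """Return the effective ``lmdeploy`` setting, failing early if it cannot be honored."""
       if requested and not installed:
           raise RuntimeError(
               "lmdeploy is set to True but the lmdeploy package is not installed"
           )
       return requested


   # With the default configuration (lmdeploy: False) this is always False.
   use_lmdeploy = resolve_lmdeploy(False, lmdeploy_installed())
   print(use_lmdeploy)

Failing early with a clear message is a design choice here; silently falling back to the non-accelerated path would also be reasonable if slower inference is acceptable.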