Layout detection mode?
Can the model be used only to extract bounding boxes with layout labels, without any text? In other words, can it be used purely for layout detection?
u_u
Hello Prudant,
It seems the intention is to integrate this model with the LayoutDetector as part of the broader Docling project. While exploring the Docling repositories, I came across docling-ibm-models, which I believe will eventually be merged into the main project, allowing us to extract text and classify layout elements at the same time.
In that repository, you can find an example demonstrating how to detect the layout of a page (bounding boxes with layout labels) without extracting text. The usage is straightforward:
python -m demo.demo_layout_predictor -i <input_dir> -v <viz_dir>
Hello,
The SmolDocling model is already integrated in the docling project; please check our updated README. SmolDocling can be used as an alternative conversion path that replaces all the other specific models in docling's standard pipeline. Using SmolDocling purely for layout analysis is possible, but it is currently not efficient, since the model always generates the content along with the structure. Training SmolDocling to output only the structure tokens (including location) through a different query is part of our future plans and experimentation.
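Until then, one workaround is to run the normal conversion and discard the text afterwards. Below is a minimal sketch of that post-processing step. It assumes the model output is a DocTags-style string in which each element looks like `<label><loc_..><loc_..><loc_..><loc_..>content</label>`; that tag shape, the helper name `extract_layout`, and the sample string are assumptions for illustration, not part of the docling API. Note that this does not save any generation time, because the content tokens are still produced.

```python
import re
from typing import List, NamedTuple


class LayoutBox(NamedTuple):
    label: str
    left: int
    top: int
    right: int
    bottom: int


# Assumed element shape: <label><loc_L><loc_T><loc_R><loc_B>content</label>
# The lookahead keeps a location token from being mistaken for a label.
ELEMENT_RE = re.compile(
    r"<(?P<label>(?!loc_)[a-z_0-9]+)>"
    r"<loc_(?P<l>\d+)><loc_(?P<t>\d+)><loc_(?P<r>\d+)><loc_(?P<b>\d+)>"
)


def extract_layout(doctags: str) -> List[LayoutBox]:
    """Return (label, box) entries and ignore all text content."""
    return [
        LayoutBox(m["label"], int(m["l"]), int(m["t"]), int(m["r"]), int(m["b"]))
        for m in ELEMENT_RE.finditer(doctags)
    ]


# Hypothetical sample output, written by hand for this example.
sample = (
    "<doctag>"
    "<section_header_level_1><loc_57><loc_46><loc_443><loc_59>"
    "Introduction</section_header_level_1>"
    "<text><loc_57><loc_70><loc_443><loc_120>Some paragraph text.</text>"
    "</doctag>"
)

boxes = extract_layout(sample)
for box in boxes:
    print(box.label, box.left, box.top, box.right, box.bottom)
```

The location values are returned as the raw integers from the tokens. Converting them to pixel coordinates depends on the coordinate grid the model uses, which this sketch deliberately does not assume.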
Thanks!