ragflow/deepdoc/parser/pdf_parser.py

Commit History

Added static check at PR CI (#3921)
fe9b6b3

zhichyu committed on

Fix errors detected by Ruff (#3918)
0404a52

zhichyu committed on

Fix: page_chars attribute does not exist in some formats of PDF (#3796)
3c857ed

cyhasuka committed on

Fix out of boundary. (#3786)
1275b47

Kevin Hu committed on

Edit chunk shall update instead of insert it (#3709)
1b2aab6

zhichyu committed on

Added kb_id filter to knn. Fix #3458 (#3513)
aebd986

zhichyu committed on

Move settings initialization after module init phase (#3438)
6101699

jinhai-2012 committed on

Use consistent log file names, introduced initLogger (#3403)
8bc2fc9

zhichyu committed on

Rework logging (#3358)
22fe41e

zhichyu committed on

bigger resolution for OCR (#2919)
7b6220c

Kevin Hu committed on

fix: torch dependency start error (#2777)
0de98c4

chongcb chongchuanbing Kevin Hu committed on

Fix: renrank_model and pdf_parser bugs | Update: session API (#2601)
678763e

liuhua liuhua committed on

add lighten control (#2567)
dbcbb17

Kevin Hu committed on

fix parsing spaces in russian language PDFs (#1987) (#2427)
bac5213

Hyperb0t committed on

Fix docx parser line bug (#1715)
dda4c86

H Kevin Hu committed on

fix: When parsing the bold content in PDF, the result is duplicated. (#1729)
971f83c

leecjnew committed on

Fix pdfparser content confusion (#1700)
eec0415

H committed on

pypdf2 to pypdf (#1684)
10534c3

Kevin Hu committed on

fix bug about divided by zero (#1482)
4e6516f

Kevin Hu committed on

fix: Delete hardcode (#1464)
789efbc

Yuhao Tsui committed on

fix pdf_paser char content confusion (#1462)
1164cba

H committed on

fix pdf_parser content confusion (#1458)
ece4f03

H committed on

Fix occasional errors in pdf table recognition (#1277)
b4b278b

aopstudio committed on

add self-rag (#1070)
a49657b

KevinHuSh committed on

Update readme and add license (#1018)
9cba22c

jinhai-2012 committed on

fix bug in pdf parser (#986)
eefeab4

KevinHuSh committed on

fix #917 #915 (#946)
c61bcde

KevinHuSh committed on

fixbug for computing 'not concating feature' (#896)
368b624

Franker11 committed on

fix coordinate error (#686)
d684fd5

KevinHuSh committed on

remove PyMuPDF (#618)
5db8a67

KevinHuSh committed on

refine code (#595)
cfd6ece

KevinHuSh committed on

fix exception in pdf parser (#584)
e319829

KevinHuSh committed on

refactor code (#583)
2d09c38

KevinHuSh committed on

Refactor (#537)
3069c36

KevinHuSh committed on

enlarge docker memory usage (#501)
3cefaa0

KevinHuSh committed on

fix divide by zero bug (#447)
8f4f7c1

KevinHuSh committed on

Bug fix pdf parse index out of range (#440)
f77c02e

加帆 committed on

rm page number exception for pdf parser (#424)
0e9dd76

KevinHuSh committed on

make sure the models will not be load twice (#422)
c9a1362

KevinHuSh committed on

let's load model from local (#163)
8ee4f9f

KevinHuSh committed on

apply pep8 formalize (#155)
79ada0b

KevinHuSh committed on

support snapshot download from local (#153)
8f39e7a

KevinHuSh committed on

fix plainPdf bugs (#152)
328b4c9

KevinHuSh committed on

refine page ranges (#147)
7f98e24

KevinHuSh committed on

add use layout or not option (#145)
b085dec

KevinHuSh committed on

refine manual parser (#140)
004756c

KevinHuSh committed on

refine for English corpus (#135)
08bab63

KevinHuSh committed on

fix github account login issue (#132)
89444d3

KevinHuSh committed on

refine manul parser (#131)
7d85666

KevinHuSh committed on

add dockerfile for cuda envirement. Refine table search strategy, (#123)
9fe9fc4

KevinHuSh committed on