KevinHuSh committed
Commit · e319829
1 Parent(s): 2d09c38
fix exception in pdf parser (#584)
### What problem does this PR solve?
#451
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
deepdoc/parser/pdf_parser.py
CHANGED
@@ -470,7 +470,8 @@ class RAGFlowPdfParser:
                 continue
 
             if re.match(r"[0-9]{2,3}/[0-9]{3}$", up["text"]) \
-                    or re.match(r"[0-9]{2,3}/[0-9]{3}$", down["text"])
+                    or re.match(r"[0-9]{2,3}/[0-9]{3}$", down["text"]) \
+                    or not down["text"].strip():
                 i += 1
                 continue
 
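The effect of the added `or not down["text"].strip()` clause can be illustrated with a small standalone sketch. The helper names `should_skip` and `first_char` are hypothetical and do not appear in `pdf_parser.py`. It is also an assumption that the exception in #451 came from later code indexing into the stripped text of `down`. The sketch only shows that a blank `down["text"]` now takes the skip branch, so any such indexing is never reached.

```python
import re

# Pattern copied from the diff: matches strings such as "12/345" or "123/456".
PAGE_FRACTION = re.compile(r"[0-9]{2,3}/[0-9]{3}$")


def should_skip(up, down):
    """Hypothetical helper mirroring the patched condition in the diff."""
    return bool(
        PAGE_FRACTION.match(up["text"])
        or PAGE_FRACTION.match(down["text"])
        or not down["text"].strip()  # clause added by this commit
    )


def first_char(box):
    # Assumed downstream operation (not taken from the diff):
    # indexing a blank string raises IndexError.
    return box["text"].strip()[0]


up = {"text": "Some paragraph text"}
blank = {"text": "   "}

# With the new clause, a blank box is skipped before first_char is called.
if not should_skip(up, blank):
    first_char(blank)
```

Without the third clause, `should_skip(up, blank)` would be false and the call to `first_char(blank)` would raise `IndexError`.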