Luiz G. A. Alves
lgaalves
AI & ML interests
Data Science, Time Series, Deep Learning, CV, and NLP
mixture-of-experts
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  Paper • 1701.06538 • Published • 7
- Sparse Networks from Scratch: Faster Training without Losing Performance
  Paper • 1907.04840 • Published • 3
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
  Paper • 1910.02054 • Published • 7
- A Mixture of h-1 Heads is Better than h Heads
  Paper • 2005.06537 • Published • 2
language-models
- Mistral 7B
  Paper • 2310.06825 • Published • 52
- BloombergGPT: A Large Language Model for Finance
  Paper • 2303.17564 • Published • 26
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 21
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 16
table-data-extraction
- PubTables-1M: Towards comprehensive table extraction from unstructured documents
  Paper • 2110.00061 • Published • 2
- Aligning benchmark datasets for table structure recognition
  Paper • 2303.00716 • Published
- GriTS: Grid table similarity metric for table structure recognition
  Paper • 2203.12555 • Published