ashishtanwer's Collections
Dataset
updated
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Paper • 2306.01116 • Published • 33
Viewer • Updated • 48.6B • 438k • 1.83k
Viewer • Updated • 968M • 20.4k • 829
Preview • Updated • 49.5k • 449
LLaMA: Open and Efficient Foundation Language Models
Paper • 2302.13971 • Published • 14
mosaicml/mpt-7b
Text Generation • Updated • 30.4k • 1.16k
togethercomputer/RedPajama-Data-V2
Updated • 3.24k • 359
stepfun-ai/GOT-OCR2_0
Image-Text-to-Text • Updated • 654k • 1.35k
Focus Anywhere for Fine-grained Multi-page Document Understanding
Paper • 2405.14295 • Published • 1
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
Paper • 2312.06109 • Published • 21
💬 GOT Online
llava-hf/llava-1.5-7b-hf
Image-Text-to-Text • Updated • 756k • 224
microsoft/OmniParser
Image-Text-to-Text • Updated • 1.61k • 1.54k
ColPali: Efficient Document Retrieval with Vision Language Models
Paper • 2407.01449 • Published • 43
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Paper • 2407.03320 • Published • 93
Viewer • Updated • 1.45M • 18.7k • 176