Add some metadata to the model card
Hello!
## Preface
Nice work here! I'm excited to see models of this size for RAG. I'm going to dive more into your official library next.
## Pull Request overview
* Add `transformers` tag
* Add `text-generation` pipeline tag
## Details
These tags should make the model a tad easier to find, for example when filtering by library or task.
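As a rough illustration of what this change does to the model card, here is a minimal sketch that inserts the new metadata lines just before the closing `---` of the YAML front matter. The helper name `add_front_matter_lines` and the shortened card text are assumptions made up for this example; they are not part of the repository or of any library.

```python
def add_front_matter_lines(card_text: str, new_lines: list[str]) -> str:
    """Insert raw YAML lines just before the closing '---' of the front matter."""
    lines = card_text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("model card has no YAML front matter")
    # Find the closing delimiter of the front matter block.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[:i] + new_lines + lines[i:])
    raise ValueError("front matter is not closed")


# Hypothetical, shortened stand-in for the real model card.
card = """---
language:
- es
base_model:
- PleIAs/Pleias-350m-Preview
---

# Pleias-RAG-350M
"""

updated = add_front_matter_lines(
    card,
    ["pipeline_tag: text-generation", "tags:", "- transformers"],
)
print(updated)
```

The sketch treats the front matter as plain text rather than parsing it as YAML, which keeps it dependency-free but means it does not check for keys that already exist.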
Also, as a heads-up: in my experience, it helps to include a small usage snippet in the model card so that people quickly get a feel for what your model does, e.g.:
```python
# Import path assumed from the pleias-rag-interface package; adjust if it differs
from pleias_rag_interface import RAGWithCitations

# Load the model
rag = RAGWithCitations("PleIAs/Pleias-RAG-350M")

# Define query and sources
query = "What is the capital of France?"
sources = [
    {
        "text": "Paris is the capital and most populous city of France.",
        "metadata": {"source": "Geographic Encyclopedia", "reliability": "high"},
    },
    {
        "text": "The Eiffel Tower is located in Paris, France.",
        "metadata": {"source": "Travel Guide", "year": 2020},
    },
]

# Generate a response
response = rag.generate(query, sources)

# Print the final answer with citations
print(response["processed"]["clean_answer"])
```
(Except also include the printed output)
- Tom Aarsen
@@ -8,6 +8,9 @@ language:
 - es
 base_model:
 - PleIAs/Pleias-350m-Preview
+pipeline_tag: text-generation
+tags:
+- transformers
 ---
 
 
@@ -91,4 +94,4 @@ With only 350 million parameters, Pleias-RAG-350M is classified among the *phone
 
 We also release an unquantized [GGUF version](https://huggingface.co/PleIAs/Pleias-RAG-350M-gguf) for deployment on CPU. Our internal performance benchmarks suggest that waiting times are currently acceptable for most either even under constrained RAM: about 20 seconds for a complex generation including reasoning traces on 8g RAM and below. Since the model is unquantized, quality of text generation should be identical to the original model.
 
-Once integrated into a RAG system, Pleias-RAG-350M can also be use in a broader range of non-conversational use cases including user support or educational assistance. Through this release, we aims to make tiny model workable in production by relying systematically on an externalized memory.
+Once integrated into a RAG system, Pleias-RAG-350M can also be use in a broader range of non-conversational use cases including user support or educational assistance. Through this release, we aims to make tiny model workable in production by relying systematically on an externalized memory.