Give the model a passage and it will generate a question about the passage.
I used the [flax summarization script](https://github.com/huggingface/transformers/tree/master/examples/flax/summarization) and a TPU v3-8. The summarization script expects a text column and a summary column; for question generation training, use the context column in place of the text column and the question column in place of the summary column, as sketched below.
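For example, a SQuAD-style dataset can be renamed to match those expectations with the `datasets` library before training (a minimal sketch; the `squad` dataset and the output path are illustrative, not necessarily what was used here):

```python
from datasets import load_dataset

# Illustrative SQuAD-style source: "context" holds the passage and
# "question" holds the target question.
ds = load_dataset("squad")

# Rename the columns to the names the summarization script expects.
ds = ds.rename_column("context", "text")
ds = ds.rename_column("question", "summary")

# Export a split, e.g. for the script's --train_file argument.
ds["train"].to_json("train.json")
```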
There is no guarantee that it will produce a question in the language of the passage, but it usually does. Lower-resource languages will likely produce lower-quality questions.

## Using the model

#### PyTorch version
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("nbroad/mt5-base-qgen")
# This checkpoint was trained in Flax, so convert the weights on load.
model = AutoModelForSeq2SeqLM.from_pretrained("nbroad/mt5-base-qgen", from_flax=True)

text = (
    "Hugging Face has seen rapid growth in its popularity since the get-go. "
    "It is definitely doing the right things to attract more and more people "
    "to its platform, some of which are on the following lines: Community "
    "driven approach through large open source repositories along with paid "
    "services. Helps to build a network of like-minded people passionate "
    "about open source. Attractive price point. The subscription-based "
    "features, e.g.: Inference based API, starts at a price of $9/month."
)

inputs = tokenizer(text, return_tensors="pt")
output = model.generate(**inputs, max_length=40)

tokenizer.decode(output[0], skip_special_tokens=True)
# What is Hugging Face's price point?
```
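For one-off experiments, the high-level `pipeline` API can wrap the model and tokenizer loaded above (a minimal sketch; `text2text-generation` is the pipeline task for seq2seq models like mT5):

```python
from transformers import pipeline

# Reuse the model and tokenizer from the snippet above; the pipeline
# handles tokenization, generation, and decoding in one call.
qgen = pipeline("text2text-generation", model=model, tokenizer=tokenizer)
qgen(text, max_length=40)
# [{'generated_text': '...'}]
```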
#### Flax version

```python
from transformers import AutoTokenizer, FlaxAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("nbroad/mt5-base-qgen")
model = FlaxAutoModelForSeq2SeqLM.from_pretrained("nbroad/mt5-base-qgen")

text = (
    "A un año y tres días de que el balón ruede en el Al Bayt Stadium "
    "inaugurando el Mundial 2022, ya se han dibujado los primeros bocetos "
    "de la próxima Copa del Mundo. 13 selecciones están colocadas en el "
    "mapa con la etiqueta de clasificadas y tienen asegurado pisar los "
    "verdes de Qatar en la primera fase final otoñal. Serbia, Dinamarca, "
    "España, Países Bajos, Suiza, Croacia, Francia, Inglaterra, Bélgica, "
    "Alemania, Brasil, Argentina y Qatar, como anfitriona, entrarán en el "
    "sorteo del 1 de abril de 2022 en Doha en el que 32 países serán "
    "repartidos en sus respectivos grupos."
)

# Flax models expect NumPy/JAX arrays, not PyTorch tensors.
inputs = tokenizer(text, return_tensors="np")
output = model.generate(**inputs, max_length=40)

tokenizer.decode(output["sequences"][0], skip_special_tokens=True)
# ¿Cuántos países entrarán en el sorteo del Mundial 2022?
```
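Both `generate` calls above use greedy decoding; question quality can sometimes improve with beam search via the standard `num_beams` argument (a sketch against the Flax snippet above):

```python
# Explore several candidate questions instead of the single greedy path.
output = model.generate(**inputs, max_length=40, num_beams=4)
tokenizer.decode(output["sequences"][0], skip_special_tokens=True)
```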
Model trained on Cloud TPUs from Google's TPU Research Cloud (TRC)