---
license: apache-2.0
---

### Generating Questions Given Context and Answers

The original BART model is not pre-trained on question generation (QG) tasks. We fine-tuned `facebook/bart-large` on 55k human-created question-answer pairs with contexts, collected by [Demszky et al. (2018)](https://arxiv.org/abs/1809.02922). The data combine SQuAD and QA2D question-answer pairs, each associated with a context.

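The model card does not include the training script; the following is a minimal sketch of one fine-tuning step under this setup. The `(context, answer, question)` triple, the optimizer, and the learning rate are hypothetical, and the inputs mirror the sentence-pair encoding used at inference below.

```python
from transformers import BartForConditionalGeneration, BartTokenizer
import torch

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')

# Hypothetical training triple; the real data are the SQuAD/QA2D pairs above.
context = "The Thug cult resides at the Pankot Palace."
answer = "The Thug cult"
question = "Who resides at the Pankot Palace?"

# Context and answer are encoded as a sentence pair; the human-written
# question is the generation target.
inputs = tokenizer(context, answer, max_length=512,
                   truncation=True, return_tensors='pt')
labels = tokenizer(question, max_length=64,
                   truncation=True, return_tensors='pt').input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# One gradient step; in practice this loops over the full 55k-pair dataset.
model.train()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```
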
### How to use

Here is how to use this model in PyTorch:

```python
from transformers import BartForConditionalGeneration, BartTokenizer
import torch

tokenizer = BartTokenizer.from_pretrained('uzw/bart-large-question-generation')
model = BartForConditionalGeneration.from_pretrained('uzw/bart-large-question-generation')

context = "The Thug cult resides at the Pankot Palace."
answer = "The Thug cult"

# Encode the context and answer as a sentence pair
inputs = tokenizer.encode_plus(
    context,
    answer,
    max_length=512,
    padding='max_length',
    truncation=True,
    return_tensors='pt'
)

with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=64,           # maximum length of a generated question
        num_return_sequences=3,  # generate multiple questions
        do_sample=True,          # enable sampling for diversity
        temperature=0.7          # control randomness of generation
    )

generated_questions = tokenizer.batch_decode(
    generated_ids,
    skip_special_tokens=True
)

for i, question in enumerate(generated_questions, 1):
    print(f"Generated Question {i}: {question}")
```

Adjust the `num_return_sequences` parameter to generate multiple candidate questions; with `do_sample=True`, each returned sequence is sampled independently, so the questions vary in phrasing.
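
If you prefer deterministic candidates over sampled ones, the same model works with beam search; the `num_beams` value below is illustrative, not a setting recommended by the model authors:

```python
with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=64,
        num_beams=5,             # beam search instead of sampling
        num_return_sequences=3,  # must not exceed num_beams
        early_stopping=True      # stop once all beams have finished
    )
```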