---
base_model: TurkuNLP/gpt3-finnish-xl
license: apache-2.0
datasets:
- TurkuNLP/squad_v2_fi
language:
- fi
pipeline_tag: text-generation
---
# Model Card for Futurice/gpt3-finnish-xl-instruct
gpt3-finnish-xl-instruct is an instruction fine-tuned model intended for retrieval-augmented generation (RAG) style Q&A in Finnish.
## Model Details
### Model Description
The gpt3-finnish-xl-instruct model is based on the TurkuNLP Finnish GPT-3 models, a family of pretrained monolingual GPT-style language models built on the BLOOM architecture.
The model was fine-tuned on a sample of the TurkuNLP/squad_v2_fi dataset, which is a DeepL translation of SQuAD2.0 into Finnish.
- **Developed by:** Martti Sutinen
- **Model type:** BLOOM
- **Language(s) (NLP):** Finnish
- **License:** Apache-2.0
- **Finetuned from model:** TurkuNLP/gpt3-finnish-xl
## Uses
Intended for RAG type Q&A in Finnish.
### Direct Use
Intended for text generation and RAG type Q&A in Finnish. Supply a context and ask a question about it.
### Out-of-Scope Use
Please do not misuse the model. It is not recommended for use cases other than those described above.
## Bias, Risks, and Limitations
A key limitation is the small and simple selection of fine-tuning data. Do not expect high-quality answers.
### Recommendations
Continuing the fine-tuning with more data or a newer architecture is recommended.
## How to Get Started with the Model
- Recommended system message: "Olet avustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset."
- Recommended format for question about context: Tausta: "{context} \n\nKäytä vain taustaa ja vastaa kysymykseen tai tehtävään: {question}"
- Prompt format: `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`, where `messages` typically has the following format:
messages = [
{"role": "system", "content": system_message},
{"role": "user", "content": prompt_with_context}
]
Here is what the input could look like:
\<s><|im_start|>system
Olet avustaja. Seuraavaksi saat kysymyksen tai tehtävän. Kirjoita vastaus parhaasi mukaan siten että se täyttää kysymyksen tai tehtävän vaatimukset.<|im_end|>
<|im_start|>user
Tausta:
Dokumentti luotiin tammikuussa. Sen kirjoittajaa ei tunneta.
Käytä vain taustaa ja vastaa kysymykseen tai tehtävään: Milloin dokumentti kirjoitettiin?<|im_end|>
<|im_start|>assistant
Use a `transformers` pipeline with the `text-generation` task and the recommended format described above.
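The steps in this section can be sketched in Python as follows. This is a minimal sketch, not the author's reference code: the helper names `build_prompt_with_context`, `build_messages` and `generate_answer` are hypothetical, the exact whitespace around `Tausta:` is an assumption based on the example input, and `max_new_tokens=64` is an arbitrary illustrative value. The `transformers` import is deferred so that the message-building helpers can be used without loading the model.

```python
SYSTEM_MESSAGE = (
    "Olet avustaja. Seuraavaksi saat kysymyksen tai tehtävän. "
    "Kirjoita vastaus parhaasi mukaan siten että se täyttää "
    "kysymyksen tai tehtävän vaatimukset."
)


def build_prompt_with_context(context: str, question: str) -> str:
    # Assumption: line breaks follow the example input shown in this card.
    return (
        f"Tausta:\n{context}\n\n"
        f"Käytä vain taustaa ja vastaa kysymykseen tai tehtävään: {question}"
    )


def build_messages(context: str, question: str) -> list[dict]:
    # Same structure as the `messages` list described in this section.
    return [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": build_prompt_with_context(context, question)},
    ]


def generate_answer(
    context: str,
    question: str,
    model_id: str = "Futurice/gpt3-finnish-xl-instruct",
    max_new_tokens: int = 64,  # illustrative value, not a recommendation
) -> str:
    # Imported here so the helpers above work without transformers installed.
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    prompt = tokenizer.apply_chat_template(
        build_messages(context, question),
        tokenize=False,
        add_generation_prompt=True,
    )
    generator = pipeline("text-generation", model=model_id, tokenizer=tokenizer)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, return_full_text=False)
    return outputs[0]["generated_text"]
```

Calling `generate_answer("Dokumentti luotiin tammikuussa. Sen kirjoittajaa ei tunneta.", "Milloin dokumentti kirjoitettiin?")` downloads the model on first use, so expect it to need a GPU or a fair amount of memory.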
## Training Details
### Training Data
Trained on 40,000 random samples from the test data in [TurkuNLP/squad_v2_fi](https://huggingface.co/datasets/TurkuNLP/squad_v2_fi).
### Training Procedure
The base model was loaded in 4-bit precision and trained with supervised fine-tuning and LoRA.
#### Training Hyperparameters
- **Training regime:** 4-bit, batch size 4, max steps 20000, data collator for completion only
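"Data collator for completion only" means that the loss is computed only on the assistant's answer, not on the system message or the user prompt. The sketch below illustrates that idea in plain Python with toy token IDs. It is not the collator used in training: the function names and the toy IDs are hypothetical, and the only library fact relied on is that `-100` is the default label value ignored by PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def find_subsequence(sequence: list[int], pattern: list[int]) -> int:
    """Return the first index where `pattern` occurs in `sequence`, or -1."""
    for start in range(len(sequence) - len(pattern) + 1):
        if sequence[start:start + len(pattern)] == pattern:
            return start
    return -1


def mask_prompt_labels(input_ids: list[int], response_template_ids: list[int]) -> list[int]:
    """Build labels that keep only the tokens after the response template.

    `response_template_ids` stands for the tokenized marker that opens the
    assistant turn (hypothetical toy IDs in the example below).
    """
    start = find_subsequence(input_ids, response_template_ids)
    if start < 0:
        # Template not found: ignore the whole sample rather than train on the prompt.
        return [IGNORE_INDEX] * len(input_ids)
    completion_start = start + len(response_template_ids)
    return [IGNORE_INDEX] * completion_start + list(input_ids[completion_start:])


# Toy example: 5, 6, 7 are prompt tokens, 90, 91 mark the assistant turn,
# and 11, 12 are the answer tokens that should contribute to the loss.
labels = mask_prompt_labels([5, 6, 7, 90, 91, 11, 12], [90, 91])
print(labels)
```

Only the last two positions keep their token IDs as labels; every earlier position is masked and therefore contributes nothing to the gradient.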
## Evaluation
Evaluation has not been done properly yet.
### Testing Data, Factors & Metrics
#### Testing Data
Evaluated on 1,000 random samples from the test data in [TurkuNLP/squad_v2_fi](https://huggingface.co/datasets/TurkuNLP/squad_v2_fi).
#### Factors
Same factors as in SQuAD2.0.
#### Metrics
Loss.
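As a rough illustration of how a loss figure over an evaluation set is usually aggregated, the sketch below averages per-batch mean losses weighted by the number of scored tokens and converts the result to perplexity. This is an assumption about a common convention, not a description of the evaluation script used for this model; the function names and the numbers in the example are hypothetical.

```python
import math


def token_weighted_mean_loss(batch_losses: list[float], batch_token_counts: list[int]) -> float:
    """Average per-batch mean losses, weighting each batch by its scored tokens."""
    if len(batch_losses) != len(batch_token_counts):
        raise ValueError("batch_losses and batch_token_counts must have the same length")
    total_tokens = sum(batch_token_counts)
    if total_tokens == 0:
        raise ValueError("no tokens were scored")
    weighted_sum = sum(loss * count for loss, count in zip(batch_losses, batch_token_counts))
    return weighted_sum / total_tokens


def perplexity(mean_loss: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy loss (in nats)."""
    return math.exp(mean_loss)


# Hypothetical numbers, for illustration only.
loss = token_weighted_mean_loss([2.0, 1.0], [10, 30])
print(loss, perplexity(loss))
```

Weighting by token count matters when completions differ in length, because a plain average over batches would over-represent short answers.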
### Results
No results to be shared yet.
#### Summary
## Environmental Impact
Environmental impact not yet evaluated.
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** Mostly trained on A100
- **Hours used:** 5-10 hours
- **Cloud Provider:** GCP
- **Compute Region:** Unknown
- **Carbon Emitted:** Not evaluated
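Until a proper estimate is made, a back-of-the-envelope figure can be sketched as energy used multiplied by the carbon intensity of the grid. The code below is only such a sketch: the 0.4 kW GPU power draw is an assumption (roughly the board power of an A100, ignoring CPU, cooling and idle time), the function name is hypothetical, and the carbon intensity must be supplied by the reader because the compute region is unknown.

```python
def estimate_emissions_kg(
    power_kw: float,
    hours: float,
    carbon_intensity_kg_per_kwh: float,
    pue: float = 1.0,  # data-centre overhead factor; 1.0 means no overhead
) -> float:
    """Estimate emissions as power * time * overhead * grid carbon intensity."""
    energy_kwh = power_kw * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


ASSUMED_GPU_POWER_KW = 0.4  # assumption: about 400 W for one A100

# Energy range for the 5-10 hours stated above, before applying any
# region-specific carbon intensity.
for hours in (5, 10):
    energy_kwh = ASSUMED_GPU_POWER_KW * hours
    print(f"{hours} h at {ASSUMED_GPU_POWER_KW} kW: {energy_kwh:.1f} kWh")
```

Multiplying the resulting energy by the carbon intensity of the actual region, once known, gives the emissions in kg CO2-equivalent.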
### Model Architecture and Objective
BLOOM.
### Compute Infrastructure
Colab.
#### Hardware
1 x A100.
#### Software
Typical software used.
## Model Card Contact
Martti Sutinen