---
language:
- de
license: gpl
---
# Diversiformer
_Work in progress._
A language model for inclusive language in German, fine-tuned from [mT5](https://arxiv.org/abs/2010.11934).
An experimental version of the model is available [on Hugging Face](https://huggingface.co/diversifix/diversiformer).
## Tasks
- **DETECT**: Recognize instances of the generic masculine and of other exclusive language. To do.
- **SUGGEST**: Suggest inclusive alternatives to masculine and exclusive words. To do.
- **REPLACE**: Replace one phrase with another while preserving grammatical coherence. Work in progress.
  - ▶️ `Ersetze "Schüler" durch "Schülerin oder Schüler": Die Schüler kamen zu spät.`
    ◀️ `Die Schülerinnen und Schüler kamen zu spät.`
  - ▶️ `Ersetze "Lehrer" durch "Kollegium": Die wartenden Lehrer wunderten sich.`
    ◀️ `Das wartende Kollegium wunderte sich.`
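The two REPLACE examples above share one prompt pattern: `Ersetze "<old>" durch "<new>": <sentence>`. The sketch below builds such a prompt from its three parts. The helper name `build_replace_prompt` is hypothetical and not part of this repository, and the template is inferred from the two examples only, so the model may be sensitive to variations that are not covered here (for instance, phrases that themselves contain double quotes).

```python
def build_replace_prompt(old: str, new: str, sentence: str) -> str:
    """Format a REPLACE prompt following the pattern of the examples above.

    Hypothetical helper: the template is inferred from the README examples,
    not taken from the training code.
    """
    return f'Ersetze "{old}" durch "{new}": {sentence}'


prompt = build_replace_prompt(
    "Schüler", "Schülerin oder Schüler", "Die Schüler kamen zu spät."
)
print(prompt)
```

The resulting string can be passed as the prompt to the model in the same way as the hard-coded prompts in the Usage section.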
## Usage
```python
from transformers import T5Tokenizer, TFT5ForConditionalGeneration

# Tokenizer from the mT5 base model, weights from the fine-tuned Diversiformer.
tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")
model = TFT5ForConditionalGeneration.from_pretrained("diversifix/diversiformer")


def generate(prompt, tokenizer, model):
    """Run a single prompt through the model and return the decoded text."""
    tokenized_text = tokenizer.encode(prompt, return_tensors="tf")
    ids = model.generate(tokenized_text, max_length=500)
    output = tokenizer.decode(ids[0], skip_special_tokens=True)
    return output


prompts = [
    'Ersetze "Schüler" durch "Schülerin oder Schüler": Die Schüler kamen zu spät.',
    'Ersetze "Lehrer" durch "Kollegium": Die wartenden Lehrer wunderten sich.',
]

for prompt in prompts:
    output = generate(prompt, tokenizer, model)
    print(f"{prompt}\n{output}\n\n")
```
## License
Diversiformer. Transformer model for inclusive language.
Copyright (C) 2022 Diversifix e. V.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.