---
model-index:
- name: medieval-it5-base
  results: []
language:
- it
---

# medieval-it5-base

This model is a version of [gsarti/it5-base](https://huggingface.co/gsarti/it5-base) fine-tuned on the [ita2medieval](https://huggingface.co/datasets/leobertolazzi/ita2medieval) dataset. The dataset contains sentences in medieval Italian paired with paraphrases in contemporary Italian (approximately 6.5k pairs in total).

The fine-tuning task is text style transfer from contemporary to medieval Italian.


## Using the model

```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("leobertolazzi/medieval-it5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("leobertolazzi/medieval-it5-base")
```
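
Once the tokenizer and model are loaded, a sentence in contemporary Italian can be passed through `generate` to obtain a medieval-style rewrite. The following is a minimal sketch, not the author's reference inference code: the helper names `load_model` and `to_medieval`, the decoding settings (`num_beams=4`, `max_new_tokens=64`), and the absence of any task prefix on the input are all assumptions made for illustration. It also assumes a PyTorch installation.

```python
def load_model(name="leobertolazzi/medieval-it5-base"):
    """Load tokenizer and model from the Hub (hypothetical helper)."""
    # Imported lazily so the rest of the module works without transformers.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    return tokenizer, model


def to_medieval(sentences, tokenizer, model, max_new_tokens=64, num_beams=4):
    """Rewrite a list of contemporary Italian sentences in medieval style.

    The decoding settings are illustrative defaults, not tuned values.
    """
    # Pad so that sentences of different lengths fit in one batch.
    inputs = tokenizer(
        sentences, return_tensors="pt", padding=True, truncation=True
    )
    # Beam search decoding over the encoded batch.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    # Drop padding and end-of-sequence tokens from the decoded text.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

For example, `to_medieval(["Oggi il cielo è sereno."], *load_model())` returns a list with one rewritten sentence. If the outputs look poor, it may be worth checking the training repository for the exact input format used during fine-tuning.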

Flax and TensorFlow versions of the model are also available:
```
from transformers import FlaxT5ForConditionalGeneration, TFT5ForConditionalGeneration
model_flax = FlaxT5ForConditionalGeneration.from_pretrained("leobertolazzi/medieval-it5-base")
model_tf = TFT5ForConditionalGeneration.from_pretrained("leobertolazzi/medieval-it5-base")
```

## Training procedure

The code used for fine-tuning is available in this [repo](https://github.com/leobertolazzi/medievalIT5).

## Intended uses & limitations

The biggest limitation of this project is the size of the ita2medieval dataset: it contains only 6.5K sentence pairs, whereas [gsarti/it5-base](https://huggingface.co/gsarti/it5-base) has 220M parameters.

For this reason, the results can be far from perfect, but some nice style translations can still be obtained.

It would be nice to expand ita2medieval with text and paraphrases from more medieval Italian authors!
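
To put the size mismatch in numbers, the short calculation below divides the parameter count by the number of training pairs, using the rounded figures quoted above (6.5K pairs, 220M parameters). The ratio is only a rough indicator of how little supervision each parameter receives, not a formal measure of data sufficiency.

```python
# Rounded figures taken from this model card.
n_pairs = 6_500          # sentence pairs in ita2medieval
n_params = 220_000_000   # parameters in gsarti/it5-base

# Parameters available per training pair.
params_per_pair = n_params / n_pairs
print(f"{params_per_pair:,.0f} parameters per training pair")
```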

### Framework versions

- Transformers 4.26.0
- Tokenizers 0.13.2