Formal Language T5 Model
This model is T5-base fine-tuned to rewrite informal English text in a formal register and to correct grammar.
Model Description
- Model Type: T5-base fine-tuned
- Language: English
- Task: Text Formalization and Grammar Correction
- License: Apache 2.0
- Base Model: t5-base
Intended Uses & Limitations
Intended Uses
- Converting informal text to formal language
- Improving text professionalism
- Grammar correction
- Business communication enhancement
- Academic writing improvement
Limitations
- Works best with English text
- Maximum input length: 128 tokens
- May not preserve domain-specific terminology
- Best suited for business and academic contexts
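Because inputs are limited to 128 tokens, longer text has to be shortened or split before it is passed to the model. The sketch below shows one possible way to group sentences into chunks that fit a token budget. It is an illustration, not part of the released model code: `split_by_budget` and its `count_tokens` parameter are hypothetical names, and the whitespace word count in the usage line stands in for a real tokenizer length.

```python
import re
from typing import Callable, List


def split_by_budget(
    text: str, count_tokens: Callable[[str], int], budget: int = 128
) -> List[str]:
    """Group sentences into chunks whose token count stays within `budget`.

    A single sentence longer than `budget` is kept as its own chunk and
    would still need truncation downstream.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > budget:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Assumption: whitespace word count as a stand-in for a tokenizer.
word_count = lambda s: len(s.split())
print(split_by_budget("one two three. four five. six seven eight nine.", word_count, budget=5))
```

In practice the budget should sit somewhat below 128 to leave room for the task prefix and the end-of-sequence token.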
Usage
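from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSeq2SeqLM.from_pretrained("renix-codex/formal-lang-rxcx-model")
tokenizer = AutoTokenizer.from_pretrained("renix-codex/formal-lang-rxcx-model")
# Example usage: the input must start with the "make formal: " prefix
text = "make formal: hey whats up"
# Truncate to the 128-token maximum input length
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
# Allow enough new tokens that the output is not cut off by the default length limit
outputs = model.generate(**inputs, max_new_tokens=128)
formal_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(formal_text)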
Example Inputs and Outputs
| Informal Input | Formal Output |
|---|---|
| "hey whats up" | "Hello, how are you?" |
| "gonna be late for meeting" | "I will be late for the meeting." |
| "this is kinda cool" | "This is quite impressive." |
Training
The model was trained on the Grammarly/COEDIT dataset with the following specifications:
- Base Model: T5-base
- Training Hardware: A100 GPU
- Sequence Length: 128 tokens
- Input Format: "make formal: [informal text]"
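Since the model expects every input to begin with the "make formal: " prefix, it can help to build prompts in one place. The following is a minimal sketch under that assumption; `build_prompt` and `PREFIX` are hypothetical names introduced here, not part of the model's released code.

```python
PREFIX = "make formal: "  # task prefix from the input format listed above


def build_prompt(text: str) -> str:
    """Prepend the task prefix exactly once, normalizing whitespace."""
    # Collapse runs of whitespace and trim both ends.
    cleaned = " ".join(text.split())
    bare_prefix = PREFIX.strip()
    if cleaned.lower().startswith(bare_prefix.lower()):
        # Already prefixed: drop it so it is re-added in canonical form.
        cleaned = cleaned[len(bare_prefix):].lstrip()
    return PREFIX + cleaned


print(build_prompt("  hey whats up "))
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.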
License
Apache License 2.0
Citation
@misc{formal-lang-rxcx-model,
author = {renix-codex},
title = {Formal Language T5 Model},
year = {2024},
publisher = {HuggingFace},
journal = {HuggingFace Model Hub},
url = {https://huggingface.co/renix-codex/formal-lang-rxcx-model}
}
Developer
Model developed by renix-codex
Ethical Considerations
This model is intended to assist in formal writing while maintaining the original meaning of the text. Users should be aware that:
- The model may alter the tone of personal or culturally specific expressions
- It should be used as a writing aid rather than a replacement for human judgment
- The output should be reviewed for accuracy and appropriateness
Updates and Versions
Initial Release - February 2024
- Base implementation with T5-base
- Trained on Grammarly/COEDIT dataset
- Optimized for formal language conversion
Evaluation results
- training_loss on grammarly/coedit (self-reported): 2.100
- rouge1 on grammarly/coedit (self-reported): 0.850
- accuracy on grammarly/coedit (self-reported): 0.820
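For readers unfamiliar with the rouge1 metric, the sketch below computes a simple unigram-overlap F1 score between a reference and a model output. This is only an illustration: the card does not state which ROUGE-1 variant (precision, recall, or F1) or which implementation produced the reported value, and `rouge1_f1` is a hypothetical helper using plain whitespace tokenization.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between two strings (lowercased, whitespace tokens)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("I will be late for the meeting.", "I will be late for meeting.")
print(round(score, 3))
```

Established evaluation packages apply their own tokenization and optional stemming, so their scores can differ from this simplified version.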