---
library_name: transformers
tags:
- aphasia
- text-normalization
- seq2seq
- nlp
---

# Model Card for Aphasia Text Normalization

This is a fine-tuned model designed to normalize aphasic speech patterns into standard English, providing better communication capabilities for individuals with speech difficulties.

## Model Details

### Model Description

- **Developed by:** Leif Rogers
- **Shared by:** Leif Rogers
- **Model type:** Seq2Seq Language Model
- **Language(s):** English (EN)
- **License:** Apache 2.0
- **Finetuned from:** T5-Small

The model was fine-tuned on a synthetic dataset generated to mimic aphasic speech patterns and their normalized counterparts. It is intended for applications in assistive technologies to aid individuals with speech impairments.

### Model Sources

- **Repository:** [GitHub Repo](https://github.com/leifsternyc/aphasiamodels)
- **Paper:** Not applicable
- **Demo:** Not available yet

## Uses

### Direct Use

The model can be used directly for text normalization tasks to convert aphasic speech into standard English.

### Downstream Use

Potential downstream uses include integration into assistive communication applications, healthcare tools, or educational resources for speech therapy.

### Out-of-Scope Use

The model is not designed for:

- Speech-to-text conversion
- Non-English languages
- Malicious applications (e.g., creating misleading outputs)

## Bias, Risks, and Limitations

### Bias

The model was trained on synthetic data, which may not represent real-world variations in aphasic speech patterns. It could produce biased outputs for certain dialects or speech patterns.

### Risks

- Overgeneralization of input
- Misinterpretation of ambiguous input phrases

### Recommendations

Users should evaluate the model's performance in their specific use cases before deployment and provide manual oversight where necessary.
## How to Get Started with the Model

Use the following code to load and use the model:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "leifsternyc/aphasia-t5-normalization"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example usage
input_text = "Want go food need"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
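The Recommendations above suggest evaluating the model on your own data before deployment. Below is a minimal sketch of such a check: the `exact_match_rate` helper and the example pairs are illustrative assumptions (they are not from the model's training set), and the identity function stands in for the real model call so the sketch runs without downloading weights.

```python
from typing import Callable, List, Tuple

def exact_match_rate(pairs: List[Tuple[str, str]],
                     normalize: Callable[[str], str]) -> float:
    """Fraction of inputs whose normalized form exactly matches the reference."""
    hits = sum(1 for src, ref in pairs if normalize(src).strip() == ref.strip())
    return hits / len(pairs)

# Hypothetical (input, reference) pairs for your own evaluation set.
eval_pairs = [
    ("Want go food need", "I want to go get food."),
    ("Water me please", "Please give me some water."),
]

# Replace the lambda with a function that runs tokenizer + model.generate
# and decodes the output; the identity stub is only a placeholder.
score = exact_match_rate(eval_pairs, lambda s: s)
print(f"Exact-match rate: {score:.2f}")
```

Exact match is a strict metric; depending on your use case, a softer measure such as token-level overlap or human review of a sample may be more informative.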