---
library_name: transformers
tags:
  - aphasia
  - text-normalization
  - seq2seq
  - nlp
---

# Model Card for Aphasia Text Normalization

This is a fine-tuned model that normalizes aphasic speech patterns into standard English, supporting clearer communication for individuals with speech difficulties.

## Model Details

### Model Description

- **Developed by:** Leif Rogers
- **Shared by:** Leif Rogers
- **Model type:** Seq2Seq Language Model
- **Language(s):** English (EN)
- **License:** Apache 2.0
- **Finetuned from:** T5-Small

The model was fine-tuned on a synthetic dataset generated to mimic aphasic speech patterns and their normalized counterparts. It is intended for applications in assistive technologies to aid individuals with speech impairments.
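The synthetic dataset itself is not published with this card, but training pairs for a T5-style seq2seq model typically look like the sketch below. The `normalize:` task prefix and the specific sentence pairs here are assumptions for illustration, not samples from the actual training data.

```python
# Illustrative (hypothetical) training pairs in the input/target format
# commonly used when fine-tuning T5 for text normalization.
examples = [
    {"input": "normalize: Want go food need",
     "target": "I want to go get some food."},
    {"input": "normalize: He store walk yesterday",
     "target": "He walked to the store yesterday."},
]

for ex in examples:
    print(ex["input"], "->", ex["target"])
```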

### Model Sources

- **Repository:** GitHub Repo
- **Paper:** Not applicable
- **Demo:** Not available yet

## Uses

### Direct Use

The model can be used directly for text normalization tasks to convert aphasic speech into standard English.

### Downstream Use

Potential downstream uses include integration into assistive communication applications, healthcare tools, or educational resources for speech therapy.

### Out-of-Scope Use

The model is not designed for:

- Speech-to-text conversion
- Non-English languages
- Malicious applications (e.g., creating misleading outputs)

## Bias, Risks, and Limitations

### Bias

The model was trained on synthetic data, which may not represent real-world variations in aphasic speech patterns. It could produce biased outputs for certain dialects or speech patterns.

### Risks

- Overgeneralization: the model may "normalize" input that was already intentional or grammatical
- Misinterpretation of ambiguous input phrases

### Recommendations

Users should evaluate the model’s performance in their specific use cases before deployment and provide manual oversight where necessary.
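One lightweight way to perform that evaluation is to score model outputs against reference normalizations on a held-out set. A minimal sketch follows; the metric, the example pairs, and their scores are illustrative assumptions, not results reported for this model.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference,
    after lowercasing and collapsing whitespace."""
    norm = lambda s: " ".join(s.lower().split())
    matches = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return matches / len(references)

# Hypothetical (model output, expected normalization) pairs
preds = ["I want to go get food.", "He walk store yesterday"]
refs = ["I want to go get food.", "He walked to the store yesterday."]
print(exact_match_accuracy(preds, refs))  # 0.5
```

Exact match is a strict criterion; for aphasia normalization, where several fluent rewrites may be acceptable, a softer metric (e.g., BLEU or human rating) alongside manual review is often more informative.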

## How to Get Started with the Model

Use the following code to load and run the model:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "leifsternyc/aphasia-t5-normalization"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example usage
input_text = "Want go food need"
inputs = tokenizer(input_text, return_tensors="pt")
# max_new_tokens caps the generated length so longer inputs are not truncated
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```