Checkpoint for the paper MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain.
The paper was accepted to the EMNLP 2024 main conference as an oral presentation and is available on arXiv.
This is the best-performing medical sentence readability model trained on our dataset. The checkpoint is a standard Hugging Face sequence classification model with a regression head for sentence-level prediction.
Please see our repo for more details.
Quickstart for the medical sentence readability model
```python
# pip install transformers==4.35.2 torch --upgrade
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "chaojiang06/medreadme_medical_sentence_readability_prediction_CWI"
MAX_LEN = 512

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
)
model.eval()


def score_sentences(sentences):
    """Return one readability score per input sentence."""
    enc = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=MAX_LEN,
        return_tensors="pt",
    )
    with torch.no_grad():
        out = model(**enc).logits.squeeze(-1)  # regression head, shape: [batch]
    return out.tolist()


print(score_sentences([
    "Take one tablet by mouth twice daily after meals.",
    "The pathophysiological sequelae of dyslipidemia necessitate...",
]))
```
The sections below were automatically generated by the Hugging Face library.
roberta-large+cwi.py+512+8+1e-5+1
This model is a fine-tuned version of roberta-large on the cwi dataset. It achieves the following results on the evaluation set (a sketch for computing these correlation metrics follows the list):
- Loss: 0.2137
- Pearsonr: 0.8429
- Addition Pearsonr: 0.8429
- Addition Pearsonr Pvalue: 0.0000
- Addition Spearmanr: 0.8297
- Addition Spearmanr Pvalue: 0.0000
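Correlation metrics like these can be computed from model predictions and gold readability scores with scipy. The sketch below is illustrative only (the CSV path and column names are assumptions, not part of the released evaluation code) and reuses `score_sentences` from the quickstart above:

```python
# Minimal sketch for computing Pearson/Spearman correlations on an eval set.
# ASSUMPTION: a CSV with "sentence" and "gold_score" columns; this is not the
# authors' evaluation script, just an illustration of how the metrics are computed.
import csv

from scipy.stats import pearsonr, spearmanr


def evaluate(csv_path="eval.csv"):
    sentences, gold = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            sentences.append(row["sentence"])
            gold.append(float(row["gold_score"]))
    preds = score_sentences(sentences)  # quickstart helper defined above
    r, r_pvalue = pearsonr(preds, gold)
    rho, rho_pvalue = spearmanr(preds, gold)
    print(f"Pearson r = {r:.4f} (p = {r_pvalue:.4f})")
    print(f"Spearman rho = {rho:.4f} (p = {rho_pvalue:.4f})")
```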
Training hyperparameters
The following hyperparameters were used during training (a TrainingArguments sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 1
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
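For reference, here is a minimal sketch of how the hyperparameters above map onto Hugging Face `TrainingArguments`. It is an illustration, not the authors' training script; the `output_dir` name is hypothetical.

```python
# Illustrative mapping of the listed hyperparameters to TrainingArguments.
# ASSUMPTION: not the authors' training script; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="medreadme_cwi_regression",  # hypothetical output directory
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=1,
    num_train_epochs=10.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```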
Framework versions
- Transformers 4.35.2
- Pytorch 2.1.1+cu121
- Datasets 2.15.0
- Tokenizers 0.14.1
Base model: FacebookAI/roberta-large