root
committed on
Commit · 7ac3947
1 Parent(s): d73d642
Upload Neeto-1.0 8B model
- .gitattributes +1 -0
- README.md.save +0 -138
.gitattributes
CHANGED
@@ -33,5 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.json filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
 *.jpg filter=lfs diff=lfs merge=lfs -text
README.md.save
DELETED
@@ -1,138 +0,0 @@
----
-license: cc-by-nc-4.0
-language:
-- en
-library_name: transformers
-pipeline_tag: text-generation
-tags:
-- Text Generation
-- medical
-- fine-tuned
-- biomedical
-- Safetensors
-- transformers
-- BYOL-Academy
-datasets:
-- openlifescienceai/medmcqa
-- GBaker/MedQA-USMLE-4-options-hf
-- S4nfs/byolbane
-- S4nfs/Medicoplasma
----
-
-# Neeto-1.0-8b - A Specialized Medical LLM for NEET-PG/UKMLE/USMLE preparation
-
-
-Neeto-1.0-8b is an openly released biomedical large language model (LLM) created by [BYOL Academy](https://byolacademy.com) to assist learners and practitioners with medical exam study, literature understanding, and structured clinical reasoning.
-
-The model was adapted on a curated mixture (≈410K items) blending synthetic generations and hand-audited instructional / multiple‑choice / rationale samples. The objective was balanced: retain broad linguistic competence while strengthening factual recall, differential diagnostics framing, and question dissection for exams such as NEET‑PG, UKMLE, and USMLE.
-
-Across widely used evaluation suites (MedQA, MedMCQA, PubMedQA, MMLU medical subsets), Neeto‑1.0‑8b attains strong 7B‑class results. Public benchmark numbers (table below) show it standing ahead of several prior open biomedical baselines of similar scale. The model will be used on our platform [Medicoplasma](https://medicoplasma.com) as for exam preparation and powering medical applications.
-
-## How to Use
-
-The model follows the default Llama‑3 chat message formatting (no explicit system prompt required). Provide a single user turn containing the question or case vignette; the model returns an answer (option selection, rationale, or free-form explanation depending on the prompt style).
-
-Below are illustrative input patterns for multi‑choice items (MedQA / MedMCQA), PubMedQA‑style reasoning, and open clinical queries. For reproducibility of benchmark-style MCQ evaluation, keep choices clearly enumerated (A./B./C./D.) and avoid extra prose.
-
-### Example (MedQA / MedMCQA style)
-
-```
-A 55-year-old male presents with sudden onset of severe unilateral flank pain radiating to the groin, accompanied by hematuria. Imaging reveals a calculus in the proximal ureter. Given the high prevalence of anatomical variations in the renal arteries and their proximity to the ureters, what is the primary clinical concern regarding surgical or interventional management of this patient's ureteral calculus, and which specific anatomical variation would most significantly complicate access or increase the risk of iatrogenic injury?
-
-A. Aberrant accessory renal artery crossing the ureter, causing obstruction and risk of vascular injury during intervention.
-B. Early bifurcation of the main renal artery within the hilum, increasing the risk of ureteral devascularization.
-C. Dual renal veins draining into the inferior vena cava, raising concern for venous congestion during stone removal.
-D. Persistent fetal renal lobulations that distort the renal pelvis and complicate stent placement.
-```
-
-### Inference with vLLM
-
-```python
-from transformers import AutoTokenizer
-from vllm import LLM, SamplingParams
-
-llm = LLM(model="S4nfs/Neeto-1.0-8b", trust_remote_code=True)
-tokenizer = AutoTokenizer.from_pretrained("S4nfs/Neeto-1.0-8b")
-sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024, stop=["<|eot_id|>"])
-
-messages = [
-{"role": "user", "content": """The question format used in the above input examples。"""},
-]
-prompts = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-print(prompts[0])
-"""
-<|begin_of_text|><|start_header_id|>user<|end_header_id|>
-
-{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-
-"""
-
-outputs = llm.generate(prompts=prompts, sampling_params=sampling_params)
-print(outputs[0].outputs[0].text)
-```
-
-Note: Current release is optimized for single‑turn exchanges. Multi‑turn conversational coherence will be improved in an upcoming iteration.
-
-## Benchmark with Major Models
-
-
-
-## Benchmark Snapshot Among Similar Size Models
-
-Neeto‑1.0‑8b delivers the following published scores:
-
-| Released Date | Model | Average | MedQA | MedMCQA | PubMedQA | MMLU.ck | MMLU.mg | MMLU.an | MMLU.pm | MMLU.cb | MMLU.cm |
-| :-----------: | :-----------------: | :-----: | :---: | :-----: | :------: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
-| 2025.08 | **Neeto-1.0-8b** | 87.87 | 87.8 | 66.2 | 79.0 | 79.4 | 90.1 | 79.1 | 95.6 | 81.4 | 78.6 |
-| 2024.04 | OpenBioLM-8B | 72.48 | 59.0 | 56.9 | 74.1 | 76.1 | 86.1 | 69.8 | 78.2 | 84.2 | 68.0 |
-| 2024.04 | Llama-3-8B-Instruct | 71.23 | 62.4 | 56.5 | 75.8 | 72.5 | 84.0 | 71.1 | 70.6 | 80.6 | 67.6 |
-| 2024.04 | Internist-7B | 67.79 | 60.5 | 55.8 | 79.4 | 70.6 | 71.0 | 65.9 | 76.1 | - | 63.0 |
-| 2024.02 | Gemma-7B | 64.18 | 47.2 | 49.0 | 76.2 | 69.8 | 70.0 | 59.3 | 66.2 | 79.9 | 60.1 |
-| 2024.03 | Meerkat-7B | 63.94 | 74.3 | 60.7 | - | 61.9 | 70.4 | 61.5 | 69.5 | 55.4 | 57.8 |
-| 2023.03 | MedAlpaca | 58.03 | 41.7 | 37.5 | 72.8 | 57.4 | 69.0 | 57.0 | 67.3 | 65.3 | 54.3 |
-| 2024.02 | BioMistral-7B | 57.26 | 46.6 | 45.7 | 68.1 | 63.1 | 63.3 | 49.9 | 57.4 | 63.4 | 57.8 |
-
-Interpretation & Methodology:
-
-- MedQA uses the US 4‑option subset; MedMCQA uses the Dev split; PubMedQA reflects the “reasoning required” subset.
-- MMLU medical grouping here incorporates: Clinical Knowledge (CK), Medical Genetics (MG), Anatomy (An), Professional Medicine (PM), College Biology (CB), College Medicine (CM).
-- Greedy decoding was the baseline; ensemble self‑consistency scores (not shown) were generated via 10 samples (temperature 0.7, top_p 0.9) with majority voting.
-- Comparative baselines partially sourced from the public Open Medical‑LLM Leaderboard.
-
-## Training Configuration
-
-Full‑parameter supervised fine‑tuning was executed under Fully Sharded Data Parallel (FSDP). Hardware: 8 × H200 GPUs (~29 hours wall time).
-
-Hyperparameters:
-
-- torch type: bfloat16
-- epochs: 3
-- learning rate: 2e-5
-- learning rate scheduler type: cosine
-- warmup ratio: 0.04
-- max length: 1024
-- global batch size: 128
-
-## Limitations & Responsible Use
-
-Despite strong benchmark standing, the model can hallucinate mechanistic explanations, mis-rank differential diagnoses, or fabricate citations. It must not be used for autonomous clinical decision-making, patient triage, prescribing, or emergency guidance. Human expert verification is mandatory before any medical action.
-
-## Planned Enhancements
-
-- Preference optimization (DPO) variants on forthcoming Llama releases.
-- Expansion to JEE Advanced and NEET‑UG aligned scientific subject packs.
-- Multi-turn dialogue memory and structured rationale modes.
-- Integration within MedicoPlasma’s chat interface.
-
-## Citation
-
-```latex
-@misc{Neeto-1.0-8b,
-author = {Sagar Verma},
-title = {NEETO: A Specialized Medical LLM for NEET-PG/UKMLE/USMLE preparation},
-year = {2025},
-publisher = {GitHub},
-journal = {GitHub repository},
-note = {\url{https://huggingface.co/S4nfs/Neeto-1.0-8b}},
-}
-```