---
license: apache-2.0
datasets:
- Allanatrix/Scientific_Research_Tokenized
language:
- en
base_model:
- allenai/OLMo-7B
pipeline_tag: text-generation
library_name: peft
tags:
- Olmo
- lora
- peft
- transformers
- scientific-ml
- fine-tuned
- research-assistant
- hypothesis-generation
- scientific-writing
- scientific-reasoning
---

# Model Card for nexa-OLMo-sci7b

## Model Details

**Model Description:**

nexa-OLMo-sci7b is a fine-tuned variant of allenai/OLMo-7B, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed with LoRA (via PEFT) on a 4-bit quantized base model using bitsandbytes.

**Developed by:** Allan (Independent Scientific Intelligence Architect)

**Shared by:** Allan (https://huggingface.co/allan-wandia)

**Model type:** Decoder-only transformer (causal language model)

**Language(s):** English (scientific domain-specific vocabulary)

**License:** Apache 2.0

**Fine-tuned from:** allenai/OLMo-7B

**Repository:** https://huggingface.co/allan-wandia/nexa-olmo-sci7b
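
A minimal inference sketch, assuming the LoRA adapter is hosted at the repository above and is loaded on top of the 4-bit quantized base model; the prompt and generation settings are illustrative, not recommended defaults:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_ID = "allenai/OLMo-7B"
ADAPTER_ID = "allan-wandia/nexa-olmo-sci7b"

# 4-bit quantization via bitsandbytes, matching the setup described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Note: older transformers releases load OLMo through trust_remote_code;
# recent releases support the architecture natively.
tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()

prompt = "Propose a testable hypothesis on quantum coherence in photosynthetic energy transfer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```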

## Training Details

**Training Data:**

- Size: 100 million tokens
- Source: Curated scientific literature (Bio, Physics, QST, Astro)

**Hyperparameters** (see the configuration sketch after this list):

- Sequence length: 1024
- Batch size: 1
- Gradient accumulation steps: 64
- Effective batch size: 64
- Learning rate: 2e-5
- Epochs: 2
- LoRA: enabled (PEFT)
- Quantization: 4-bit
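
A hedged sketch of how the hyperparameters above could be expressed with `transformers` and `peft`; the actual training script is not published, and the LoRA rank, alpha, and target modules shown here are assumptions rather than recorded values:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter configuration; rank, alpha, and targets are illustrative assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Mirrors the listed hyperparameters: batch size 1 x 64 accumulation steps
# = effective batch size 64, lr 2e-5, 2 epochs. The 1024-token sequence
# length is applied when tokenizing the dataset.
training_args = TrainingArguments(
    output_dir="nexa-olmo-sci7b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    learning_rate=2e-5,
    num_train_epochs=2,
    bf16=True,
    logging_steps=50,
)
# A transformers Trainer (or trl SFTTrainer) would then consume these
# arguments together with the tokenized dataset.
```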
|
|
|
|
|
**Results:** |
|
|
Robust performance in scientific prose tasks, with novelty varying by prompt diversity. |