---
license: apache-2.0
datasets:
  - Allanatrix/Scientific_Research_Tokenized
language:
  - en
base_model:
  - allenai/OLMo-7B
pipeline_tag: text-generation
library_name: peft
tags:
  - Olmo
  - lora
  - peft
  - transformers
  - scientific-ml
  - fine-tuned
  - research-assistant
  - hypothesis-generation
  - scientific-writing
  - scientific-reasoning
---

# Model Card for nexa-OLMo-sci7b

## Model Details

### Model Description

nexa-OLMo-sci7b is a fine-tuned variant of allenai/OLMo-7B, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed with PEFT (LoRA) on a 4-bit quantized base model via bitsandbytes.

- **Developed by:** Allan (Independent Scientific Intelligence Architect)
- **Shared by:** Allan (https://huggingface.co/allan-wandia)
- **Model type:** Decoder-only transformer (causal language model)
- **Language(s):** English (scientific domain-specific vocabulary)
- **License:** Apache 2.0
- **Fine-tuned from:** allenai/OLMo-7B
- **Repository:** https://huggingface.co/allan-wandia/nexa-olmo-sci7b
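
Because the card describes a LoRA adapter over a 4-bit quantized base, loading the model takes two steps: quantize the base, then attach the adapter. The sketch below is a minimal, unofficial example, not the authors' published usage code; it assumes the adapter weights are hosted at the repository above, and the original allenai/OLMo-7B checkpoint may additionally require `trust_remote_code=True` (as used here).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_ID = "allenai/OLMo-7B"
ADAPTER_ID = "allan-wandia/nexa-olmo-sci7b"  # repository linked above (assumed to hold the adapter)

# 4-bit quantization, mirroring the fine-tuning setup described in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_ID, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # original OLMo checkpoints ship custom modeling code
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base, ADAPTER_ID)

prompt = "Generate a hypothesis about the role of magnetic fields in star formation:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```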

## Training Details

### Training Data

- **Size:** 100 million tokens
- **Source:** Curated scientific literature (Bio, Physics, QST, Astro)
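
The metadata above names Allanatrix/Scientific_Research_Tokenized as the training corpus. A minimal sketch for inspecting it with the `datasets` library follows; the split name and column layout are assumptions, so consult the dataset card for the actual schema.

```python
from datasets import load_dataset

# Load the corpus listed in this card's metadata. "train" is an assumed split name.
ds = load_dataset("Allanatrix/Scientific_Research_Tokenized", split="train")
print(ds)      # dataset size and feature schema
print(ds[0])   # first example
```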

### Hyperparameters

- Sequence length: 1024
- Batch size: 1
- Gradient accumulation steps: 64
- Effective batch size: 64 (1 × 64 accumulation steps)
- Learning rate: 2e-05
- Epochs: 2
- LoRA: enabled (PEFT)
- Quantization: 4-bit
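
The list above maps directly onto a PEFT + bitsandbytes training configuration. The sketch below shows one way to express it; the LoRA rank, alpha, dropout, and target modules are not stated in this card, so those values are illustrative placeholders, and the data pipeline (tokenization to 1024-token sequences) is omitted.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantized base model, as stated in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
model = prepare_model_for_kbit_training(model)

# ASSUMPTION: rank/alpha/dropout are common defaults, not values from this card.
# OLMo's custom architecture may also require setting target_modules explicitly.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Values from the hyperparameter list: per-device batch 1, 64 accumulation
# steps (effective batch 64), learning rate 2e-5, 2 epochs.
args = TrainingArguments(
    output_dir="nexa-olmo-sci7b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    learning_rate=2e-5,
    num_train_epochs=2,
    bf16=True,
    logging_steps=50,
)
# A Trainer wired to the tokenized dataset would complete the training loop.
```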

### Results

The model performs robustly on scientific prose tasks; the novelty of its generations varies with the diversity of the prompts supplied.