---
language: en
license: mit
tags:
  - ai
  - quantized
  - oil-and-gas
  - engineering
  - mixtral
  - causal-language-model
  - 8-bit
library_name: transformers
pipeline_tag: text-generation
datasets:
  - custom-oil-and-gas-dataset
model_type: mixtral
model_name: OGAI-8x7B-8bit-32k
inference:
  cloud_resources:
    recommended_resources:
      gpu_memory: 24GB+
      system_ram: 32GB+
    minimum_resources:
      gpu_memory: 16GB
      system_ram: 32GB
is_quantized: true
quantization_config:
  type: 8-bit
  precision: int8
base_model: Mixtral-8x7B
base_model_license: apache-2.0
domain_specific:
  primary_domain: Oil & Gas Engineering
  key_capabilities:
    - Drilling Calculations
    - Well Trajectory Optimization
    - Hydraulics Computation
    - Technical Document Processing
  knowledge_limitation: Data available up to 2025
---
# OGAI-8x7B-8bit-32k: 8-bit Quantized Oil & Gas AI Model with Extended Context
![Hugging Face](https://img.shields.io/badge/HuggingFace-OGAI--8x7B--8bit--32k-orange)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
## Model Description
**OGAI-8x7B-8bit-32k** is an **8-bit quantized version** of the OGAI-8x7B model with a **32K token context window**. This quantized model retains most of the capabilities of the original model while **significantly reducing memory requirements**, making it ideal for deployment in environments with memory constraints.
The model is based on a LoRA fine-tuned Mixtral-8x7B, engineered specifically for oil and gas applications with a focus on drilling processes. Quantization to 8-bit precision balances model size reduction against output quality on domain-specific tasks.
- **Developed by:** GainEnergy AI Team
- **Model type:** 8-bit Quantized Causal Language Model (Instruction Following)
- **Language:** English
- **License:** MIT
- **Finetuned from model:** [GainEnergy/ogai-8x7b](https://huggingface.co/GainEnergy/ogai-8x7b)
- **Quantization method:** 8-bit (Int8)
- **Context length:** 32,768 tokens
## Quantization Details
This model was quantized from the full-precision OGAI-8x7B using 8-bit quantization with the following configuration:
```python
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,                     # store weights as int8
    llm_int8_enable_fp32_cpu_offload=True  # keep modules that exceed GPU memory on CPU in FP32
)
```
The 8-bit quantization roughly halves the model size compared to FP16 (one byte per weight instead of two, or about a 4x reduction compared to FP32), while preserving approximately 95-98% of the original model's performance on oil and gas engineering tasks. Since Mixtral-8x7B has roughly 47B parameters, the int8 weights still total on the order of 47 GB, which is why FP32 CPU offload is enabled for GPUs that cannot hold the full model.
## Key Capabilities
- **Drilling Calculations & Optimization**: Computes complex well trajectories, mud weight calculations, hydraulics, and casing designs (a representative formula is sketched after this list).
- **Engineering Knowledge Integration**: Retains knowledge from oil & gas technical literature, drilling reports, and proprietary engineering datasets.
- **Intelligent Document Processing**: Supports knowledge retrieval for drilling workflows, regulatory compliance, and field operation manuals.
- **High-Context Reasoning**: The extended 32K token context window allows the model to retain context across long drilling plans, technical discussions, and simulation outputs.
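For context, many of these calculations reduce to standard drilling relations. As an illustration of the arithmetic involved (not model output, and not part of this repository), the hydrostatic pressure relation underlying mud weight and casing depth work is P (psi) = 0.052 × MW (ppg) × TVD (ft); the helper function below is a hypothetical name introduced here for clarity:

```python
# Standard hydrostatic pressure relation used in drilling engineering:
#   P (psi) = 0.052 * mud weight (ppg) * true vertical depth (ft)
# Illustrative helper only; the name and values are hypothetical.
def hydrostatic_pressure_psi(mud_weight_ppg: float, tvd_ft: float) -> float:
    return 0.052 * mud_weight_ppg * tvd_ft

print(hydrostatic_pressure_psi(12.5, 10_000))  # 6500.0 psi
```

This is the kind of result the model is expected to reproduce and explain when prompted with questions like the casing depth example below.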
## Usage
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
# Configure quantization
quantization_config = BitsAndBytesConfig(
load_in_8bit=True,
llm_int8_enable_fp32_cpu_offload=True
)
# Load tokenizer and model
model_id = "GainEnergy/ogai-8x7b-8bit-32k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
device_map="auto",
quantization_config=quantization_config
)
# Example prompt for drilling engineering
prompt = "Calculate the required casing depth for a well with a pore pressure of 12.5 ppg."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
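For interactive use, generation can also be streamed token by token. The following is a minimal sketch using the transformers `TextStreamer` helper, assuming the `model`, `tokenizer`, and `inputs` objects from the example above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, omitting the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
outputs = model.generate(**inputs, max_new_tokens=200, streamer=streamer)
```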
### Utilizing the Extended Context Window
To make the most of the 32K context window, you can input longer documents for analysis:
```python
# Load a long document (e.g., drilling report, technical specifications)
with open("long_drilling_report.txt", "r") as f:
    long_document = f.read()

question = "Based on the above document, what are the key risk factors identified for this drilling operation?"

# Truncate the document itself (not the combined prompt) so the trailing
# question is never cut off, and leave room for generation
doc_ids = tokenizer(long_document, truncation=True, max_length=31000).input_ids
prompt = f"{tokenizer.decode(doc_ids, skip_special_tokens=True)}\n\n{question}"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
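As an optional sanity check before generating (reusing `inputs` from above), you can confirm that the assembled prompt plus the generation budget fits within the 32K window:

```python
# Verify the prompt leaves room for max_new_tokens within the 32,768-token window
n_tokens = inputs.input_ids.shape[-1]
print(f"Prompt length: {n_tokens} tokens")
assert n_tokens + 768 <= 32768, "Prompt exceeds the 32K context window"
```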
## Hardware Requirements
Due to the quantization, this model requires less GPU memory than the full-precision version:
- **Minimum:** CUDA-capable GPU with 16GB VRAM
- **Recommended:** CUDA-capable GPU with 24GB+ VRAM for comfortable usage with the 32K context window
- **System RAM:** 32GB+
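On the 16GB minimum configuration, part of the model will be offloaded to system RAM. A minimal sketch of constraining GPU usage explicitly via the `max_memory` argument follows; the memory caps are illustrative assumptions, not tuned values:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True
)

# Cap GPU 0 at ~15GiB and let remaining weights spill to CPU RAM (illustrative caps)
model = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-8x7b-8bit-32k",
    device_map="auto",
    max_memory={0: "15GiB", "cpu": "30GiB"},
    quantization_config=quantization_config
)
```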
## Limitations
- **Performance tradeoff:** While 8-bit quantization preserves most capabilities, there may be slight reductions in accuracy for complex numerical computations compared to the full-precision model.
- **Domain specificity:** The model is focused on oil and gas drilling engineering and may not perform well for other domains.
- **Expert validation:** Outputs should be validated by domain experts before application in real-world engineering scenarios.
- **Knowledge cutoff:** The model's knowledge is limited to data available up to 2025.
## Comparison with Other Variants
| **Model Variant** | **Precision** | **Context Length** | **Memory Requirements** | **Performance Retention** | **Ideal Use Case** |
|-------------------|---------------|--------------------|-----------------------|--------------------------|-------------------|
| OGAI-8x7B | Full (16-bit) | 32K | 64GB+ VRAM | 100% (Baseline) | High-precision engineering calculations |
| OGAI-8x7B-8bit-32k | 8-bit | 32K | 16-24GB VRAM | ~95-98% | Balanced approach for deployment |
| OGAI-8x7B-4bit | 4-bit (NF4) | 32K | 8-16GB VRAM | ~90-95% | Highly constrained environments |
## Citation
```bibtex
@misc{ogai8x7b2025,
  title={OGAI-8x7B: An AI Model for Oil \& Gas Drilling Engineering},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}
```
## Acknowledgments
This model builds upon the work of the [OGAI-8x7B](https://huggingface.co/GainEnergy/ogai-8x7b) base model and extends its capabilities through quantization and context length expansion. Special thanks to the Mistral AI team for the Mixtral architecture that powers this model.