---
language: en
license: mit
tags:
  - ai
  - quantized
  - oil-and-gas
  - engineering
  - mixtral
  - causal-language-model
  - 8-bit
library_name: transformers
pipeline_tag: text-generation
datasets:
  - custom-oil-and-gas-dataset
model_type: mixtral
model_name: OGAI-8x7B-8bit-32k
inference:
  cloud_resources:
    recommended_resources:
      gpu_memory: 24GB+
      system_ram: 32GB+
    minimum_resources:
      gpu_memory: 16GB
      system_ram: 32GB
is_quantized: true
quantization_config:
  type: 8-bit
  precision: int8
base_model: Mixtral-8x7B
base_model_license: apache-2.0
domain_specific:
  primary_domain: Oil & Gas Engineering
  key_capabilities:
    - Drilling Calculations
    - Well Trajectory Optimization
    - Hydraulics Computation
    - Technical Document Processing
  knowledge_limitation: Data available up to 2025
---
|
# OGAI-8x7B-8bit-32k: 8-bit Quantized Oil & Gas AI Model with Extended Context |
|
|
|
|
[License: MIT](LICENSE)
|
|
|
## Model Description |
|
|
|
**OGAI-8x7B-8bit-32k** is an **8-bit quantized version** of the OGAI-8x7B model with a **32K token context window**. This quantized model retains most of the capabilities of the original model while **significantly reducing memory requirements**, making it ideal for deployment in environments with memory constraints. |
|
|
|
The model is based on a LoRA fine-tuned Mixtral-8x7B, engineered specifically for oil and gas applications with a focus on drilling processes. Quantizing to 8-bit precision strikes a balance between reducing model size and maintaining high-quality outputs on domain-specific tasks.
|
|
|
- **Developed by:** GainEnergy AI Team |
|
- **Model type:** 8-bit Quantized Causal Language Model (Instruction Following) |
|
- **Language:** English |
|
- **License:** MIT |
|
- **Finetuned from model:** [GainEnergy/ogai-8x7b](https://huggingface.co/GainEnergy/ogai-8x7b) |
|
- **Quantization method:** 8-bit (Int8) |
|
- **Context length:** 32,768 tokens |
|
|
|
## Quantization Details |
|
|
|
This model was quantized from the full-precision OGAI-8x7B to 8-bit using the bitsandbytes integration in `transformers`, with the following configuration:
|
|
|
```python
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True  # let modules that overflow VRAM run on CPU in fp32
)
```
|
|
|
The 8-bit quantization reduces the model size by approximately 2x compared to FP16 (roughly 4x compared to FP32), while preserving approximately 95-98% of the original model's performance on oil and gas engineering tasks.
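
As a rough sanity check on those figures, the weight-only footprint at each precision can be estimated directly from the parameter count. Below is a minimal sketch, assuming the ~46.7B total parameters commonly cited for the Mixtral-8x7B architecture; actual memory use is higher once activations and the KV cache for a 32K context are added, which is why the configuration above enables CPU offload:

```python
# Back-of-the-envelope weight footprint for a Mixtral-8x7B-class model.
# The 46.7e9 parameter count is the commonly cited total for Mixtral-8x7B (assumption).
TOTAL_PARAMS = 46.7e9

for precision, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    gib = TOTAL_PARAMS * bytes_per_param / 1024**3
    print(f"{precision}: ~{gib:.0f} GiB of weights")

# int8 halves the fp16 footprint; weights that still exceed available VRAM
# can spill to system RAM thanks to llm_int8_enable_fp32_cpu_offload.
```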
|
|
|
## Key Capabilities |
|
|
|
- **Drilling Calculations & Optimization**: Computes complex well trajectories, mud weight calculations, hydraulics, and casing designs. |
|
- **Engineering Knowledge Integration**: Retains knowledge from oil & gas technical literature, drilling reports, and proprietary engineering datasets. |
|
- **Intelligent Document Processing**: Supports knowledge retrieval for drilling workflows, regulatory compliance, and field operation manuals. |
|
- **High-Context Reasoning**: The extended 32K token context window allows the model to retain context across long drilling plans, technical discussions, and simulation outputs. |
|
|
|
## Usage |
|
|
|
### Basic Usage |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Configure 8-bit quantization with CPU offload for layers that do not fit on the GPU
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True
)

# Load tokenizer and model
model_id = "GainEnergy/ogai-8x7b-8bit-32k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config
)

# Example prompt for drilling engineering
prompt = "Calculate the required casing depth for a well with a pore pressure of 12.5 ppg."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
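
Because quantization can slightly perturb numeric outputs, it is good practice to cross-check the model's arithmetic against the standard formulas it is expected to reproduce. Below is a minimal sketch for the example above, using the conventional 0.052 conversion factor between mud weight in ppg and pressure gradient in psi/ft; the depth value is illustrative, not taken from the model.

```python
# Independent sanity check of the hydrostatic pressure underlying the prompt above.
# Standard field formula: P (psi) = 0.052 * density (ppg) * true vertical depth (ft)
pore_pressure_ppg = 12.5  # from the example prompt
tvd_ft = 10_000           # illustrative depth (assumed, not model output)

pore_pressure_psi = 0.052 * pore_pressure_ppg * tvd_ft
print(f"Pore pressure at {tvd_ft} ft: ~{pore_pressure_psi:.0f} psi")
# Any mud weight the model proposes must exceed 12.5 ppg (plus a safety margin)
# for the well to remain overbalanced at this depth.
```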
|
|
|
### Utilizing the Extended Context Window |
|
|
|
To make the most of the 32K context window, you can input longer documents for analysis: |
|
|
|
```python
# Load a long document (e.g., drilling report, technical specifications)
with open("long_drilling_report.txt", "r") as f:
    long_document = f.read()

# Append a question at the end
prompt = f"{long_document}\n\nBased on the above document, what are the key risk factors identified for this drilling operation?"

# Truncate so the prompt fits within the context window
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=32000  # leave room for generation
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
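
Before submitting a very long document, it can help to check the prompt's token count against the window, so you know whether truncation will occur. A quick check using the same tokenizer:

```python
# Measure the assembled prompt against the 32,768-token context window.
n_tokens = len(tokenizer(prompt)["input_ids"])
print(f"Prompt length: {n_tokens} tokens")

if n_tokens > 32_000:
    print("Prompt will be truncated; consider chunking or summarizing the document first.")
```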
|
|
|
## Hardware Requirements |
|
|
|
Due to the quantization, this model requires less GPU memory than the full-precision version: |
|
|
|
- **Minimum:** CUDA-capable GPU with 16GB VRAM (relies on CPU offload; see the sketch below)

- **Recommended:** CUDA-capable GPU with 24GB+ VRAM to use the full 32K context window comfortably

- **System RAM:** 32GB+
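
On a 16GB card the int8 weights alone exceed available VRAM, so running at the minimum spec relies on CPU offload. Below is a minimal sketch of one way to make the split explicit via `max_memory`; the exact values that work will depend on your hardware and driver overhead:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Keep as many int8 weights on the GPU as fit, and spill the rest to system RAM.
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True  # offloaded modules run on CPU in fp32
)

model = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-8x7b-8bit-32k",
    device_map="auto",
    max_memory={0: "15GiB", "cpu": "30GiB"},  # illustrative split for a 16GB GPU
    quantization_config=quantization_config
)
```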
|
|
|
## Limitations |
|
|
|
- **Performance tradeoff:** While 8-bit quantization preserves most capabilities, there may be slight reductions in accuracy for complex numerical computations compared to the full-precision model. |
|
- **Domain specificity:** The model is focused on oil and gas drilling engineering and may not perform well for other domains. |
|
- **Expert validation:** Outputs should be validated by domain experts before application in real-world engineering scenarios. |
|
- **Knowledge cutoff:** The model's knowledge is limited to data available up to 2025. |
|
|
|
## Comparison with Other Variants |
|
|
|
| **Model Variant**  | **Precision** | **Context Length** | **Memory Requirements** | **Performance Retention** | **Ideal Use Case**                      |
|--------------------|---------------|--------------------|-------------------------|---------------------------|-----------------------------------------|
| OGAI-8x7B          | Full (16-bit) | 32K                | 64GB+ VRAM              | 100% (baseline)           | High-precision engineering calculations |
| OGAI-8x7B-8bit-32k | 8-bit (Int8)  | 32K                | 16-24GB VRAM            | ~95-98%                   | Balanced approach for deployment        |
| OGAI-8x7B-4bit     | 4-bit (NF4)   | 32K                | 8-16GB VRAM             | ~90-95%                   | Highly constrained environments         |
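
For the 4-bit row above, loading would go through bitsandbytes' NF4 path instead of int8. A minimal sketch; the repository id is inferred from the variant name and should be verified against the actual model listing:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 4-bit quantization, matching the "4-bit (NF4)" entry in the table.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16  # compute in bf16 for quality and speed
)

model = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-8x7b-4bit",  # hypothetical repo id inferred from the variant name
    device_map="auto",
    quantization_config=nf4_config
)
```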
|
|
|
## Citation |
|
|
|
```bibtex |
|
@misc{ogai8x7b2025,
  title     = {OGAI-8x7B: An AI Model for Oil \& Gas Drilling Engineering},
  author    = {{GainEnergy AI Team}},
  year      = {2025},
  publisher = {Hugging Face Models}
}
|
``` |
|
|
|
## Acknowledgments |
|
|
|
This model builds upon the [OGAI-8x7B](https://huggingface.co/GainEnergy/ogai-8x7b) base model and extends its capabilities through quantization and an extended context window. Special thanks to Mistral AI for the Mixtral architecture that powers this model.