---
language: en
license: mit
tags:
  - ai
  - quantized
  - oil-and-gas
  - engineering
  - mixtral
  - causal-language-model
  - 8-bit
library_name: transformers
pipeline_tag: text-generation
datasets:
  - custom-oil-and-gas-dataset
model_type: mixtral
model_name: OGAI-8x7B-8bit-32k
inference:
  cloud_resources:
    recommended_resources:
      gpu_memory: 24GB+
      system_ram: 32GB+
    minimum_resources:
      gpu_memory: 16GB
      system_ram: 32GB
is_quantized: true
quantization_config:
  type: 8-bit
  precision: int8
base_model: Mixtral-8x7B
base_model_license: apache-2.0
domain_specific:
  primary_domain: Oil & Gas Engineering
  key_capabilities:
    - Drilling Calculations
    - Well Trajectory Optimization
    - Hydraulics Computation
    - Technical Document Processing
  knowledge_limitation: Data available up to 2025
---
# OGAI-8x7B-8bit-32k: 8-bit Quantized Oil & Gas AI Model with Extended Context
![Hugging Face](https://img.shields.io/badge/HuggingFace-OGAI--8x7B--8bit--32k-orange)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
## Model Description
**OGAI-8x7B-8bit-32k** is an **8-bit quantized version** of the OGAI-8x7B model with a **32K token context window**. This quantized model retains most of the capabilities of the original model while **significantly reducing memory requirements**, making it ideal for deployment in environments with memory constraints.
The model is based on a LoRA fine-tuned Mixtral-8x7B, engineered specifically for oil and gas applications with a focus on drilling processes. Quantization to 8-bit precision balances model size reduction against output quality on domain-specific tasks.
- **Developed by:** GainEnergy AI Team
- **Model type:** 8-bit Quantized Causal Language Model (Instruction Following)
- **Language:** English
- **License:** MIT
- **Finetuned from model:** [GainEnergy/ogai-8x7b](https://huggingface.co/GainEnergy/ogai-8x7b)
- **Quantization method:** 8-bit (Int8)
- **Context length:** 32,768 tokens
## Quantization Details
This model was quantized from the full-precision OGAI-8x7B using 8-bit quantization with the following configuration:
```python
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,                     # store weights as int8
    llm_int8_enable_fp32_cpu_offload=True  # keep modules that exceed GPU memory on CPU in FP32
)
```
The 8-bit quantization roughly halves the model size compared to FP16 (one byte per weight instead of two, or about a 4x reduction compared to FP32), while preserving approximately 95-98% of the original model's performance on oil and gas engineering tasks. Since Mixtral-8x7B has roughly 47B parameters, the int8 weights still total on the order of 47 GB, which is why FP32 CPU offload is enabled for GPUs that cannot hold the full model.
## Key Capabilities
- **Drilling Calculations & Optimization**: Computes complex well trajectories, mud weight calculations, hydraulics, and casing designs (a representative formula is sketched after this list).
- **Engineering Knowledge Integration**: Retains knowledge from oil & gas technical literature, drilling reports, and proprietary engineering datasets.
- **Intelligent Document Processing**: Supports knowledge retrieval for drilling workflows, regulatory compliance, and field operation manuals.
- **High-Context Reasoning**: The extended 32K token context window allows the model to retain context across long drilling plans, technical discussions, and simulation outputs.
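For context, many of these calculations reduce to standard drilling relations. As an illustration of the arithmetic involved (not model output, and not part of this repository), the hydrostatic pressure relation underlying mud weight and casing depth work is P (psi) = 0.052 × MW (ppg) × TVD (ft); the helper function below is a hypothetical name introduced here for clarity:

```python
# Standard hydrostatic pressure relation used in drilling engineering:
#   P (psi) = 0.052 * mud weight (ppg) * true vertical depth (ft)
# Illustrative helper only; the name and values are hypothetical.
def hydrostatic_pressure_psi(mud_weight_ppg: float, tvd_ft: float) -> float:
    return 0.052 * mud_weight_ppg * tvd_ft

print(hydrostatic_pressure_psi(12.5, 10_000))  # 6500.0 psi
```

This is the kind of result the model is expected to reproduce and explain when prompted with questions like the casing depth example below.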
## Usage
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
# Configure quantization
quantization_config = BitsAndBytesConfig(
load_in_8bit=True,
llm_int8_enable_fp32_cpu_offload=True
)
# Load tokenizer and model
model_id = "GainEnergy/ogai-8x7b-8bit-32k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
device_map="auto",
quantization_config=quantization_config
)
# Example prompt for drilling engineering
prompt = "Calculate the required casing depth for a well with a pore pressure of 12.5 ppg."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
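For interactive use, generation can also be streamed token by token. The following is a minimal sketch using the transformers `TextStreamer` helper, assuming the `model`, `tokenizer`, and `inputs` objects from the example above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, omitting the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
outputs = model.generate(**inputs, max_new_tokens=200, streamer=streamer)
```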
### Utilizing the Extended Context Window
To make the most of the 32K context window, you can input longer documents for analysis:
```python
# Load a long document (e.g., drilling report, technical specifications)
with open("long_drilling_report.txt", "r") as f:
    long_document = f.read()

question = "Based on the above document, what are the key risk factors identified for this drilling operation?"

# Truncate the document itself (not the combined prompt) so the trailing
# question is never cut off, and leave room for generation
doc_ids = tokenizer(long_document, truncation=True, max_length=31000).input_ids
prompt = f"{tokenizer.decode(doc_ids, skip_special_tokens=True)}\n\n{question}"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
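As an optional sanity check before generating (reusing `inputs` from above), you can confirm that the assembled prompt plus the generation budget fits within the 32K window:

```python
# Verify the prompt leaves room for max_new_tokens within the 32,768-token window
n_tokens = inputs.input_ids.shape[-1]
print(f"Prompt length: {n_tokens} tokens")
assert n_tokens + 768 <= 32768, "Prompt exceeds the 32K context window"
```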
## Hardware Requirements
Due to the quantization, this model requires less GPU memory than the full-precision version:
- **Minimum:** CUDA-capable GPU with 16GB VRAM
- **Recommended:** CUDA-capable GPU with 24GB+ VRAM for comfortable usage with the 32K context window
- **System RAM:** 32GB+
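On the 16GB minimum configuration, part of the model will be offloaded to system RAM. A minimal sketch of constraining GPU usage explicitly via the `max_memory` argument follows; the memory caps are illustrative assumptions, not tuned values:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True
)

# Cap GPU 0 at ~15GiB and let remaining weights spill to CPU RAM (illustrative caps)
model = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-8x7b-8bit-32k",
    device_map="auto",
    max_memory={0: "15GiB", "cpu": "30GiB"},
    quantization_config=quantization_config
)
```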
## Limitations
- **Performance tradeoff:** While 8-bit quantization preserves most capabilities, there may be slight reductions in accuracy for complex numerical computations compared to the full-precision model.
- **Domain specificity:** The model is focused on oil and gas drilling engineering and may not perform well for other domains.
- **Expert validation:** Outputs should be validated by domain experts before application in real-world engineering scenarios.
- **Knowledge cutoff:** The model's knowledge is limited to data available up to 2025.
## Comparison with Other Variants
| **Model Variant** | **Precision** | **Context Length** | **Memory Requirements** | **Performance Retention** | **Ideal Use Case** |
|-------------------|---------------|--------------------|-----------------------|--------------------------|-------------------|
| OGAI-8x7B | Full (16-bit) | 32K | 64GB+ VRAM | 100% (Baseline) | High-precision engineering calculations |
| OGAI-8x7B-8bit-32k | 8-bit | 32K | 16-24GB VRAM | ~95-98% | Balanced approach for deployment |
| OGAI-8x7B-4bit | 4-bit (NF4) | 32K | 8-16GB VRAM | ~90-95% | Highly constrained environments |
## Citation
```bibtex
@misc{ogai8x7b2025,
  title={OGAI-8x7B: An AI Model for Oil \& Gas Drilling Engineering},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}
```
## Acknowledgments
This model builds upon the work of the [OGAI-8x7B](https://huggingface.co/GainEnergy/ogai-8x7b) base model and extends its capabilities through quantization and context length expansion. Special thanks to the Mistral AI team for the Mixtral architecture that powers this model.