
OGAI-8x7B-8bit-32k: 8-bit Quantized Oil & Gas AI Model with Extended Context

Model Description

OGAI-8x7B-8bit-32k is an 8-bit quantized version of the OGAI-8x7B model with a 32K token context window. It retains most of the original model's capabilities while significantly reducing memory requirements, making it well suited to memory-constrained deployments.

The model is based on a LoRA fine-tuned Mixtral-8x7B, engineered for oil and gas applications with a focus on drilling processes. Quantizing to 8-bit precision balances reduced model size against output quality on domain-specific tasks.

  • Developed by: GainEnergy AI Team
  • Model type: 8-bit Quantized Causal Language Model (Instruction Following)
  • Language: English
  • License: MIT
  • Finetuned from model: GainEnergy/ogai-8x7b
  • Quantization method: 8-bit (Int8)
  • Context length: 32,768 tokens

Quantization Details

This model was quantized from the full-precision OGAI-8x7B using 8-bit quantization with the following configuration:

from transformers import BitsAndBytesConfig

# Load weights in 8-bit via bitsandbytes; allow modules kept in FP32
# to be offloaded to CPU memory when they do not fit on the GPU
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True
)

The 8-bit quantization roughly halves the model size compared to FP16 (approximately a 4x reduction compared to FP32), while preserving approximately 95-98% of the original model's performance on oil and gas engineering tasks.
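
As a rough sanity check on these numbers, the in-memory size of the loaded model can be inspected directly. A minimal sketch (the repo id matches the Usage section below):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the quantized checkpoint and report its in-memory footprint
model = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-8x7b-8bit-32k",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.1f} GB")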

Key Capabilities

  • Drilling Calculations & Optimization: Computes complex well trajectories, mud weight calculations, hydraulics, and casing designs; a worked example of the underlying arithmetic follows this list.
  • Engineering Knowledge Integration: Retains knowledge from oil & gas technical literature, drilling reports, and proprietary engineering datasets.
  • Intelligent Document Processing: Supports knowledge retrieval for drilling workflows, regulatory compliance, and field operation manuals.
  • High-Context Reasoning: The extended 32K token context window allows the model to retain context across long drilling plans, technical discussions, and simulation outputs.
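
For a sense of the arithmetic behind these tasks, the standard hydrostatic-pressure relation (pressure in psi = 0.052 × mud weight in ppg × true vertical depth in ft) is easy to spot-check in a few lines, which is useful when validating model answers. A minimal sketch (the helper name is ours):

# Standard drilling relation:
#   hydrostatic pressure (psi) = 0.052 * mud weight (ppg) * TVD (ft)
def hydrostatic_pressure_psi(mud_weight_ppg: float, tvd_ft: float) -> float:
    return 0.052 * mud_weight_ppg * tvd_ft

# Example: 12.5 ppg mud at 10,000 ft TVD
print(hydrostatic_pressure_psi(12.5, 10_000))  # 6500.0 psi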

Usage

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Configure quantization
quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True
)

# Load tokenizer and model
model_id = "GainEnergy/ogai-8x7b-8bit-32k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config
)

# Example prompt for drilling engineering
prompt = "Calculate the required casing depth for a well with a pore pressure of 12.5 ppg."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
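
Because the base model is instruction-tuned, wrapping the prompt in the tokenizer's chat template (assuming the repository ships one) may yield better-formatted answers. A hedged sketch, reusing the variables above:

# Format the prompt with the tokenizer's chat template, if one is defined
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))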

Utilizing the Extended Context Window

To make the most of the 32K context window, you can input longer documents for analysis:

# Load long document (e.g., drilling report, technical specifications)
with open("long_drilling_report.txt", "r") as f:
    long_document = f.read()

# Append a question at the end
prompt = f"{long_document}\n\nBased on the above document, what are the key risk factors identified for this drilling operation?"

# Tokenize with truncation; truncate from the left so the question
# appended at the end of the prompt is never cut off
tokenizer.truncation_side = "left"
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=32000  # leave headroom for the 768 generated tokens
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

# Decode only the newly generated tokens, not the echoed 32K-token prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
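
When documents can approach or exceed the window, a quick token count up front makes truncation explicit rather than silent. A minimal sketch:

# Check how much of the 32K window the document alone consumes
n_tokens = len(tokenizer(long_document)["input_ids"])
print(f"Document occupies {n_tokens} of 32,768 tokens")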

Hardware Requirements

Because of the 8-bit quantization, this model requires significantly less GPU memory than the full-precision version; layers that exceed the GPU budget can spill to system RAM, as sketched after this list:

  • Minimum: CUDA-capable GPU with 16GB VRAM (with CPU offload enabled, as in the configuration above)
  • Recommended: CUDA-capable GPU with 24GB+ VRAM for comfortable usage with the 32K context window
  • System RAM: 32GB+
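
On GPUs near the minimum, device placement can be constrained explicitly so that layers exceeding the budget are offloaded to system RAM. A sketch reusing model_id and quantization_config from the Usage section (the limits below are illustrative, not tuned):

# Illustrative memory map: cap GPU 0 usage and allow CPU offload for the rest
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
    max_memory={0: "16GiB", "cpu": "48GiB"},
)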

Limitations

  • Performance tradeoff: While 8-bit quantization preserves most capabilities, there may be slight reductions in accuracy for complex numerical computations compared to the full-precision model.
  • Domain specificity: The model is focused on oil and gas drilling engineering and may not perform well for other domains.
  • Expert validation: Outputs should be validated by domain experts before application in real-world engineering scenarios.
  • Knowledge cutoff: The model's knowledge is limited to data available up to 2025.

Comparison with Other Variants

| Model Variant      | Precision     | Context Length | Memory Requirements | Performance Retention | Ideal Use Case                           |
|--------------------|---------------|----------------|---------------------|-----------------------|------------------------------------------|
| OGAI-8x7B          | Full (16-bit) | 32K            | 64GB+ VRAM          | 100% (baseline)       | High-precision engineering calculations  |
| OGAI-8x7B-8bit-32k | 8-bit         | 32K            | 16-24GB VRAM        | ~95-98%               | Balanced approach for deployment         |
| OGAI-8x7B-4bit     | 4-bit (NF4)   | 32K            | 8-16GB VRAM         | ~90-95%               | Highly constrained environments          |
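
For reference, the 4-bit (NF4) variant in the table would typically be loaded with a configuration like the following; the repo id shown is an assumption, not a confirmed path:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization config (compute in bfloat16)
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_4bit = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-8x7b-4bit",  # hypothetical repo id for the 4-bit variant
    device_map="auto",
    quantization_config=nf4_config,
)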

Citation

@misc{ogai8x7b2025,
  title={OGAI-8x7B: An AI Model for Oil & Gas Drilling Engineering},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}

Acknowledgments

This model builds upon the OGAI-8x7B base model, extending its capabilities through quantization and context-length expansion. Special thanks to Mistral AI for the Mixtral architecture that powers this model.
