Banking AI Assistant - Llama 3.2 1B Fine-tuned

A specialized banking and financial AI assistant fine-tuned on the T2-RAGBench dataset for conversational RAG tasks. This model excels at analyzing financial documents, answering banking-related questions, and providing detailed insights from financial reports.

Model Details

  • Developed by: Akhenaton
  • Model Type: Causal Language Model (Llama 3.2 1B)
  • License: Apache 2.0
  • Base Model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Framework: Unsloth + Hugging Face TRL
  • Quantization: 4-bit (BitsAndBytes)

Training Details

Dataset

  • Source: G4KMU/t2-ragbench (ConvFinQA subset)
  • Size: 32,908 context-independent QA pairs from 9,000+ financial documents (full benchmark)
  • Domains: T2-RAGBench spans FinQA, ConvFinQA, VQAonBD, and TAT-DQA; this model was fine-tuned on the ConvFinQA subset
  • Focus: Financial documents with text and tables from SEC filings
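
For reference, the data can be pulled with the datasets library. A minimal sketch, assuming the ConvFinQA portion is exposed as a config named "ConvFinQA" (check the dataset card for the exact configuration name):

from datasets import load_dataset

# Config name "ConvFinQA" is an assumption; see the G4KMU/t2-ragbench card
dataset = load_dataset("G4KMU/t2-ragbench", "ConvFinQA", split="train")
print(dataset[0])  # inspect one question-context-answer record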

Training Configuration

LoRA Parameters:
  r: 16
  lora_alpha: 16
  lora_dropout: 0
  target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]

Training Setup:
  max_seq_length: 2048
  per_device_train_batch_size: 2
  gradient_accumulation_steps: 4
  max_steps: 60
  learning_rate: 2e-4
  optimizer: adamw_8bit
  lr_scheduler_type: cosine
  weight_decay: 0.01
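
The listing below is a minimal sketch of how these hyperparameters wire into Unsloth + TRL, following the API used in Unsloth's SFT notebooks; the output_dir and dataset preparation are illustrative, not taken from the original run:

from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the parameters listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # examples rendered with the conversation format below
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        weight_decay=0.01,
        output_dir="outputs",  # illustrative
    ),
)
trainer.train()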

Intended Use

Primary Use Cases

  • Financial Document Analysis: Extract insights from financial reports, SEC filings, and earnings statements
  • Banking Q&A: Answer questions about financial concepts, regulations, and banking operations
  • Conversational RAG: Provide context-aware responses based on financial document context
  • Financial Research: Assist with financial research and analysis tasks

Conversation Format

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a specialized banking AI assistant. Analyze financial documents and provide accurate, detailed answers based on the given context. Focus on numerical accuracy and financial terminology.<|eot_id|><|start_header_id|>user<|end_header_id|>

Financial Document Context:
{context}

Question: {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{response}<|eot_id|>
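
Equivalently, the prompt can be assembled from the tokenizer's chat template rather than by hand. A sketch, where context and question are supplied by your retrieval pipeline:

SYSTEM_PROMPT = (
    "You are a specialized banking AI assistant. Analyze financial documents "
    "and provide accurate, detailed answers based on the given context. "
    "Focus on numerical accuracy and financial terminology."
)

def build_prompt(tokenizer, context: str, question: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",
         "content": f"Financial Document Context:\n{context}\n\nQuestion: {question}"},
    ]
    # tokenize=False returns the formatted string shown above, ending with the
    # assistant header so the model continues from there
    return tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)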

Usage

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("Akhenaton/sft_banking_model", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Akhenaton/sft_banking_model")

# Prepare conversation
messages = [
    {"role": "user", "content": "Explain the key financial metrics in quarterly earnings."}
]

# Generate response
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=1.5, min_p=0.1)
# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

With Unsloth (Recommended - 2x faster)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "Akhenaton/sft_banking_model",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True
)
FastLanguageModel.for_inference(model)  # Enable fast inference
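
Generation then follows the same pattern as the Quick Start; the sketch below streams tokens as they are produced (the question is illustrative):

from transformers import TextStreamer

messages = [{"role": "user", "content": "Summarize the main drivers of net interest margin."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
_ = model.generate(inputs, streamer=TextStreamer(tokenizer, skip_prompt=True),
                   max_new_tokens=128, do_sample=True, temperature=1.5, min_p=0.1)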

Available Formats

This model is available in multiple GGUF quantization formats:

  • q4_k_m: Recommended for most use cases
  • q8_0: Higher quality, more resource intensive
  • q5_k_m: Balanced quality and efficiency
  • f16: Full precision for maximum accuracy
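
A minimal llama-cpp-python sketch for running a GGUF variant locally; the .gguf filename is an assumption, so check the repository's file list for the actual name:

from llama_cpp import Llama

# Filename is hypothetical; use the actual GGUF file from the repo
llm = Llama(model_path="sft_banking_model.Q4_K_M.gguf", n_ctx=2048)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What does EBITDA measure?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])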

Performance

  • Training Speed: 2x faster with Unsloth optimization
  • Memory Efficiency: 4-bit quantization reduces VRAM requirements
  • Inference Speed: Optimized for fast response generation
  • Accuracy: Specialized for the financial domain, with >80% context-independent Q&A capability

Limitations

  • Domain Specific: Optimized for financial/banking content; performance may degrade on general topics
  • Training Size: Limited to 60 training steps - further training may improve performance
  • Context Length: Maximum sequence length of 2048 tokens
  • Language: English only
  • Numerical Reasoning: While improved for financial calculations, complex mathematical operations may require verification

Ethical Considerations

  • Financial Advice: This model should not be used as a substitute for professional financial advice
  • Data Source: Trained on public SEC filings and financial documents
  • Bias: May reflect biases present in financial reporting and documentation
  • Verification: Always verify numerical calculations and financial information from authoritative sources

Citation

If you use this model in your research or applications, please consider citing:

@misc{akhenaton2025sft_banking_model,
  author = {Akhenaton},
  title = {Banking AI Assistant - Llama 3.2 1B Fine-tuned},
  year = {2025},
  url = {https://huggingface.co/Akhenaton/sft_banking_model},
  note = {Fine-tuned with Unsloth on T2-RAGBench dataset}
}

Acknowledgments

  • Unsloth Team for the optimized training framework
  • Meta AI for the Llama 3.2 base model
  • G4KMU for the T2-RAGBench dataset
  • Hugging Face for the transformers library and model hosting

This model was trained 2x faster with Unsloth and Hugging Face's TRL library.
