FinChat-XS

FinChat-XS is a lightweight financial domain language model designed to answer questions about finance, markets, investments, and economics in a conversational style.

Model Overview

FinChat-XS is a fine-tuned version of HuggingFaceTB/SmolLM2-360M-Instruct, optimized for financial domain conversations using LoRA (Low-Rank Adaptation). With only 360M parameters, it offers a balance between performance and efficiency, making it accessible for deployment on consumer hardware.

The model combines professional financial knowledge with a conversational communication style, making it suitable for applications where users need expert financial information delivered in an approachable manner.

Repository & Resources

For full code, training process, and additional details, visit the GitHub repository:

🔗 FinLLMOpt Repository

How the Model Was Created

FinChat-XS was developed through a focused fine-tuning process designed to enhance financial domain expertise while maintaining conversational abilities:

  1. Base model selection: Started with SmolLM2-360M-Instruct, a lightweight instruction-tuned language model

  2. Dataset preparation:

    • Filtered the sujet-ai/Sujet-Finance-Instruct-177k dataset to focus on QA and conversational QA examples
    • Applied length filtering to keep responses below 500 characters
    • Augmented short conversational QA examples to improve conciseness
  3. Fine-tuning approach:

    • Applied LoRA (Low-Rank Adaptation) to efficiently fine-tune the model
    • Targeted key attention modules (q_proj, v_proj)
    • Used rank r=4 and alpha=16
    • Training configuration:
      • Batch size: 2 (effective batch size 16 with gradient accumulation)
      • Learning rate: 1.5e-4
      • BF16 precision
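The dataset-side steps and the effective batch size above can be sketched in plain Python. Note that the field names `task` and `answer` are illustrative placeholders, not the actual schema of Sujet-Finance-Instruct-177k, and the 8 accumulation steps are implied by 2 × 8 = 16 rather than stated explicitly:

```python
# Sketch of the filtering described above (field names are assumptions).
MAX_RESPONSE_CHARS = 500  # length filter from the training recipe

def keep_example(example: dict) -> bool:
    """Keep only QA / conversational-QA rows with short answers."""
    return (
        example.get("task") in {"qa", "conversational_qa"}
        and len(example.get("answer", "")) < MAX_RESPONSE_CHARS
    )

# Effective batch size from the training configuration:
per_device_batch = 2
grad_accum_steps = 8  # implied by effective batch 16 = 2 * 8
effective_batch = per_device_batch * grad_accum_steps

rows = [
    {"task": "qa", "answer": "Stocks are equity; bonds are debt."},
    {"task": "sentiment", "answer": "positive"},          # wrong task -> dropped
    {"task": "conversational_qa", "answer": "x" * 600},   # too long -> dropped
]
kept = [r for r in rows if keep_example(r)]
print(len(kept), effective_batch)  # 1 16
```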

Challenges

The primary challenge encountered during the development of FinChat-XS was the lack of high-quality conversational datasets specifically focused on personal finance. While the Sujet-Finance-Instruct-177k dataset provided valuable financial QA examples, there remains a notable gap in naturalistic, multi-turn conversations about personal financial scenarios.

Why Use This Model?

FinChat-XS offers several advantages for specific use cases:

  • Efficient deployment: With roughly 360M parameters (about 700 MB of weights in BF16), it can run on devices with limited resources
  • Financial domain knowledge: Fine-tuned specifically on financial QA data
  • Balanced communication style: Combines professional financial knowledge with conversational delivery
  • Low deployment cost: Requires significantly fewer computational resources than larger models
  • Customizable: The LoRA adapter can be mixed with other adapters or further fine-tuned
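The last point follows from how LoRA works: the adapter is a low-rank update W' = W + (alpha / r) · B @ A that can be added onto the base weights, and mixing adapters amounts to applying several such updates. A toy sketch with tiny plain-Python matrices (the real adapter operates on the q_proj/v_proj weights with r=4, alpha=16, i.e. a scaling factor of 4.0):

```python
# Toy illustration of the LoRA update: W' = W + (alpha / r) * B @ A.

def matmul(B, A):
    """Multiply B (m x r) by A (r x n)."""
    return [
        [sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
        for i in range(len(B))
    ]

def apply_lora(W, B, A, r, alpha):
    """Add the scaled low-rank update (alpha / r) * B @ A onto W."""
    scale = alpha / r
    delta = matmul(B, A)
    return [
        [W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
        for i in range(len(W))
    ]

# Mixing adapters = applying each adapter's low-rank update to the same W.
W = [[1.0, 0.0], [0.0, 1.0]]           # toy 2x2 base weight
B1, A1 = [[1.0], [0.0]], [[0.5, 0.5]]  # rank-1 toy adapter
W_merged = apply_lora(W, B1, A1, r=1, alpha=2)
print(W_merged)  # [[2.0, 1.0], [0.0, 1.0]]
```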

Ideal for:

  • Embedded financial assistants in mobile apps
  • Personal financial planning tools
  • Educational applications about finance and investing
  • Customer service automation for financial institutions
  • Quick deployment scenarios where larger models aren't practical

How to Use the Model

Basic Usage with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "oopere/FinChat-XS"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Create a conversation
messages = [
    {"role": "user", "content": "What's the difference between stocks and bonds?"}
]

# Format the prompt using the chat template and append the assistant turn marker
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.2
)

# Decode and print the response
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)

Optimized Inference with 8-bit Quantization

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Configure 8-bit quantization
# (bnb_4bit_* options apply only to 4-bit loading, so they are omitted here)
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    "oopere/FinChat-XS", 
    quantization_config=bnb_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("oopere/FinChat-XS")

# Continue with the same usage pattern as above
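As a rough guide to why 8-bit loading helps on constrained hardware, here is a back-of-envelope weight-memory estimate (weights only; KV cache and activations add overhead on top, so real usage is somewhat higher):

```python
# Approximate weight memory for a ~362M-parameter model at two precisions.
params = 362_000_000

for fmt, bytes_per_param in [("bf16/fp16", 2), ("int8", 1)]:
    mb = params * bytes_per_param / 1e6
    print(f"{fmt}: ~{mb:.0f} MB")
# bf16/fp16: ~724 MB
# int8: ~362 MB
```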

Using with LoRA Adapter Only

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")

# Load LoRA adapter
peft_model = PeftModel.from_pretrained(base_model, "oopere/qa-adapterFinChat-XS")

# Continue with the same usage pattern as above

Limitations & Considerations

While FinChat-XS performs well in many financial conversation scenarios, users should be aware of these limitations:

  1. Knowledge limitations: The model's knowledge is limited to its training data and inherits the knowledge cutoff of its base model (SmolLM2).

  2. Size trade-offs: As a 360M parameter model, it has less capacity than larger models (7B+) and may provide less nuanced or detailed responses on complex topics.

  3. Financial advice disclaimer: The model is not a certified financial advisor and should not be used for making investment decisions. Its responses should be considered educational, not professional financial advice.

  4. Domain boundaries: While focused on finance, the model may struggle with highly specialized financial topics or recent developments not covered in its training data.

  5. Hallucination potential: Like all language models, FinChat-XS may occasionally generate plausible-sounding but incorrect information, especially when asked about specific numerical data or complex financial details.

  6. Style variations: The model balances formal financial knowledge with a conversational style, which may not be appropriate for all professional contexts.

  7. Regulatory compliance: This model has not been specifically audited for compliance with financial regulations in various jurisdictions.

Citation

If you use FinChat-XS in your research or applications, please consider citing it as:

@misc{oopere2025finchatxs,
  author = {Martra, P.},
  title = {FinChat-XS: A Lightweight Financial Domain Chat Language Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/oopere/FinChat-XS}}
}

Acknowledgements

  • HuggingFaceTB for creating the SmolLM2 model series
  • Sujet AI for their financial instruction dataset
  • Hugging Face for providing the infrastructure and tools for model development