|
--- |
|
license: cc-by-nc-4.0 |
|
language: |
|
- en |
|
metrics: |
|
- accuracy |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-Instruct |
|
--- |
|
|
|
|
|
# CALM-8B: Conversational Agentic Language Model |
|
|
|
## Model Description |
|
**CALM-8B** is the smallest open-source model of **CALM** (Conversational Agentic Language Model) series, designed to integrate both **Task-Oriented Dialogue (TOD) capabilities** and **Language Agent (LA) functionalities** into a unified system. By fine-tuning on **CALM-IT**, a novel dataset that interleaves multi-turn ReAct-based reasoning with complex API usage, CALM-8B achieves promising results on TOD and function-calling benchmarks. |
|
|
|
CALM-8B is trained on a **multi-task dataset** covering dialogue state tracking, function calling, and multi-turn reasoning. The model outperforms top proprietary and domain-specific models, including **GPT-4o**, on key evaluation benchmarks: **MultiWOZ 2.4 (TOD), BFCL V3 (LA), and API-Bank (LA).** |
|
|
|
## Model Sources [TODO] |
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
- **Paper [optional]:** [More Information Needed] |
|
- **Repository:** [More Information Needed] |
|
|
|
|
|
--- |
|
## Model Details |
|
|
|
- **Model Name:** CALM-8B |
|
- **Developed by:** Colloboration of UIUC Conversational AI LAB and Oumi |
|
- **License:** Apache 2.0 |
|
- **Architecture:** Fine-tuned **Llama 3.1 8B Instruct** |
|
- **Training Data:** CALM-IT dataset |
|
- **Fine-tuning Framework:** [Oumi](https://github.com/oumi-ai/oumi) |
|
- **Training Hardware:** 8 NVIDIA H100 GPUs |
|
- **Training Duration:** ~8 hours |
|
- **Evaluation Benchmarks:** MultiWOZ 2.4, BFCL V3, API-Bank |
|
- **Release Date:** February 5, 2025 |
|
|
|
--- |
|
## Capabilities and Features |
|
|
|
### π£ Conversational Agentic Abilities |
|
- **Multi-turn Dialogue Mastery:** Maintains coherent conversations across multiple turns with accurate state tracking. |
|
- **Function Calling and API Integration:** Dynamically selects and calls APIs for task execution. |
|
- **ReAct-based Reasoning:** Utilizes a structured reasoning process (User-Thought-Action-Observation-Thought-Response). |
|
- **Zero-Shot Generalization:** Excels in previously unseen function-calling tasks. |
|
|
|
### π Benchmark Performance |
|
- **MultiWOZ 2.4 (TOD):** Excels in dialogue state tracking and task completion. |
|
- **BFCL V3 (LA):** Demonstrates superior function-calling abilities over language agents. |
|
- **API-Bank (LA):** Accurately generates API calls and integrates responses into conversation flow. |
|
|
|
--- |
|
## Training Process |
|
### π§ Fine-tuning Stages |
|
1. **TOD Fine-tuning:** Optimized for dialogue state tracking (e.g., augmented SNIPS reformatted in Alpaca-style instruction tuning). |
|
2. **Function Calling Fine-tuning:** Trained to select and generate well-formed API calls from LA datasets. |
|
3. **ReAct-based Fine-tuning:** Addresses multi-turn conversations with API integration using a structured reasoning framework. |
|
|
|
### π Training Hyperparameters |
|
- **Base Model:** Llama 3.1 8B Instruct |
|
- **LoRA Config:** Rank = 16, Scaling Factor = 32 |
|
- **Batch Size:** 8 |
|
- **Learning Rate:** 1e-4 |
|
- **Optimizer:** AdamW (betas = 0.9, 0.999, epsilon = 1e-8) |
|
- **Precision:** Mixed precision (bfloat16) |
|
- **Warm-up Steps:** 0.1 ratio of total steps |
|
- **Gradient Accumulation Steps:** 1 |
|
|
|
--- |
|
## Usage |
|
### π How to Load the Model using Transformers |
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CALM-8B") |
|
model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CALM-8B") |
|
``` |
|
|
|
### π Example Oumi Inference |
|
```bash |
|
pip install oumi |
|
|
|
# See oumi_infer.yaml in this model's /oumi/ directory. |
|
oumi infer -i -c ./oumi_infer.yaml |
|
``` |
|
|
|
### π Example Oumi Fine-Tuning |
|
```bash |
|
pip install oumi |
|
|
|
# See oumi_train.yaml in this model's /oumi/ directory. |
|
oumi train -c ./oumi_train.yaml |
|
``` |
|
|
|
--- |
|
- **Task-Specific Calibration:** While CALM-8B generalizes well across tasks, performance can improve with domain-specific fine-tuning. |
|
- **Scalability to Larger Models:** Future iterations (CALM-70B, CALM-405B) extend capabilities to larger-scale agentic conversations. |
|
- **Open-Source Expansion:** All datasets, training scripts, and model checkpoints are publicly available to foster further research. |
|
|
|
## Acknowledgements |
|
We'd like to thank the [Oumi AI Team](https://github.com/oumi-ai/oumi) for collaborating on training the models, as well as [Together AI](https://www.together.ai/) for providing the compute resources necessary to train CALM 405B. |
|
|
|
## License |
|
This model is licensed under [Creative Commons NonCommercial (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/legalcode). |
|
|
|
<!-- TODO --> |
|
--- |
|
## Citation |
|
If you use **CALM-8B** in your research, please cite: |
|
``` |
|
@article{yourpaper2024, |
|
title={CALM: Conversational Agentic Language Model}, |
|
author={Your Name and Collaborators}, |
|
journal={Your Conference/Journal}, |
|
year={2025} |
|
} |
|
``` |
|
|
|
For more details, visit [Project Repository](https://github.com/your-repo) or contact **[email protected]**. |
|
|
|
|