---
license: cc-by-nc-4.0
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---

# CoALM-8B: Conversational Agentic Language Model

[![Made with Oumi](https://badgen.net/badge/Made%20with/Oumi/%23085CFF?icon=https%3A%2F%2Foumi.ai%2Flogo_dark.svg)](https://github.com/oumi-ai/oumi)

## Model Description

**CoALM-8B** is the smallest open-source model in the **CoALM** (Conversational Agentic Language Model) series, designed to integrate both **Task-Oriented Dialogue (TOD) capabilities** and **Language Agent (LA) functionalities** into a unified system. Fine-tuned on **CoALM-IT**, a novel dataset that interleaves multi-turn ReAct-based reasoning with complex API usage, CoALM-8B achieves promising results on TOD and function-calling benchmarks.

CoALM-8B is trained on a **multi-task dataset** covering dialogue state tracking, function calling, and multi-turn reasoning. The model outperforms top domain-specific models on key evaluation benchmarks: **MultiWOZ 2.4 (TOD), BFCL V3 (LA), and API-Bank (LA).**

## Model Sources

- πŸ“ **Paper:** https://arxiv.org/abs/2502.08820
- 🌐 **Project Page:** https://emrecanacikgoz.github.io/CoALM/
- πŸ’» **Repository:** https://github.com/oumi-ai/oumi/tree/main/configs/projects/calm
- πŸ’Ž **Dataset:** https://huggingface.co/datasets/uiuc-convai/CoALM-IT

---

## Model Details

- **Model Name:** CoALM-8B
- **Developed by:** Collaboration of the UIUC Conversational AI LAB and Oumi
- **License:** cc-by-nc-4.0
- **Architecture:** Fine-tuned **Llama 3.1 8B Instruct**
- **Training Data:** CoALM-IT dataset
- **Fine-tuning Framework:** [Oumi](https://github.com/oumi-ai/oumi)
- **Training Hardware:** 8 NVIDIA H100 GPUs
- **Training Duration:** ~8 hours
- **Evaluation Benchmarks:** MultiWOZ 2.4, BFCL V3, API-Bank
- **Release Date:** February 5, 2025

---

## Capabilities and Features

### πŸ—£ Conversational Agentic Abilities

- **Multi-turn Dialogue Mastery:** Maintains coherent conversations across multiple turns with accurate state tracking.
- **Function Calling and API Integration:** Dynamically selects and calls APIs for task execution.
- **ReAct-based Reasoning:** Utilizes a structured reasoning process (User-Thought-Action-Observation-Thought-Response).
- **Zero-Shot Generalization:** Excels at previously unseen function-calling tasks.

### πŸš€ Benchmark Performance

- **MultiWOZ 2.4 (TOD):** Excels at dialogue state tracking and task completion.
- **BFCL V3 (LA):** Demonstrates superior function-calling abilities over language agents.
- **API-Bank (LA):** Accurately generates API calls and integrates responses into the conversation flow.

---

## Training Process

### πŸ”§ Fine-tuning Stages

1. **TOD Fine-tuning:** Optimized for dialogue state tracking (e.g., augmented SNIPS reformatted in Alpaca-style instruction tuning).
2. **Function-Calling Fine-tuning:** Trained to select and generate well-formed API calls from LA datasets.
3. **ReAct-based Fine-tuning:** Addresses multi-turn conversations with API integration using a structured reasoning framework, as sketched below.
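To make the ReAct-style training format concrete, here is a minimal sketch of what a single interleaved turn could look like. The field names and the `book_restaurant` API are illustrative assumptions for this card, not the exact CoALM-IT schema; see the dataset repository for the real format.

```python
# Hypothetical ReAct-style sample following the
# User -> Thought -> Action -> Observation -> Thought -> Response loop.
# Keys and the book_restaurant API are illustrative, not the CoALM-IT schema.
react_turn = {
    "user": "Book me a table for two at an Italian place tonight.",
    "thought": "I should call a booking API with cuisine=italian, party_size=2.",
    "action": 'book_restaurant(cuisine="italian", party_size=2, time="19:00")',
    "observation": '{"status": "confirmed", "name": "Trattoria Roma"}',
    "final_thought": "The booking succeeded; confirm the details to the user.",
    "response": "Done! You have a table for two at Trattoria Roma at 7pm tonight.",
}
```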
### πŸ” Training Hyperparameters - **Base Model:** Llama 3.1 8B Instruct - **LoRA Config:** Rank = 16, Scaling Factor = 32 - **Batch Size:** 8 - **Learning Rate:** 1e-4 - **Optimizer:** AdamW (betas = 0.9, 0.999, epsilon = 1e-8) - **Precision:** Mixed precision (bfloat16) - **Warm-up Steps:** 0.1 ratio of total steps - **Gradient Accumulation Steps:** 1 --- ## πŸ’‘ CoALM-IT Dataset CALM-IT Dataset Statistics --- ## πŸ“Š Benchmark Performance CALM-IT Dataset Statistics --- ## Usage ### πŸ— How to Load the Model using Transformers ```python from transformers import AutoModelForCausalLM, AutoTokenizer tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CoALM-8B") model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CoALM-8B") ``` ### πŸ›  Example Oumi Inference ```bash pip install oumi # See oumi_infer.yaml in this model's /oumi/ directory. oumi infer -i -c ./oumi_infer.yaml ``` ### πŸ›  Example Oumi Fine-Tuning ```bash pip install oumi # See oumi_train.yaml in this model's /oumi/ directory. oumi train -c ./oumi_train.yaml ``` --- - **Task-Specific Calibration:** While CoALM-8B generalizes well across tasks, performance can improve with domain-specific fine-tuning. - **Scalability to Larger Models:** Future iterations (CoALM-70B, CoALM-405B) extend capabilities to larger-scale agentic conversations. - **Open-Source Expansion:** All datasets, training scripts, and model checkpoints are publicly available to foster further research. ## Acknowledgements We'd like to thank the [Oumi AI Team](https://github.com/oumi-ai/oumi) for collaborating on training the models using the Oumi platform on [Together AI's](https://www.together.ai/) cloud. ## License This model is licensed under [Creative Commons NonCommercial (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/legalcode). --- ## Citation If you use **CoALM-8B** in your research, please cite: ``` @misc{acikgoz2025singlemodelmastermultiturn, title={Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model}, author={Emre Can Acikgoz and Jeremiah Greer and Akul Datta and Ze Yang and William Zeng and Oussama Elachqar and Emmanouil Koukoumidis and Dilek Hakkani-TΓΌr and Gokhan Tur}, year={2025}, eprint={2502.08820}, archivePrefix={arXiv}, primaryClass={cs.AI}, url={https://arxiv.org/abs/2502.08820}, } ``` For more details, visit [Project Repository](https://github.com/oumi-ai/oumi/tree/main/configs/projects/coalm) or contact **acikgoz2@illinois.edu**.