pipeline_tag: text-generation
---

# CoALM-405B: The Largest Open-Source Agentic LLM

[Oumi](https://github.com/oumi-ai/oumi)

## Model Overview

**CoALM-405B** is the **largest fully open-source Conversational Agentic Language Model**. This model sets a new standard in **Conversational AI**, seamlessly integrating both **Task-Oriented Dialogue (TOD) capabilities** and **Language Agent (LA) functionalities**.
It is designed to **push the boundaries** of open-source agentic LLMs, excelling at **multi-turn dialogue, tool usage, reasoning, and API execution**. It is the **best-performing fully open-source LLM** on the **Berkeley Function Calling Leaderboard V3 (BFCL V3)**, marking a leap in open-source AI research.

## Model Sources

<!-- Provide the basic links for the model. -->

- **Paper:** https://arxiv.org/abs/2502.08820
- **Project Page:** https://emrecanacikgoz.github.io/CoALM/
- **Repository:** https://github.com/oumi-ai/oumi/tree/main/configs/projects/CALM
- **Dataset:** https://huggingface.co/datasets/uiuc-convai/CoALM-IT

---
## Model Details

- **Model Name:** CoALM-405B
- **Developed by:** A collaboration of the UIUC Conversational AI Lab and Oumi
- **License:** cc-by-nc-4.0
- **Architecture:** Meta-Llama 3.1-405B Instruct
- **Training Data:** CoALM-IT
- **Fine-tuning Framework:** [Oumi](https://github.com/oumi-ai/oumi)
- **Training Hardware:** 8 NVIDIA H100 GPUs
- **Training Duration:** ~6.5 days
- **Release Date:** February 5, 2025

---
## Why CoALM-405B is a Game-Changer

- **Largest Open-Source Agentic LLM:** A **405B**-parameter model that brings state-of-the-art agentic capabilities to the public domain.
- **Best Open-Source Performance on BFCL V3:** Outperforms leading proprietary models like **GPT-4o, Gemini, and Claude** in function-calling tasks.
- **Fully Open-Source & Reproducible:** Released under **cc-by-nc-4.0**, including model weights, training logs, and datasets.

## CoALM-IT Dataset

<img src="table.png" alt="CoALM-IT Dataset Statistics" width="800"/>

---

## How to Use CoALM-405B

Inference requires 16 NVIDIA H100 GPUs.

### How to Load the Model with Hugging Face
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CoALM-405B")
# device_map="auto" (via accelerate) shards the very large checkpoint
# across all visible GPUs instead of loading it into CPU memory.
model = AutoModelForCausalLM.from_pretrained(
    "uiuc-convai/CoALM-405B",
    torch_dtype="auto",
    device_map="auto",
)
```
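For a quick smoke test after loading, a minimal generation sketch might look like the following; this is not from the original card, and the prompt and decoding settings are illustrative assumptions:

```python
# Hypothetical usage sketch: one user turn through the Llama 3.1 chat template.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```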

### Example Oumi Inference
Oumi multi-node inference support is under development.
CoALM-405B will likely require multi-node inference, since most single nodes top out at 640 GB of GPU VRAM.
To run multi-node inference, we recommend [vLLM](https://docs.vllm.ai/en/latest/serving/distributed_serving.html).
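As a hedged starting point, the sketch below uses vLLM's Ray-based distributed backend; the parallelism sizes assume two 8xH100 nodes and should be adapted to your cluster, and a Ray cluster spanning the nodes must already be running (see the vLLM docs linked above):

```python
# Hedged sketch (not from the model card): distributed inference with vLLM.
# Assumes a Ray cluster already spans two 8xH100 nodes (16 GPUs total).
from vllm import LLM, SamplingParams

llm = LLM(
    model="uiuc-convai/CoALM-405B",
    tensor_parallel_size=8,    # shard each layer across a node's 8 GPUs
    pipeline_parallel_size=2,  # split the layer stack across the 2 nodes
    distributed_executor_backend="ray",
)
params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(["Which API would you call to check the weather?"], params)
print(outputs[0].outputs[0].text)
```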

### Example Oumi Fine-Tuning

## License
This model is licensed under [Creative Commons NonCommercial (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/).

---
## Citation
If you use **CoALM-405B** in your research, please cite:
```
@misc{acikgoz2025singlemodelmastermultiturn,
      title={Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model},
      author={Emre Can Acikgoz and Jeremiah Greer and Akul Datta and Ze Yang and William Zeng and Oussama Elachqar and Emmanouil Koukoumidis and Dilek Hakkani-Tür and Gokhan Tur},
      year={2025},
      eprint={2502.08820},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.08820},
}
```

For more details, visit the [project repository](https://github.com/oumi-ai/oumi/tree/main/configs/projects/CALM) or contact **[email protected]**.