---
extra_gated_heading: >-
  Acknowledge to follow corresponding license to access the repository
extra_gated_button_content: Agree and access repository
extra_gated_fields:
  First Name: text
  Last Name: text
  Country: country
  Affiliation: text
license: cc-by-nc-4.0
---

xLAM

[AgentOhana Paper] | [Github] | [Discord] | [Homepage] | [Community Demo]


License: cc-by-nc-4.0

If you already know [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), xLAM-v0.1 is a significant upgrade and performs better on many agent tasks. With the same number of parameters, the model has been fine-tuned across a wide range of agent tasks and scenarios while preserving the capabilities of the original model.

xLAM-v0.1-r is version 0.1 of the Large Action Model series; the "-r" suffix indicates that it is tagged for research. The model is compatible with vLLM and FastChat.

| Model | # Total Params | Context Length | Release Date | Category | Download Model | Download GGUF files |
|------------------------|----------------|----------------|----|----|----------------|----------|
| xLAM-7b-r | 7.24B | 32k | Sep. 5, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-r) | -- |
| xLAM-8x7b-r | 46.7B | 32k | Sep. 5, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-8x7b-r) | -- |
| xLAM-8x22b-r | 141B | 64k | Sep. 5, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-8x22b-r) | -- |
| xLAM-1b-fc-r | 1.35B | 16k | July 17, 2024 | Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-1b-fc-r-gguf) |
| xLAM-7b-fc-r | 6.91B | 4k | July 17, 2024 | Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r) | [🤗 Link](https://huggingface.co/Salesforce/xLAM-7b-fc-r-gguf) |
| xLAM-v0.1-r | 46.7B | 32k | Mar. 18, 2024 | General, Function-calling | [🤗 Link](https://huggingface.co/Salesforce/xLAM-v0.1-r) | -- |

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xLAM-v0.1-r")
# device_map="auto" places the weights on the available devices
model = AutoModelForCausalLM.from_pretrained("Salesforce/xLAM-v0.1-r", device_map="auto")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Apply the chat template and move the token IDs to the model's device
# (instead of hard-coding "cuda", which may not match device_map="auto")
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

You may need to tune the Temperature setting for different applications. Typically, a lower Temperature helps with tasks that require deterministic outcomes. For tasks that demand adherence to specific formats or function calls, we also advise including explicit formatting instructions in the prompt.

# Ethical Considerations

This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people's lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.

# Benchmarks

## [BOLAA](https://github.com/salesforce/BOLAA)

### Webshop
| LLM Name | ZS | ZST | ReAct | PlanAct | PlanReAct | BOLAA |
|---|---|---|---|---|---|---|
| Llama-2-70B-chat | 0.0089 | 0.0102 | 0.4273 | 0.2809 | 0.3966 | 0.4986 |
| Vicuna-33B | 0.1527 | 0.2122 | 0.1971 | 0.3766 | 0.4032 | 0.5618 |
| Mixtral-8x7B-Instruct-v0.1 | 0.4634 | 0.4592 | 0.5638 | 0.4738 | 0.3339 | 0.5342 |
| GPT-3.5-Turbo | 0.4851 | 0.5058 | 0.5047 | 0.4930 | 0.5436 | 0.6354 |
| GPT-3.5-Turbo-Instruct | 0.3785 | 0.4195 | 0.4377 | 0.3604 | 0.4851 | 0.5811 |
| GPT-4-0613 | 0.5002 | 0.4783 | 0.4616 | 0.7950 | 0.4635 | 0.6129 |
| xLAM-v0.1-r | 0.5201 | 0.5268 | 0.6486 | 0.6573 | 0.6611 | 0.6556 |

### HotpotQA
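| LLM Name | ZS | ZST | ReAct | PlanAct | PlanReAct |
|---|---|---|---|---|---|
| Mixtral-8x7B-Instruct-v0.1 | 0.3912 | 0.3971 | 0.3714 | 0.3195 | 0.3039 |
| GPT-3.5-Turbo | 0.4196 | 0.3937 | 0.3868 | 0.4182 | 0.3960 |
| GPT-4-0613 | 0.5801 | 0.5709 | 0.6129 | 0.5778 | 0.5716 |
| xLAM-v0.1-r | 0.5492 | 0.4776 | 0.5020 | 0.5583 | 0.5030 |
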
## [AgentLite](https://github.com/SalesforceAIResearch/AgentLite/tree/main)

**Please note:** All prompts provided by AgentLite are considered "unseen prompts" for xLAM-v0.1-r, meaning the model has not been trained with data related to these prompts.

#### Webshop
| LLM Name | Act | ReAct | BOLAA |
|---|---|---|---|
| GPT-3.5-Turbo-16k | 0.6158 | 0.6005 | 0.6652 |
| GPT-4-0613 | 0.6989 | 0.6732 | 0.7154 |
| xLAM-v0.1-r | 0.6563 | 0.6640 | 0.6854 |

#### HotpotQA
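| LLM Name | Easy F1 Score | Easy Accuracy | Medium F1 Score | Medium Accuracy | Hard F1 Score | Hard Accuracy |
|---|---|---|---|---|---|---|
| GPT-3.5-Turbo-16k-0613 | 0.410 | 0.350 | 0.330 | 0.25 | 0.283 | 0.20 |
| GPT-4-0613 | 0.611 | 0.47 | 0.610 | 0.480 | 0.527 | 0.38 |
| xLAM-v0.1-r | 0.532 | 0.45 | 0.547 | 0.46 | 0.455 | 0.36 |
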
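
This card does not include the scoring script behind the HotpotQA F1 Score and Accuracy columns. As a rough illustration only, the sketch below assumes the token-overlap F1 commonly used for HotpotQA-style question answering, and treats accuracy as exact match after normalization. The function names `normalize_answer`, `token_f1`, and `exact_match` are hypothetical helpers and are not part of AgentLite.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, then split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall (assumed metric)."""
    pred_tokens = normalize_answer(prediction)
    ref_tokens = normalize_answer(reference)
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer does not.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def exact_match(prediction: str, reference: str) -> bool:
    """Assumed stand-in for the Accuracy column: normalized answers are identical."""
    return normalize_answer(prediction) == normalize_answer(reference)


if __name__ == "__main__":
    print(round(token_f1("Paris, France", "Paris"), 3))  # 0.667
    print(exact_match("The Eiffel Tower.", "eiffel tower"))  # True
```

Under these assumptions, a partially correct answer such as "Paris, France" still earns F1 credit while failing exact match, which would explain why the F1 Score columns sit above the Accuracy columns for every model.
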
## ToolBench
| LLM Name | Unseen Insts & Same Set | Unseen Tools & Seen Cat | Unseen Tools & Unseen Cat |
|---|---|---|---|
| TooLlama V2 | 0.4385 | 0.4300 | 0.4350 |
| GPT-3.5-Turbo-0125 | 0.5000 | 0.5150 | 0.4900 |
| GPT-4-0125-preview | 0.5462 | 0.5450 | 0.5050 |
| xLAM-v0.1-r | 0.5077 | 0.5650 | 0.5200 |

## [MINT-BENCH](https://github.com/xingyaoww/mint-bench)
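| LLM Name | 1-step | 2-step | 3-step | 4-step | 5-step |
|---|---|---|---|---|---|
| GPT-4-0613 | - | - | - | - | 69.45 |
| Claude-Instant-1 | 12.12 | 32.25 | 39.25 | 44.37 | 45.90 |
| xLAM-v0.1-r | 4.10 | 28.50 | 36.01 | 42.66 | 43.96 |
| Claude-2 | 26.45 | 35.49 | 36.01 | 39.76 | 39.93 |
| Lemur-70b-Chat-v1 | 3.75 | 26.96 | 35.67 | 37.54 | 37.03 |
| GPT-3.5-Turbo-0613 | 2.73 | 16.89 | 24.06 | 31.74 | 36.18 |
| AgentLM-70b | 6.48 | 17.75 | 24.91 | 28.16 | 28.67 |
| CodeLlama-34b | 0.17 | 16.21 | 23.04 | 25.94 | 28.16 |
| Llama-2-70b-chat | 4.27 | 14.33 | 15.70 | 16.55 | 17.92 |
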
## [Tool-Query](https://github.com/hkust-nlp/AgentBoard)
| LLM Name | Success Rate | Progress Rate |
|---|---|---|
| xLAM-v0.1-r | 0.533 | 0.766 |
| DeepSeek-67B | 0.400 | 0.714 |
| GPT-3.5-Turbo-0613 | 0.367 | 0.627 |
| GPT-3.5-Turbo-16k | 0.317 | 0.591 |
| Lemur-70B | 0.283 | 0.720 |
| CodeLlama-13B | 0.250 | 0.525 |
| CodeLlama-34B | 0.133 | 0.600 |
| Mistral-7B | 0.033 | 0.510 |
| Vicuna-13B-16K | 0.033 | 0.343 |
| Llama-2-70B | 0.000 | 0.483 |