Update README.md
Browse files
README.md
CHANGED
@@ -5,7 +5,7 @@ datasets:
|
|
5 |
---
|
6 |
A planner LLM [fine-tuned on synthetic trajectories](https://krasserm.github.io/2024/05/31/planner-fine-tuning/) from an agent simulation. It can be used in [ReAct](https://arxiv.org/abs/2210.03629)-style LLM agents where [planning is separated from function calling](https://krasserm.github.io/2024/03/06/modular-agent/). Trajectory generation and planner fine-tuning are described in the [bot-with-plan](https://github.com/krasserm/bot-with-plan) project.
|
7 |
|
8 |
-
The planner has been fine-tuned on the [krasserm/gba-trajectories](https://huggingface.co/datasets/krasserm/gba-trajectories) dataset. The original QLoRA
|
9 |
|
10 |
## Server setup
|
11 |
|
|
|
5 |
---
|
6 |
A planner LLM [fine-tuned on synthetic trajectories](https://krasserm.github.io/2024/05/31/planner-fine-tuning/) from an agent simulation. It can be used in [ReAct](https://arxiv.org/abs/2210.03629)-style LLM agents where [planning is separated from function calling](https://krasserm.github.io/2024/03/06/modular-agent/). Trajectory generation and planner fine-tuning are described in the [bot-with-plan](https://github.com/krasserm/bot-with-plan) project.
|
7 |
|
8 |
+
The planner has been fine-tuned on the [krasserm/gba-trajectories](https://huggingface.co/datasets/krasserm/gba-trajectories) dataset. The original QLoRA model is available at [krasserm/gba-planner-7B-v0.1](https://huggingface.co/krasserm/gba-planner-7B-v0.1).
|
9 |
|
10 |
## Server setup
|
11 |
|