Add Debugging Guide (#1089)
* add debug guide
* add background
* add .gitignore
* Update devtools/dev_sharegpt.yml
Co-authored-by: Wing Lian <[email protected]>
* Update docs/debugging.md
Co-authored-by: Wing Lian <[email protected]>
* simplify example axolotl config
* add additional comments
* add video and TOC
* try jsonc for better md rendering
* style video thumbnail better
* fix footnote
---------
Co-authored-by: Wing Lian <[email protected]>
- .gitignore +2 -0
- .vscode/README.md +1 -0
- .vscode/launch.json +34 -0
- .vscode/tasks.json +27 -0
- README.md +6 -1
- devtools/README.md +1 -0
- devtools/dev_sharegpt.yml +49 -0
- docs/debugging.md +165 -0
.gitignore CHANGED

```diff
@@ -1,5 +1,7 @@
 **/axolotl.egg-info
 configs
+last_run_prepared/
+.vscode
 
 # Byte-compiled / optimized / DLL files
 __pycache__/
```
.vscode/README.md ADDED

```markdown
See [docs/debugging.md](../docs/debugging.md) for guidance on how to modify these files to debug axolotl with VSCode.
```
.vscode/launch.json ADDED

```jsonc
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug axolotl prompt - sharegpt",
            "type": "python",
            "module": "accelerate.commands.launch",
            "request": "launch",
            "args": [
                "-m", "axolotl.cli.train", "dev_sharegpt.yml",
                // The flags below simplify debugging by overriding the axolotl config
                // with the debugging tips above. Modify as needed.
                "--dataset_processes=1", // limits data preprocessing to one process
                "--max_steps=1", // limits training to just one step
                "--batch_size=1", // minimizes batch size
                "--micro_batch_size=1", // minimizes batch size
                "--val_set_size=0", // disables validation
                "--sample_packing=False", // disables sample packing, which is necessary for small datasets
                "--eval_sample_packing=False", // disables sample packing on the eval set
                "--dataset_prepared_path=temp_debug/axolotl_outputs/data", // send data outputs to a temp folder
                "--output_dir=temp_debug/axolotl_outputs/model" // send model outputs to a temp folder
            ],
            "console": "integratedTerminal", // show output in the integrated terminal
            "cwd": "${workspaceFolder}/devtools", // set working directory to devtools from the root of the project
            "justMyCode": true, // step through only axolotl code
            "env": {"CUDA_VISIBLE_DEVICES": "0", // since we aren't doing distributed training, we need to limit to one GPU
                    "HF_HOME": "${workspaceFolder}/devtools/temp_debug/.hf-cache"}, // send the HF cache to a temp folder
            "preLaunchTask": "cleanup-for-dataprep" // delete temp folders (see below)
        }
    ]
}
```
.vscode/tasks.json ADDED

```jsonc
// this file is used by launch.json
{
    "version": "2.0.0",
    "tasks": [
        // this task changes into the devtools directory and deletes the temp_debug/axolotl_outputs folder
        {
            "label": "delete-outputs",
            "type": "shell",
            "command": "rm -rf temp_debug/axolotl_outputs",
            "options": { "cwd": "${workspaceFolder}/devtools" },
            "problemMatcher": []
        },
        // this task changes into the devtools directory and deletes the `temp_debug/.hf-cache/datasets` folder
        {
            "label": "delete-temp-hf-dataset-cache",
            "type": "shell",
            "command": "rm -rf temp_debug/.hf-cache/datasets",
            "options": { "cwd": "${workspaceFolder}/devtools" },
            "problemMatcher": []
        },
        // this task combines the two tasks above
        {
            "label": "cleanup-for-dataprep",
            "dependsOn": ["delete-outputs", "delete-temp-hf-dataset-cache"]
        }
    ]
}
```
README.md CHANGED

```diff
@@ -39,6 +39,7 @@ Features:
 - [Special Tokens](#special-tokens)
 - [Common Errors](#common-errors-)
   - [Tokenization Mismatch b/w Training & Inference](#tokenization-mismatch-bw-inference--training)
+- [Debugging Axolotl](#debugging-axolotl)
 - [Need Help?](#need-help-)
 - [Badge](#badge-)
 - [Community Showcase](#community-showcase)
@@ -1066,7 +1067,7 @@ although this will be very slow, and using the config options above are recommen
 
 ## Common Errors 🧰
 
-See also the [FAQ's](./docs/faq.md).
+See also the [FAQ's](./docs/faq.md) and [debugging guide](docs/debugging.md).
 
 > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:
 
@@ -1116,6 +1117,10 @@ If you decode a prompt constructed by axolotl, you might see spaces between toke
 
 Having misalignment between your prompts during training and inference can cause models to perform very poorly, so it is worth checking this. See [this blog post](https://hamel.dev/notes/llm/05_tokenizer_gotchas.html) for a concrete example.
 
+## Debugging Axolotl
+
+See [this debugging guide](docs/debugging.md) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
+
 ## Need help? 🙋♂️
 
 Join our [Discord server](https://discord.gg/HhrNrHJPRb) where we can help you
```
devtools/README.md ADDED

```markdown
This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information.
```
devtools/dev_sharegpt.yml ADDED

```yaml
# Example config for debugging the sharegpt prompt format
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true

load_in_8bit: true
load_in_4bit: false

datasets:
  - path: philschmid/guanaco-sharegpt-style
    type: sharegpt
    shards: 10
val_set_size: 0
output_dir: temp_debug/axolotl_outputs/model
dataset_prepared_path: temp_debug/axolotl_outputs/data
dataset_processes: 1

sequence_len: 4096
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

micro_batch_size: 1
num_epochs: 1
max_steps: 10
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
logging_steps: 1
flash_attention: true

warmup_steps: 10
weight_decay: 0.0
```
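This config is the one the debugging guide below launches. It can also be exercised directly from the shell with the same command the guide's VSCode configuration mimics:

```bash
cd devtools && CUDA_VISIBLE_DEVICES=0 accelerate launch -m axolotl.cli.train dev_sharegpt.yml
```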
docs/debugging.md ADDED

# Debugging Axolotl

This document provides some tips and tricks for debugging Axolotl. It also provides an example configuration for debugging with VSCode. A good debugging setup is essential to understanding how Axolotl code works behind the scenes.

## Table of Contents

- [General Tips](#general-tips)
- [Debugging with VSCode](#debugging-with-vscode)
  - [Background](#background)
  - [Configuration](#configuration)
  - [Customizing your debugger](#customizing-your-debugger)
  - [Video Tutorial](#video-tutorial)

## General Tips

While debugging, it's helpful to simplify your test scenario as much as possible. Here are some tips for doing so (a combined command-line sketch follows the list):

> [!Important]
> All of these tips are incorporated into the [example configuration](#configuration) for debugging with VSCode below.

1. **Eliminate Concurrency**: Restrict the number of processes to 1 for both training and data preprocessing:
    - Set `CUDA_VISIBLE_DEVICES` to a single GPU, e.g. `export CUDA_VISIBLE_DEVICES=0`.
    - Set `dataset_processes: 1` in your axolotl config, or run the training command with `--dataset_processes=1`.
2. **Use a small dataset**: Construct or use a small dataset from the HF Hub. When using a small dataset, you will often have to set `sample_packing: False` and `eval_sample_packing: False` to avoid errors. If you are in a pinch and don't have time to construct a small dataset but want to use one from the HF Hub, you can shard the data. Sharding still tokenizes the entire dataset, but only uses a fraction of it for training. For example, to shard the dataset into 20 pieces, add the following to your axolotl config:
    ```yaml
    datasets:
      ...
      shards: 20
    ```
3. **Use a small model**: A good example of a small model is [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0).
4. **Minimize iteration time**: Make sure the training loop finishes as fast as possible, with these settings:
    - `micro_batch_size: 1`
    - `max_steps: 1`
    - `val_set_size: 0`
5. **Clear Caches**: Axolotl caches certain steps, and so does the underlying HuggingFace trainer. You may want to clear some of these caches when debugging:
    - Data preprocessing: When debugging data preprocessing, which includes prompt template formation, you may want to delete the directory set in `dataset_prepared_path:` in your axolotl config. If you didn't set this value, the default is `last_run_prepared`.
    - HF Hub: If you are debugging data preprocessing, you should also clear the relevant [HuggingFace cache](https://huggingface.co/docs/datasets/cache) by deleting the appropriate `~/.cache/huggingface/datasets/...` folder(s).
    - **The recommended approach is to redirect all outputs and caches to a temporary folder and delete selected subfolders before each run. This is demonstrated in the example configuration below.**
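Taken together, the tips above reduce a debugging run to one process, one GPU, and one training step, starting from a clean slate. Here is a minimal command-line sketch, assuming the `dev_sharegpt.yml` config and the `temp_debug` folder layout used throughout this guide, run from the `devtools` folder:

```bash
# Start clean: remove prior outputs and the temporary HF dataset cache (tip 5)
rm -rf temp_debug/axolotl_outputs temp_debug/.hf-cache/datasets

export CUDA_VISIBLE_DEVICES=0               # tip 1: a single GPU
export HF_HOME="$PWD/temp_debug/.hf-cache"  # keep the HF cache disposable

# Tips 1, 2, and 4 applied as CLI overrides of the axolotl config
accelerate launch -m axolotl.cli.train dev_sharegpt.yml \
    --dataset_processes=1 \
    --max_steps=1 --micro_batch_size=1 --val_set_size=0 \
    --sample_packing=False --eval_sample_packing=False \
    --dataset_prepared_path=temp_debug/axolotl_outputs/data \
    --output_dir=temp_debug/axolotl_outputs/model
```

The VSCode configuration in the next section automates exactly these steps.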
## Debugging with VSCode

### Background

The below example shows how to configure VSCode to debug data preprocessing of the `sharegpt` format. This is the format used when you have the following in your axolotl config:

```yaml
datasets:
  - path: <path to your sharegpt formatted dataset> # example on HF Hub: philschmid/guanaco-sharegpt-style
    type: sharegpt
```
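For orientation, each record in a sharegpt-formatted dataset is a list of conversation turns. A record looks roughly like the following (a sketch following the common sharegpt convention; real datasets may carry additional fields):

```jsonc
// Rough shape of a single sharegpt-style record (illustrative values)
{
  "conversations": [
    {"from": "human", "value": "What is an axolotl?"},
    {"from": "gpt", "value": "A salamander species native to Mexico."}
  ]
}
```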
> [!Important]
> If you are already familiar with advanced VSCode debugging, you can skip the below explanation and look at the files [.vscode/launch.json](../.vscode/launch.json) and [.vscode/tasks.json](../.vscode/tasks.json) for an example configuration.

> [!Tip]
> If you prefer to watch a video rather than read, you can skip to the [video tutorial](#video-tutorial) below (but doing both is recommended).

### Configuration

The easiest way to get started is to modify the [.vscode/launch.json](../.vscode/launch.json) file in this project. This is just an example configuration, so you may need to modify or copy it to suit your needs.

For example, to mimic the command `cd devtools && CUDA_VISIBLE_DEVICES=0 accelerate launch -m axolotl.cli.train dev_sharegpt.yml`, you would use the below configuration[^1]. Note that we add additional flags that override the axolotl config and incorporate the tips above (see the comments). We also set the working directory to `devtools` and set the `env` variable `HF_HOME` to a temporary folder that is later partially deleted, because in this example we want to delete the HF dataset cache before each run.

```jsonc
// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug axolotl prompt - sharegpt",
            "type": "python",
            "module": "accelerate.commands.launch",
            "request": "launch",
            "args": [
                "-m", "axolotl.cli.train", "dev_sharegpt.yml",
                // The flags below simplify debugging by overriding the axolotl config
                // with the debugging tips above. Modify as needed.
                "--dataset_processes=1", // limits data preprocessing to one process
                "--max_steps=1", // limits training to just one step
                "--batch_size=1", // minimizes batch size
                "--micro_batch_size=1", // minimizes batch size
                "--val_set_size=0", // disables validation
                "--sample_packing=False", // disables sample packing, which is necessary for small datasets
                "--eval_sample_packing=False", // disables sample packing on the eval set
                "--dataset_prepared_path=temp_debug/axolotl_outputs/data", // send data outputs to a temp folder
                "--output_dir=temp_debug/axolotl_outputs/model" // send model outputs to a temp folder
            ],
            "console": "integratedTerminal", // show output in the integrated terminal
            "cwd": "${workspaceFolder}/devtools", // set working directory to devtools from the root of the project
            "justMyCode": true, // step through only axolotl code
            "env": {"CUDA_VISIBLE_DEVICES": "0", // since we aren't doing distributed training, we need to limit to one GPU
                    "HF_HOME": "${workspaceFolder}/devtools/temp_debug/.hf-cache"}, // send the HF cache to a temp folder
            "preLaunchTask": "cleanup-for-dataprep" // delete temp folders (see below)
        }
    ]
}
```

**Additional notes about this configuration:**

- The argument `justMyCode` is set to `true` so that you step through only the axolotl code. If you want to step into dependencies, set this to `false`.
- The `preLaunchTask` `cleanup-for-dataprep` is defined in [.vscode/tasks.json](../.vscode/tasks.json) and deletes the following folders before debugging, which is essential to ensure that the data pre-processing code runs from scratch:
    - `./devtools/temp_debug/axolotl_outputs`
    - `./devtools/temp_debug/.hf-cache/datasets`

> [!Tip]
> You may not want to delete these folders. For example, if you are debugging model training instead of data pre-processing, you may NOT want to delete the cache or output folders. You may also need to add additional tasks to the `tasks.json` file depending on your use case.

Below is the [.vscode/tasks.json](../.vscode/tasks.json) file that defines the `cleanup-for-dataprep` task. This task runs before each debugging session when you use the above configuration. Note that two tasks delete the two folders mentioned above; the third task, `cleanup-for-dataprep`, is a composite task that combines them. A composite task is necessary because VSCode does not allow you to specify multiple tasks in the `preLaunchTask` argument of the `launch.json` file.

```jsonc
// .vscode/tasks.json
// this file is used by launch.json
{
    "version": "2.0.0",
    "tasks": [
        // this task changes into the devtools directory and deletes the temp_debug/axolotl_outputs folder
        {
            "label": "delete-outputs",
            "type": "shell",
            "command": "rm -rf temp_debug/axolotl_outputs",
            "options": { "cwd": "${workspaceFolder}/devtools" },
            "problemMatcher": []
        },
        // this task changes into the devtools directory and deletes the `temp_debug/.hf-cache/datasets` folder
        {
            "label": "delete-temp-hf-dataset-cache",
            "type": "shell",
            "command": "rm -rf temp_debug/.hf-cache/datasets",
            "options": { "cwd": "${workspaceFolder}/devtools" },
            "problemMatcher": []
        },
        // this task combines the two tasks above
        {
            "label": "cleanup-for-dataprep",
            "dependsOn": ["delete-outputs", "delete-temp-hf-dataset-cache"]
        }
    ]
}
```
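In other words, the composite task does the same work as running both deletions by hand before launching the debugger:

```bash
cd devtools && rm -rf temp_debug/axolotl_outputs temp_debug/.hf-cache/datasets
```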
### Customizing your debugger

Your debugging use case may differ from the example above. The easiest thing to do is to put your own axolotl config in the `devtools` folder and modify the `launch.json` file to use your config. You may also want to modify the `preLaunchTask` to delete different folders, or to not delete anything at all. A sketch of such a customization follows.
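For instance, a customized entry might look like the sketch below. Here `my_config.yml` is a hypothetical config you would drop into `devtools/`; every other field mirrors the example configuration above:

```jsonc
// Sketch of a customized .vscode/launch.json entry.
// "my_config.yml" is a hypothetical file you would create in devtools/.
{
    "name": "Debug axolotl - my config",
    "type": "python",
    "module": "accelerate.commands.launch",
    "request": "launch",
    "args": ["-m", "axolotl.cli.train", "my_config.yml"],
    "console": "integratedTerminal",
    "cwd": "${workspaceFolder}/devtools",
    "justMyCode": false // also step into dependencies
    // omit "preLaunchTask" entirely if nothing should be deleted before the run
}
```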
### Video Tutorial

The following video tutorial walks through the above configuration and demonstrates how to debug with VSCode (click the image below to watch):

<div style="text-align: center; line-height: 0;">

<a href="https://youtu.be/xUUB11yeMmc?si=z6Ea1BrRYkq6wsMx" target="_blank"
title="How to debug Axolotl (for fine tuning LLMs)"><img
src="https://i.ytimg.com/vi/xUUB11yeMmc/maxresdefault.jpg"
style="border-radius: 10px; display: block; margin: auto;" width="560" height="315" /></a>

<figcaption style="font-size: smaller;"><a href="https://hamel.dev">Hamel Husain's</a> tutorial: <a href="https://www.youtube.com/watch?v=xUUB11yeMmc">Debugging Axolotl w/VSCode</a></figcaption>

</div>
<br>

[^1]: The config actually mimics the command `CUDA_VISIBLE_DEVICES=0 python -m accelerate.commands.launch -m axolotl.cli.train devtools/dev_sharegpt.yml`, but this is the same thing.