I recently worked on a LoRA that improves tool use in LLMs. Thought the approach might interest folks here.
The issue I have had when trying to use some of the local LLMs with coding agents is this:
Me: "Find all API endpoints with authentication in this codebase"
LLM: "You should look for @app .route decorators and check if they have auth middleware..."
But I want it to actually search the files and show me the results; the LLM just never triggers a tool call.
To fine-tune it for tool use, I combined two data sources:
1. Magpie scenarios - 5000+ diverse tasks (bug hunting, refactoring, security audits)
2. Real execution - Ran these on actual repos (FastAPI, Django, React) to get authentic tool responses
This ensures the model learns both breadth (many scenarios) and depth (real tool behavior).
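Roughly, the pipeline looks like this. The helper names below are made up for illustration and the bodies are stubs; the actual recipe is in the Colab notebook linked at the end:

```python
import json
import random

TASK_TYPES = ["bug hunting", "refactoring", "security audit"]
REPOS = ["fastapi", "django", "react"]

def generate_magpie_scenarios(n=5000):
    # Breadth: Magpie-style self-generated tasks. In the real recipe these
    # come from prompting a strong model; a stub keeps this sketch runnable.
    return [f"{random.choice(TASK_TYPES)} task #{i}" for i in range(n)]

def execute_on_repo(task, repo):
    # Depth: replay the task against a real checkout and capture genuine
    # tool outputs. Stubbed here; the notebook does the actual execution.
    return {"messages": [{"role": "user", "content": task}], "tools": []}

with open("tool_calling_sft.jsonl", "w") as f:
    for task in generate_magpie_scenarios():
        trace = execute_on_repo(task, random.choice(REPOS))
        if trace:  # keep only tasks whose tool calls actually executed
            f.write(json.dumps(trace) + "\n")
```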
Tools We Taught:
- `read_file` - Actually read file contents
- `search_files` - Regex/pattern search across codebases
- `find_definition` - Locate classes/functions
- `analyze_imports` - Dependency tracking
- `list_directory` - Explore structure
- `run_tests` - Execute test suites
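For context, tools like these are exposed to the model as standard JSON function schemas. Something like this (the parameter names are my guesses, not copied from the ellora repo):

```python
# Illustrative JSON function schemas for two of the tools.
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_files",
            "description": "Regex/pattern search across the codebase",
            "parameters": {
                "type": "object",
                "properties": {
                    "pattern": {"type": "string", "description": "Regex to search for"},
                    "path": {"type": "string", "description": "Directory to search under"},
                },
                "required": ["pattern"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the contents of a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Path of the file to read"},
                },
                "required": ["path"],
            },
        },
    },
]
```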
Improvements:
- Tool calling accuracy: 12% → 80%
- Correct parameters: 8% → 87%
- Multi-step tasks: 3% → 78%
- End-to-end completion: 5% → 80%
- Tools per task: 0.2 → 3.8
The LoRA really improves intentional tool calling. As an example, consider the query: "Find ValueError in payment module"
The response proceeds as follows:
1. Calls `search_files` with pattern "ValueError"
2. Gets 4 matches across 3 files
3. Calls `read_file` on each match
4. Analyzes context
5. Reports: "Found 3 ValueError instances: payment/processor.py:47 for invalid amount, payment/validator.py:23 for unsupported currency..."
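Under the hood this is an ordinary tool-dispatch loop. A minimal sketch of what the harness around those calls might look like (assumed shape; the notebook's actual harness may differ):

```python
import pathlib
import re

def search_files(pattern, path="."):
    # Walk the tree and return "file:line: text" hits for the regex.
    hits = []
    for p in pathlib.Path(path).rglob("*.py"):
        for i, line in enumerate(p.read_text(errors="ignore").splitlines(), 1):
            if re.search(pattern, line):
                hits.append(f"{p}:{i}: {line.strip()}")
    return "\n".join(hits) or "no matches"

def read_file(path):
    return pathlib.Path(path).read_text(errors="ignore")

TOOLS = {"search_files": search_files, "read_file": read_file}

def run_tool_call(call):
    # `call` is one parsed tool call emitted by the model, e.g.
    # {"name": "search_files", "arguments": {"pattern": "ValueError", "path": "payment"}}
    return TOOLS[call["name"]](**call["arguments"])

# Step 1 of the trace above:
print(run_tool_call({"name": "search_files",
                     "arguments": {"pattern": "ValueError", "path": "payment"}}))
```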
Resources:
- Colab notebook - https://colab.research.google.com/github/codelion/ellora/blob/main/Ellora_Recipe_3_Enhanced_Tool_Calling_and_Code_Understanding.ipynb
- Model - codelion/Llama-3.2-1B-Instruct-tool-calling-lora
- GitHub - https://github.com/codelion/ellora
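If you want to try it, the adapter loads with the usual transformers + peft combo (standard API; the base model id is assumed from the adapter name, so double-check it):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed base checkpoint
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "codelion/Llama-3.2-1B-Instruct-tool-calling-lora")
tokenizer = AutoTokenizer.from_pretrained(base_id)
```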