I recently worked on a LoRA that improves tool use in LLMs. Thought the approach might interest folks here.
The issue I have had when trying to use some of the local LLMs with coding agents is this:
Me: "Find all API endpoints with authentication in this codebase"
LLM: "You should look for @app .route decorators and check if they have auth middleware..."
But I want it to actually search the files and show me the results; the LLM just never triggers a tool call.
To fine-tune it for tool use, I combined two data sources:
1. Magpie scenarios - 5000+ diverse tasks (bug hunting, refactoring, security audits)
2. Real execution - Ran these on actual repos (FastAPI, Django, React) to get authentic tool responses
This ensures the model learns both breadth (many scenarios) and depth (real tool behavior).
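Roughly, the pipeline looks like this. The helper names below are made up for illustration and the bodies are stubs; the actual recipe is in the Colab notebook linked at the end:

```python
import json
import random

TASK_TYPES = ["bug hunting", "refactoring", "security audit"]
REPOS = ["fastapi", "django", "react"]

def generate_magpie_scenarios(n=5000):
    # Breadth: Magpie-style self-generated tasks. In the real recipe these
    # come from prompting a strong model; a stub keeps this sketch runnable.
    return [f"{random.choice(TASK_TYPES)} task #{i}" for i in range(n)]

def execute_on_repo(task, repo):
    # Depth: replay the task against a real checkout and capture genuine
    # tool outputs. Stubbed here; the notebook does the actual execution.
    return {"messages": [{"role": "user", "content": task}], "tools": []}

with open("tool_calling_sft.jsonl", "w") as f:
    for task in generate_magpie_scenarios():
        trace = execute_on_repo(task, random.choice(REPOS))
        if trace:  # keep only tasks whose tool calls actually executed
            f.write(json.dumps(trace) + "\n")
```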
Tools We Taught:
- `read_file` - Actually read file contents
- `search_files` - Regex/pattern search across codebases
- `find_definition` - Locate classes/functions
- `analyze_imports` - Dependency tracking
- `list_directory` - Explore structure
- `run_tests` - Execute test suites
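For context, tools like these are exposed to the model as standard JSON function schemas. Something like this (the parameter names are my guesses, not copied from the ellora repo):

```python
# Illustrative JSON function schemas for two of the tools.
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_files",
            "description": "Regex/pattern search across the codebase",
            "parameters": {
                "type": "object",
                "properties": {
                    "pattern": {"type": "string", "description": "Regex to search for"},
                    "path": {"type": "string", "description": "Directory to search under"},
                },
                "required": ["pattern"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the contents of a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Path of the file to read"},
                },
                "required": ["path"],
            },
        },
    },
]
```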
Improvements:
- Tool calling accuracy: 12% → 80%
- Correct parameters: 8% → 87%
- Multi-step tasks: 3% → 78%
- End-to-end completion: 5% → 80%
- Tools per task: 0.2 → 3.8
The LoRA really improves intentional tool calling. As an example, consider the query: "Find ValueError in payment module"
The response proceeds as follows:
1. Calls `search_files` with pattern "ValueError"
2. Gets 4 matches across 3 files
3. Calls `read_file` on each match
4. Analyzes context
5. Reports: "Found 3 ValueError instances: payment/processor.py:47 for invalid amount, payment/validator.py:23 for unsupported currency..."
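Under the hood this is an ordinary tool-dispatch loop. A minimal sketch of what the harness around those calls might look like (assumed shape; the notebook's actual harness may differ):

```python
import pathlib
import re

def search_files(pattern, path="."):
    # Walk the tree and return "file:line: text" hits for the regex.
    hits = []
    for p in pathlib.Path(path).rglob("*.py"):
        for i, line in enumerate(p.read_text(errors="ignore").splitlines(), 1):
            if re.search(pattern, line):
                hits.append(f"{p}:{i}: {line.strip()}")
    return "\n".join(hits) or "no matches"

def read_file(path):
    return pathlib.Path(path).read_text(errors="ignore")

TOOLS = {"search_files": search_files, "read_file": read_file}

def run_tool_call(call):
    # `call` is one parsed tool call emitted by the model, e.g.
    # {"name": "search_files", "arguments": {"pattern": "ValueError", "path": "payment"}}
    return TOOLS[call["name"]](**call["arguments"])

# Step 1 of the trace above:
print(run_tool_call({"name": "search_files",
                     "arguments": {"pattern": "ValueError", "path": "payment"}}))
```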
Resources:
- Colab notebook - https://colab.research.google.com/github/codelion/ellora/blob/main/Ellora_Recipe_3_Enhanced_Tool_Calling_and_Code_Understanding.ipynb
- Model - codelion/Llama-3.2-1B-Instruct-tool-calling-lora
- GitHub - https://github.com/codelion/ellora
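If you want to try it, the adapter loads with the usual transformers + peft combo (standard API; the base model id is assumed from the adapter name, so double-check it):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed base checkpoint
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "codelion/Llama-3.2-1B-Instruct-tool-calling-lora")
tokenizer = AutoTokenizer.from_pretrained(base_id)
```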