Hello HF community, I'm happy to share a project I've been working on that combines mlx-lm with Flower to enable federated fine-tuning of SLMs (Small Language Models) on macOS devices.
GitHub Repo: https://github.com/ethicalabs-ai/BlossomTuneLLM-MLX
By combining mlx-lm with a federated learning framework like Flower (https://flower.ai/), we can leverage the hardware people already own and reduce the reliance on expensive GPUs, enabling collaborative model training.
This project is the MLX-native evolution of an earlier codebase for FlowerTune LLM:
https://arxiv.org/abs/2506.02961
https://flower.ai/blog/2024-10-16-flowertune-llm-leaderboard
https://github.com/ethicalabs-ai/BlossomTuneLLM
How it works:
Flower handles all the federated learning logic.
A central server (superlink) coordinates the training rounds, client selection, and parameter aggregation.
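Conceptually, the aggregation side is a standard Flower strategy such as FedAvg. The sketch below is not the repo's actual setup (which deploys via the superlink/supernode processes); it only illustrates the idea using Flower's simpler start_server API, and the round and client counts are made-up values.

```python
# Minimal sketch of the coordination/aggregation side with Flower's FedAvg.
# Assumptions: legacy start_server API for brevity; the real deployment
# runs a superlink process instead, and these numbers are placeholders.
import flwr as fl
from flwr.server.strategy import FedAvg

strategy = FedAvg(
    fraction_fit=1.0,        # sample every connected client each round (assumption)
    min_fit_clients=2,       # wait for at least two Macs before starting a round (assumption)
    min_available_clients=2,
)

fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=3),  # three federated rounds (placeholder)
    strategy=strategy,
)
```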
Each participant in the network runs a Flower client (supernode) on their Mac. In each round, the client:
- Receives the global LoRA/DoRA adapter weights from the server.
- Loads its local data partition.
- Uses the mlx-lm programmatic API (mlx_lm.tuner.train) to perform local LoRA/DoRA fine-tuning (see the sketch after this list).
- Sends only the updated adapter weights back to the server.
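To make the round concrete, here is a rough sketch of what such a client could look like as a Flower NumPyClient. This is not the repo's actual code: the adapter (de)serialization uses standard MLX utilities, while run_local_finetune is a hypothetical wrapper around mlx_lm.tuner.train whose exact arguments are omitted.

```python
# Sketch of a per-device Flower client (hypothetical, not the repo's code).
import flwr as fl
import numpy as np
import mlx.core as mx
from mlx.utils import tree_flatten, tree_unflatten


class MLXLoRAClient(fl.client.NumPyClient):
    def __init__(self, model, tokenizer, train_set):
        self.model = model          # mlx-lm model with LoRA/DoRA layers attached
        self.tokenizer = tokenizer
        self.train_set = train_set  # this device's local data partition

    def get_parameters(self, config):
        # Only the adapters are trainable, so this exports just their weights.
        return [np.array(v) for _, v in tree_flatten(self.model.trainable_parameters())]

    def fit(self, parameters, config):
        # 1. Load the global adapter weights received from the server.
        names = [k for k, _ in tree_flatten(self.model.trainable_parameters())]
        self.model.update(tree_unflatten(
            [(k, mx.array(v)) for k, v in zip(names, parameters)]
        ))
        # 2. Fine-tune locally; run_local_finetune is a hypothetical wrapper
        #    around mlx_lm.tuner.train.
        run_local_finetune(self.model, self.tokenizer, self.train_set)
        # 3. Send back only the updated adapter weights.
        return self.get_parameters(config), len(self.train_set), {}
```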
The server only ever sees the adapter weight updates; private data never leaves the device.
Flower also makes it easy to run a full simulation (with a centralized HF dataset partitioned using flower-datasets) on a single machine or across multiple machines, so you can see the whole process in action and experiment further.
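For the simulation setup, flwr-datasets can split a centralized HF dataset into per-client partitions. A minimal sketch, where the dataset name and partition count are placeholders, not the project's actual configuration:

```python
# Sketch: partitioning a centralized HF dataset for simulation with flwr-datasets.
from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import IidPartitioner

fds = FederatedDataset(
    dataset="vicgalle/alpaca-gpt4",  # placeholder HF dataset
    partitioners={"train": IidPartitioner(num_partitions=4)},  # one shard per simulated Mac
)

# Each simulated supernode loads only its own shard:
partition = fds.load_partition(0, "train")
```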
All you need is one or more Macs with Apple Silicon.