- You don't always need one giant model. Several smaller open-source models can be combined to match or beat a frontier model.
- Two main methods: merging fuses model weights into one (cheap arithmetic, no training), and mixture-of-agents has multiple models collaborate at runtime.
- The results are real. Open-model setups have topped leaderboards and beaten GPT-4-class models on some benchmarks at a fraction of the cost.
The Setup
The assumption for years was that better AI means bigger AI, one enormous model trained at huge cost. That is breaking down. A quieter approach lets several smaller, open-source models work together, and on some benchmarks the combination beats a single frontier model. No billion-dollar training run required.
What Model Fusion Actually Is
There are two main ways to combine small models into something bigger. The first is model merging: take two or more fine-tuned models that share the same base, and blend their weights directly, basically arithmetic on the numbers inside them. No new training, no data, just math. Tools like mergekit do it in minutes, and merged models regularly top the open leaderboards, often beating models that cost thousands of GPU hours to train. The second is mixture-of-agents, which we will get to.
How Mixture-of-Agents Works
Merging makes one model out of many. Mixture-of-agents keeps them separate and makes them collaborate. Several models answer a question, then a next layer of models reads those answers and improves on them, layer by layer, like a committee refining a draft. The surprising finding from the research is that models write better answers when they can see other models' attempts first, even weaker ones. An all-open-source mixture-of-agents scored 65.1 percent on the AlpacaEval 2.0 benchmark, beating GPT-4 Omni's 57.5 percent, using no closed model at all.
Why This Matters
This flips the economics of AI. If you can reach frontier-level quality by orchestrating cheap open models, you do not have to rent the most expensive single model or train your own. One study hit GPT-4-class results at roughly a twenty-fifth of the cost, and merging is reportedly around 100 times cheaper than fine-tuning for building a specialized model. For anyone who cannot spend tens of millions on compute, this is the door in.
What It Means For Investors
The "one model to rule them all" thesis is the bull case for the biggest labs. Model fusion is the counter-argument. If small open models can be combined to match closed giants, the moat around a single frontier model gets thinner, and value shifts toward orchestration, tooling, and whoever owns distribution. It does not kill the frontier labs, since raw frontier capability still leads, but it caps how much anyone can charge for being slightly ahead. Open-source stays in the game far longer than the scale-only story assumed.
The Catch
Fusion is not magic. Merging only works cleanly when models share a base architecture, and a bad merge produces a confused model that is worse than its parts. Mixture-of-agents is powerful but slower and heavier at runtime, since you are paying several models to answer instead of one. The technique stretches what small models can do; it does not erase the gap with the true frontier on the hardest tasks.
FAQ
Is a merged model actually as smart as a big one?
On many tasks, surprisingly close, and on narrow domains sometimes better. On the very hardest frontier tasks, a single top model still wins. Fusion is about getting most of the way there for a tiny fraction of the cost.
Why would models write better answers just by seeing other models' answers?
Because each model has different strengths and blind spots, and seeing alternatives gives it reference points to correct itself. It is the same reason a group of decent experts can out-decide one brilliant one. Diversity beats raw individual power here.
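To make the layered idea concrete, here is a minimal sketch of a mixture-of-agents loop. It is an illustration under assumptions, not the setup used in the research: the `Model` type, `build_prompt`, `mixture_of_agents`, and `make_stub` are hypothetical names, and the stub models only report how many earlier drafts they were shown. In practice each `Model` would wrap a call to a real open model.

```python
from typing import Callable, List

# Hypothetical interface: a "model" is anything that maps a prompt to text.
Model = Callable[[str], str]


def build_prompt(question: str, drafts: List[str]) -> str:
    """Show the question plus any earlier answers the model may improve on."""
    if not drafts:
        return question
    listed = "\n".join(f"- {d}" for d in drafts)
    return (
        f"Question: {question}\n"
        f"Earlier answers:\n{listed}\n"
        "Write an improved answer."
    )


def mixture_of_agents(
    question: str, layers: List[List[Model]], aggregator: Model
) -> str:
    """Run each layer on the previous layer's drafts, then aggregate."""
    drafts: List[str] = []
    for layer in layers:
        prompt = build_prompt(question, drafts)
        # Every model in the layer sees the same set of earlier drafts.
        drafts = [model(prompt) for model in layer]
    return aggregator(build_prompt(question, drafts))


def make_stub(name: str) -> Model:
    """Stand-in for a real model; reports how many drafts it was shown."""
    def model(prompt: str) -> str:
        drafts_seen = prompt.count("\n- ")
        return f"{name} (saw {drafts_seen} drafts)"
    return model


a, b, c = make_stub("a"), make_stub("b"), make_stub("c")
answer = mixture_of_agents(
    "What is model merging?",
    layers=[[a, b, c], [a, b]],
    aggregator=make_stub("final"),
)
print(answer)
```

This toy run makes six model calls (three, then two, then the aggregator) to answer one question, which is the runtime cost the article flags as the catch.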
Can I actually do this myself?
Yes, that is the point. Open tools like mergekit let a developer merge models on a normal machine, and mixture-of-agents can be wired up with a few open models and an API. The barrier is knowledge, not a giant compute budget.
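For a feel of what "arithmetic on the weights" means, here is a toy sketch of the simplest kind of merge, a weighted average of matching parameters. It is not mergekit's code or API: the `Weights` type and `linear_merge` function are hypothetical, and plain lists of floats stand in for real model tensors. It also assumes both models have identical parameter names and shapes, which is why a shared base model matters.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's weights: parameter name -> flat values.
Weights = Dict[str, List[float]]


def linear_merge(models: List[Weights], coeffs: List[float]) -> Weights:
    """Weighted average of same-shaped parameter sets (no training involved)."""
    if not models or len(models) != len(coeffs):
        raise ValueError("need at least one model and one coefficient per model")
    total = sum(coeffs)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")

    # Merging only makes sense when every model has the same parameters.
    names = set(models[0])
    if any(set(m) != names for m in models[1:]):
        raise ValueError("models do not share the same parameter names")

    merged: Weights = {}
    for name in models[0]:
        size = len(models[0][name])
        if any(len(m[name]) != size for m in models):
            raise ValueError(f"shape mismatch in parameter {name!r}")
        # Blend each value position across models, normalised by the total.
        merged[name] = [
            sum(c * m[name][i] for c, m in zip(coeffs, models)) / total
            for i in range(size)
        ]
    return merged


# Two toy "fine-tunes" of the same base, with made-up numbers.
math_model = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
code_model = {"layer0.weight": [3.0, 4.0], "layer0.bias": [1.0]}

merged = linear_merge([math_model, code_model], [0.5, 0.5])
print(merged)
```

Real merging tools offer more careful methods than a plain average, but the core operation is the same kind of element-wise math, which is why it runs in minutes rather than GPU-days.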