HuggingFace Papers
8/10 signal
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent
agentic, reasoning
What happened
The paper introduces Agents-A1, a 35B Mixture-of-Experts (MoE) agentic model that the authors report matches the performance of trillion-parameter models. It gets there by scaling two things other than parameter count: the length of agent trajectories (long-horizon trajectory scaling) and the range of agent abilities (heterogeneous agent ability scaling). The training recipe has three stages: supervised fine-tuning, training domain-level teacher models, and multi-teacher distillation.
Why it matters
It suggests that parameter count is not the only lever for agentic performance: scaling trajectories and distilling from multiple teachers can lift a 35B model to the level of trillion-parameter models.
The take
This is a highly practical approach to the 'small but specialized' agent trend. Instead of scaling model parameters, the authors scale the agentic trajectory (reasoning steps) and distill specialized capabilities from multiple larger teacher models. This is a blueprint for teams wanting to deploy production-grade agents on cost-effective, smaller hardware.
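To make the multi-teacher idea concrete, here is a minimal sketch of a generic multi-teacher distillation loss in plain Python. It is not the paper's objective, which this brief does not spell out. It assumes the common formulation in which the student minimizes a weighted sum of KL divergences to each teacher's temperature-softened output distribution. The function names, the per-teacher weights, and the temperature default are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def multi_teacher_distill_loss(student_logits, teacher_logits_list, weights,
                               temperature=2.0):
    """Weighted sum of KL(teacher || student) over several teachers.

    Assumption: each teacher is a domain specialist, and `weights` says how
    much the student should trust each one for this particular example.
    """
    if len(teacher_logits_list) != len(weights):
        raise ValueError("need exactly one weight per teacher")
    total_weight = sum(weights)
    student = softmax(student_logits, temperature)
    loss = 0.0
    for weight, teacher_logits in zip(weights, teacher_logits_list):
        teacher = softmax(teacher_logits, temperature)
        loss += (weight / total_weight) * kl_divergence(teacher, student)
    # Conventional T^2 factor keeps gradient scale comparable across temperatures.
    return loss * temperature ** 2


if __name__ == "__main__":
    student = [1.0, 0.5, -0.2]
    teachers = [[2.0, 0.1, -1.0], [0.3, 1.5, 0.0]]  # two hypothetical domain teachers
    print(multi_teacher_distill_loss(student, teachers, weights=[0.7, 0.3]))
```

In a real pipeline the weights would likely depend on the task domain of each training example, so that a coding trajectory leans on the coding teacher and a search trajectory on the search teacher. That routing is an assumption here, not something the brief confirms.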
Do this
Read the paper to understand their three-stage training and distillation pipeline for building highly capable, smaller agentic models.