HuggingFace Papers Jun 30, 2026 8/10 signal

Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent

agentic, reasoning

What happened

Introduces Agents-A1, a 35B-parameter Mixture-of-Experts (MoE) agentic model that the authors report matches trillion-parameter models on agentic tasks. It gets there through long-horizon trajectory scaling and heterogeneous agent ability scaling, using a three-stage training recipe: supervised fine-tuning, domain-level teacher models, and multi-teacher distillation.

Why it matters

Shows that trajectory scaling and multi-teacher distillation can lift a 35B model to trillion-parameter-class agentic performance without growing the model itself.

The take

This is a highly practical take on the 'small but specialized' agent trend. Instead of scaling model parameters, the authors scale the agentic trajectory (the number of reasoning steps) and distill specialized capabilities from multiple larger teacher models. It is a blueprint for teams that want to deploy production-grade agents on smaller, cheaper hardware.
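To make the multi-teacher distillation idea concrete, here is a minimal sketch of a generic distillation loss, not the paper's actual objective. It assumes each domain teacher produces logits over the same vocabulary as the student, and it combines the per-teacher KL divergences with fixed weights. The function names, the domain labels ("code", "search"), the weights, and the temperature are all hypothetical choices for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def multi_teacher_distillation_loss(student_logits, teacher_logits_by_domain,
                                    weights, temperature=2.0):
    """Weighted average of KL(teacher || student) over all domain teachers.

    Assumption: a generic weighted-KL objective, chosen for illustration;
    the source does not describe the loss used by the paper.
    """
    student_probs = softmax(student_logits, temperature)
    total_weight = sum(weights[d] for d in teacher_logits_by_domain)
    loss = 0.0
    for domain, teacher_logits in teacher_logits_by_domain.items():
        teacher_probs = softmax(teacher_logits, temperature)
        share = weights[domain] / total_weight
        loss += share * kl_divergence(teacher_probs, student_probs)
    return loss


if __name__ == "__main__":
    # Hypothetical next-token logits over a three-token vocabulary.
    teachers = {"code": [2.0, 0.0, 0.0], "search": [0.0, 2.0, 0.0]}
    weights = {"code": 0.5, "search": 0.5}
    print(multi_teacher_distillation_loss([0.0, 0.0, 0.0], teachers, weights))
```

The loss is zero when the student already matches every teacher and grows as its distribution drifts from them, so minimizing it pulls one small model toward several specialists at once.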

Do this

Read the paper to understand their three-stage training and distillation pipeline for building highly capable, smaller agentic models.

