Small On-Device AI Model Beats Claude Sonnet 4.5 and GPT-5

Palo Alto startup Zyphra has introduced ZAYA1-8B, a reasoning-focused mixture-of-experts (MoE) language model with just over 8 billion total parameters, of which only 760 million are active.
Despite the model's relatively small size, Zyphra said it delivers benchmark performance competitive with GPT-5-High, DeepSeek-V3.2, and Claude Sonnet 4.5.
The model is available through Hugging Face under the Apache 2.0 open source license, allowing developers and enterprises to use and customize it for commercial applications.
Zyphra said ZAYA1-8B was trained entirely using AMD Instinct MI300 GPUs, highlighting AMD’s growing role as an alternative to Nvidia hardware for AI development.
The model uses Zyphra’s proprietary MoE++ architecture, which includes Compressed Convolutional Attention, a custom MLP router system, and Learned Residual Scaling to improve efficiency and reduce memory use.
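To illustrate the general idea behind routed experts, the following is a minimal sketch of top-k mixture-of-experts routing with a small MLP router. It is not Zyphra's MoE++ implementation: the class name `ToyMoELayer`, the layer sizes, the tanh hidden layer, and the choice of 2 experts out of 8 are all assumptions made for illustration, and Compressed Convolutional Attention and Learned Residual Scaling are not modeled here.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale):
    """Build a rows x cols matrix of Gaussian noise (stand-in for trained weights)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class ToyMoELayer:
    """Hypothetical toy layer: an MLP router picks top_k experts per token."""

    def __init__(self, dim, hidden, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.num_experts = num_experts
        self.top_k = top_k
        # Two-layer router (assumed shape): dim -> hidden -> one score per expert.
        self.router_in = random_matrix(hidden, dim, rng, 1.0)
        self.router_out = random_matrix(num_experts, hidden, rng, 1.0)
        # Each expert is a single dim x dim linear map, kept tiny for brevity.
        self.experts = [random_matrix(dim, dim, rng, 0.1) for _ in range(num_experts)]

    def route(self, token):
        """Return the indices of the chosen experts and their mixing weights."""
        hidden = [math.tanh(h) for h in matvec(self.router_in, token)]
        scores = matvec(self.router_out, hidden)
        ranked = sorted(range(self.num_experts), key=lambda i: scores[i], reverse=True)
        chosen = ranked[: self.top_k]
        # Normalize only over the chosen experts so their weights sum to 1.
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, token):
        """Run only the chosen experts and blend their outputs."""
        chosen, weights = self.route(token)
        output = [0.0] * len(token)
        for index, weight in zip(chosen, weights):
            expert_out = matvec(self.experts[index], token)
            output = [o + weight * e for o, e in zip(output, expert_out)]
        return output, chosen


layer = ToyMoELayer(dim=4, hidden=8, num_experts=8, top_k=2)
output, chosen = layer.forward([1.0, 0.5, -0.5, 2.0])
active_fraction = layer.top_k / layer.num_experts
print("experts used:", chosen, "fraction of experts active:", active_fraction)
```

In this toy, only 2 of the 8 experts run for any one token, so a quarter of the expert weights are touched per token. ZAYA1-8B's reported figures, roughly 760 million active parameters out of just over 8 billion, put its active share under a tenth, which is the same idea at a much larger scale.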
The company said reasoning capabilities were integrated during pre-training instead of being added later through post-training methods.
Zyphra also introduced a method called Markovian RSA, which allows the model to generate multiple reasoning paths while controlling context growth during inference.
According to the company, the model achieved a 91.9% score on AIME '25 while using far fewer active parameters than many competing systems.
Benchmark results shared by Zyphra also showed strong performance in coding, reasoning, and agent-based tasks compared with similarly sized open models.
Because of its smaller parameter count, Zyphra said ZAYA1-8B is suitable for local deployment on enterprise hardware and edge devices, helping reduce latency and cloud dependence.
The company was founded in 2021 and focuses on building open source AI systems centered on what it describes as “intelligence density.”
According to PitchBook, Zyphra reached unicorn status after raising $110 million in Series A funding in June 2025. Its investors reportedly include AMD and IBM.



