On April 15, ByteDance officially released the technical details of its latest reasoning-focused large model, Seed-Thinking-v1.5. The model will be made publicly accessible via Volcengine’s API starting April 17.
Seed-Thinking-v1.5 demonstrates exceptional performance in math reasoning, competitive programming, scientific inference, and creative writing, establishing itself as a strong contender among state-of-the-art language models. Built on a Mixture of Experts (MoE) architecture, it has 200 billion total parameters, of which 20 billion are active per token, resulting in a 50% lower inference cost compared to DeepSeek R1.
Technical report: https://github.com/ByteDance-Seed/Seed-Thinking-v1.5

Fusion of verifiable and creative data
Performance Highlights
Domain-Specific Tasks:
- Math Reasoning: Achieved an AIME 2024 score of 86.7, matching OpenAI’s o3-mini-high
- Programming: Codeforces pass@8 reached 55.0%, comparable to Gemini 2.5 Pro
- Scientific Reasoning: Scored 77.3% on GPQA, approaching the industry-leading level of o3-mini-high
General Tasks:
- Outperformed DeepSeek R1 by 8% in human evaluation across a wide range of scenarios, showing stronger creative and reasoning abilities.
Cost Efficiency:
- Achieves 50% lower inference cost per unit compared to DeepSeek R1, optimizing both performance and efficiency.
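The Codeforces figure above is reported as pass@8. The article does not say how that number was computed, so the following is only a sketch of the widely used unbiased pass@k estimator: given `n` sampled solutions of which `c` are correct, it returns the probability that at least one of `k` samples drawn without replacement is correct. The function name and arguments are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled solutions, c of which are correct.

    Returns the probability that at least one of k solutions drawn
    without replacement from the n samples passes.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 16 samples for one problem, 2 of them correct, scored at k = 8.
    print(round(pass_at_k(16, 2, 8), 4))
```

A benchmark-level pass@8 would then be the mean of this value over all problems.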
Optimized Data Strategy for Reasoning and Generation
To balance verifiability and creativity, the training data pipeline was tailored to each task type:
- Verifiable Data (e.g., Math, Code):
  - Over 1 million samples went through a triple-filtering process (manual filtering → model filtering → multi-model verification)
  - Retained 100,000 high-difficulty problems
  - Introduced techniques like answer normalization and sandboxed verification to ensure accurate reasoning chains
- Non-Verifiable Data (e.g., Creative Writing):
  - Based on the Doubao 1.5 Pro dataset, with low-quality samples filtered out
  - Employed pairwise comparison reward modeling to optimize generation quality
- New Benchmark Dataset – BeyondAIME:
  - 100 challenging math problems without answers, created to address limitations in current benchmark granularity.
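To make "answer normalization" and difficulty filtering concrete, here is a minimal sketch. The article does not describe the actual rules, so everything here is an assumption: the normalization steps, the `attempts` and `reference` field names, and the `max_solve_rate` threshold are all hypothetical.

```python
from fractions import Fraction


def normalize_answer(raw: str) -> str:
    """Canonicalize a final answer so equivalent forms compare equal.

    Assumed rules: drop surrounding whitespace and math-mode dollar signs,
    remove thousands separators and spaces, then reduce numeric answers
    to a lowest-terms fraction string.
    """
    text = raw.strip().strip("$").replace(",", "").replace(" ", "").rstrip(".")
    try:
        return str(Fraction(text))  # "0.50" and "2/4" both become "1/2"
    except (ValueError, ZeroDivisionError):
        return text.lower()  # non-numeric answers: case-insensitive match


def is_correct(model_answer: str, reference: str) -> bool:
    return normalize_answer(model_answer) == normalize_answer(reference)


def keep_hard(problems: list[dict], max_solve_rate: float = 0.5) -> list[dict]:
    """Keep problems that sampled attempts solve at most max_solve_rate of the time."""
    kept = []
    for problem in problems:
        attempts = problem["attempts"]
        if not attempts:
            continue  # no evidence about difficulty, so skip
        solved = sum(is_correct(a, problem["reference"]) for a in attempts)
        if solved / len(attempts) <= max_solve_rate:
            kept.append(problem)
    return kept


if __name__ == "__main__":
    pool = [
        {"id": "easy", "reference": "1/2", "attempts": ["0.5", "$0.50$", "2/4"]},
        {"id": "hard", "reference": "1000", "attempts": ["1,000", "999", "7", "12"]},
    ]
    print([p["id"] for p in keep_hard(pool)])
```

Normalizing before comparison keeps a correct answer from being marked wrong only because it was written as `0.50` rather than `1/2`.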
Reward Modeling: Dual-Track Calibration for Balanced Training
Seed-Thinking introduces a dual-track reward system to address both objective and subjective tasks:
- For Verifiable Tasks:
  - Developed two generations of Seed-Verifiers, evolving from string-level matching to step-wise reasoning matching
  - Achieved over 99% accuracy on training/testing sets, eliminating “reward hacking”
- For Non-Verifiable Tasks:
  - Used large-scale A/B pairwise comparison training (over 10 million tests) to capture human preferences for creativity, tone, and emotion
- Dual-Track Fusion:
  - Combines hard metrics (accuracy) with soft preferences (quality), enabling full-spectrum model training.
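The two tracks can be illustrated with a short sketch. The article does not give the loss function or the fusion rule, so this assumes a standard Bradley-Terry style pairwise loss for the preference track and a simple route-by-task-type rule for fusion. The `verifier` and `reward_model` callables and the sample field names are hypothetical stand-ins.

```python
import math
from typing import Callable


def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(score_chosen - score_rejected)).

    Written in a numerically stable form so large margins do not overflow.
    """
    margin = score_chosen - score_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def fused_reward(
    sample: dict,
    verifier: Callable[[str, str], bool],
    reward_model: Callable[[str, str], float],
) -> float:
    """Route each sample to the track that matches its task type (assumed rule)."""
    if sample["verifiable"]:
        # Hard track: binary reward from a correctness check.
        return 1.0 if verifier(sample["response"], sample["reference"]) else 0.0
    # Soft track: scalar preference score from a learned reward model.
    return reward_model(sample["prompt"], sample["response"])


if __name__ == "__main__":
    # The loss shrinks as the preferred response is scored further ahead.
    print(pairwise_loss(2.0, 0.0) < pairwise_loss(0.0, 0.0) < pairwise_loss(0.0, 2.0))
```

Training the reward model to minimize `pairwise_loss` over human-labeled pairs pushes it to score the preferred response higher; the policy then only ever sees the single scalar returned by `fused_reward`.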
Training Pipeline: Two-Stage Optimization with SFT + RL
Seed-Thinking-v1.5 is trained in two stages, supervised fine-tuning followed by reinforcement learning:
- Supervised Fine-Tuning (SFT):
  - 400,000 curated samples (300k verifiable + 100k non-verifiable)
  - Constructed a long-chain reasoning dataset to align model thinking with human patterns
- Reinforcement Learning (RL):
  - Driven by a tri-engine data framework (verifiable/general/mixed)
  - Introduced innovations like value pretraining and decoupled GAE
  - Online adaptation keeps the data distribution dynamically optimized for model performance
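The article names decoupled GAE without defining it. The sketch below assumes it means computing Generalized Advantage Estimation (GAE) twice with different λ values: one for the policy's advantages and one for the value function's regression targets. The default λ and γ values here are illustrative, not taken from the report.

```python
def gae(rewards: list[float], values: list[float], gamma: float, lam: float) -> list[float]:
    """Generalized Advantage Estimation for one trajectory.

    `values` has len(rewards) + 1 entries; the last is the bootstrap value
    of the state after the final step (0.0 for a terminal state).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def decoupled_gae(
    rewards: list[float],
    values: list[float],
    gamma: float = 1.0,
    lam_policy: float = 0.95,  # assumed value
    lam_value: float = 1.0,    # assumed value
) -> tuple[list[float], list[float]]:
    """Return (policy advantages, value targets) computed with separate lambdas."""
    policy_adv = gae(rewards, values, gamma, lam_policy)
    value_adv = gae(rewards, values, gamma, lam_value)
    value_targets = [a + v for a, v in zip(value_adv, values[:-1])]
    return policy_adv, value_targets


if __name__ == "__main__":
    # Three-step response with a single reward at the end and an untrained critic.
    adv, targets = decoupled_gae([0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0])
    print(adv, targets)
```

With λ = 1 the value targets reduce to plain discounted returns, so a reward that arrives only at the end of a long reasoning chain reaches early tokens undiminished, while a smaller policy λ trades some bias for lower variance in the advantages.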
Infrastructure: Scalable Foundation for MoE Training
To support training an MoE model with 20B active and 200B total parameters, the team built a robust infrastructure:
- HybridFlow Programming Model:
  - Enables fast algorithm experimentation and efficient distributed training
- Streaming Reasoning System (SRS):
  - Decouples model iteration from inference via stream-based reasoning, tripling training speed
  - Delivers 95% stability even under trillion-parameter loads
- Triple Parallelism Architecture:
  - Combines tensor, expert, and sequence-level parallelism
  - Uses the KARP scheduling algorithm to dynamically balance workloads and maximize GPU utilization