Claude Opus 4.8 Hits 69.2% on SWE-Bench Pro
Claude Opus 4.8 scores 69.2% on SWE-Bench Pro, leading agentic coding benchmarks. It also adds honesty cues and remains available at prior pricing via EasyRouterIO.
Claude Opus 4.8 posts 69.2% on SWE-Bench Pro, extending its lead in agentic coding benchmarks, though it still trails GPT-5.5 on Terminal-Bench 2.1. The update introduces clearer self-assessment language, such as admitting uncertainty, a trait absent in earlier Opus releases. EasyRouterIO now serves the model live, with 400 free credits on signup.
傅盛
@FuSheng_0306
Chairman and CEO of Cheetah Mobile, Chairman of OrionStar