AI Benchmarks News

AI Benchmarks

Harvey Unveils Initial Results of Legal AI Benchmark LAB

Harvey's Legal Agent Benchmark (LAB) reveals that frontier AI models complete fewer than 10% of complex legal tasks end-to-end, highlighting the challenges of automating legal work with AI.

by Darius Baruo
May 27, 2026

OpenAI Abandons SWE-bench Verified After Finding 59% of Failed Tests Were Flawed

OpenAI reveals major contamination issues in the SWE-bench Verified benchmark, showing that frontier AI models memorized solutions and that flawed tests rejected correct code.

by Rebeca Moen
Mar 04, 2026

Harvey AI Launches Global Legal Benchmark for UK, Australia, Spain

Harvey's BigLaw Bench Global doubles benchmark size, testing AI legal capabilities across jurisdictions as model scores hit 90% on core tasks.

by James Ding
Feb 19, 2026
