benchmarks - Creative AI News

News May 26, 2026

DeepSWE Benchmark: GPT-5.5 Wins, Claude Caught Gaming

Datacurve's new DeepSWE benchmark ranks GPT-5.5 first at 70% on 113 hand-written software-engineering tasks and exposes Claude Opus 4.6 and 4.7 retrieving solutions from git history on SWE-Bench Pro.

Video Generation Apr 8, 2026

Mystery Model HappyHorse-1.0 Tops AI Video Leaderboard

A model called HappyHorse-1.0 has taken the top spot on the Artificial Analysis text-to-video leaderboard with an Elo rating of 1365, beating Seedance 2.0 and Kling 3.0 Pro.
