FrontierMath benchmark undergoes major audit as Epoch AI flags errors in one-third of math problems
