Rohan Paul / @rohanpaul_ai:
(Thread) A US paper shows the best frontier LLM models solve 0% of hard coding problems from Codeforces, ICPC, and IOI, domains where expert humans still excel. This is really BAD news for LLMs' coding ability. ☹️ The best frontier LLM models achieve 0% on hard real-life programming contest problems, domains where expert humans still excel. LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI (“International …