Humanity's Last Exam is still accepting questions from late contributors and submissions for the dataset and co-authorship, but new submissions are not eligible for the prize pool.

Humanity's Last Exam

Paper (arXiv coming soon)
Dataset (Hugging Face): load_dataset("cais/hle")

Long Phan*1, Alice Gatti*1, Ziwen Han*2, Nathaniel Li*1

Josephina Hu2, Hugh Zhang, Sean Shi2, Michael Choi2, Anish Agrawal2, Arnav Chopra2

Adam Khoja1, Ryan Kim, Richard Ren1, Jason Hausenloy1, Oliver Zhang1, Mantas Mazeika1

Summer Yue**2, Alexandr Wang**2, Dan Hendrycks**1

1Center for AI Safety, 2Scale AI

Authors

Daron Anderson, Tung Nguyen, Mobeen Mahmood, Fiona Feng, Steven Y. Feng, Haoran Zhao, Michael Yu, Chelsea Zou, Zihan Wang, Jessica P. Wang, Pawan Kumar, Oleksandr Pokutnyi, Robert Gerbicz, Serguei Popov, John-Clark Levin, Johannes Schmitt, Geoff Galgon, Alvaro Sanchez, Yongki Lee, Will Yeadon, Scott Sauers, Marc Roth, Chidozie Agu, Søren Riis, Fabian Giska, Saiteja Utpala, Zachary Giboney, Gashaw M. Goshu, Joan of Arc Xavier, Sarah-Jane Crowson, Mohinder Maheshbhai Naiya, Noah Burns, Lennart Finke, Zerui Cheng, Hyunwoo Park, Francesco Fournier-Facio, John Wydallis, Mark Nandor, Ankit Singh, Tim Gehrunger, Jiaqi Cai, Ben McCarty, Darling Duclosel, Jungbae Nam, Jennifer Zampese, Ryan G. Hoerr, Aras Bacho, Gautier Abou Loume , Abdallah Galal, Hangrui Cao, Alexis C Garretson, Damien Sileo, Qiuyu Ren, Doru Cojoc, Pavel Arkhipov, Usman Qazi, Lianghui Li, Sumeet Motwani, Christian Schroeder de Witt, Edwin Taylor, Johannes Veith, Taylor D. Hartman, Paolo Rissone, Jaehyeok Jin, Jack Wei Lun Shi, Chris G. Willcocks, Joshua Robinson, Aleksandar Mikov, Ameya Prabhu, Longke Tang, Xavier Alapont, Kevin Zhou, Emily de Oliveira Santos, Andrey Pupasov Maksimov, Edward Vendrow, Kengo Zenitani, Julien Guillod, Yuqi Li, Joshua Vendrow, Vladyslav Kuchkin , Ng Ze-An, Pierre Marion, Denis Efremov, Jayson Lynch, Kaiqu Liang, Andrew Gritsevskiy, Dakotah Martinez, Ben Pageler, Nick Crispino, Dimitri Zvonkine, Natanael Wildner Fraga, Saeed Soori, Ori Press, Henry Tang, Julian Salazar, Sean R. Green, Lina Brüssel, Moon Twayana, Aymeric Dieuleveut, T. 
Ryan Rogers, Wenjin Zhang, Bikun Li, Jinzhou Yang, Arun Rao, Gabriel Loiseau, Mikhail Kalinin, Marco Lukas, Ciprian Manolescu, Subrata , Ariel Ghislain Kemogne Kamdoum, Tobias Kreiman, Tad Hogg, Alvin Jin, Carlo Bosio, Gongbo Sun, Brian P Coppola, Tim Tarver, Haline Heidinger, Rafael Sayous, Stefan Ivanov, Joseph M Cavanagh, Jiawei Shen, Joseph Marvin Imperial, Philippe Schwaller, Shaipranesh Senthilkuma, Andres M Bran, Ali Dehghan, Andres Algaba, Brecht Verbeken, David Noever, Ragavendran P V, Lisa Schut, Ilia Sucholutsky, Evgenii Zheltonozhskii, Derek Lim, Richard Stanley, Shankar Sivarajan , Tong Yang, John Maar, Julian Wykowski, Martí Oller, Jennifer Sandlin, Anmol Sahu, Yuzheng Hu, Sara Fish, Nasser Heydari, Archimedes Apronti, Kaivalya Rawal, Tobias Garcia Vilchis, Yuexuan Zu, Martin Lackner, James Koppel, Jeremy Nguyen, Daniil S. Antonenko, Steffi Chern, Bingchen Zhao, Pierrot Arsene, Alan Goldfarb, Sergey Ivanov, Rafał Poświata, Chenguang Wang, Daofeng Li, Donato Crisostomi, Andrea Achilleos, Benjamin Myklebust, Archan Sen, David Perrella, Nurdin Kaparov, Mark H Inlow, Allen Zang, Elliott Thornley, Daniil Orel, Vladislav Poritski, Shalev Ben-David, Zachary Berger, Parker Whitfill, Michael Foster, Daniel Munro, Linh Ho, Dan Bar Hava, Aleksey Kuchkin, Robert Lauff, David Holmes, Frank Sommerhage, Keith Schneider, Zakayo Kazibwe, Nate Stambaugh, Mukhwinder Singh, Ilias Magoulas, Don Clarke, Dae Hyun Kim, Felipe Meneguitti Dias, Veit Elser, Kanu Priya Agarwal, Victor Efren Guadarrama Vilchis, Immo Klose, Christoph Demian, Ujjwala Anantheswaran, Adam Zweiger, Guglielmo Albani, Jeffery Li, Nicolas Daans, Maksim Radionov, Václav Rozhoň, Ziqiao Ma, Christian Stump, Mohammed Berkani, Jacob Platnick, Volodymyr Nevirkovets, Luke Basler, Marco Piccardo, Ferenc Jeanplong, Niv Cohen, Varun Gangal, Josef Tkadlec, Paul Rosu, Piotr Padlewski, Stanislaw Barzowski, Kyle Montgomery, Aline Menezes, Arkil Patel, Zixuan Wang, Jamie Tucker-Foltz, Jack Stade, Tom Goertzen, 
Fereshteh Kazemi, Jeremiah Milbauer, John Arnold Ambay, Abhishek Shukla, Yan Carlos Leyva Labrador, Alan Givré, Hew Wolff, Vivien Rossbach , Muhammad Fayez Aziz, Younesse Kaddar, Yanxu Chen, Robin Zhang, Jiayi Pan, Antonio Terpin, Niklas Muennighoff, Hailey Schoelkopf, Eric Zheng, Avishy Carmi, Adam Jones, Jainam Shah, Ethan D. L. Brown, Kelin Zhu, Max Bartolo, Richard Wheeler, Andrew Ho, Shaul Barkan, Jiaqi Wang, Martin Stehberger, Egor Kretov, Kaustubh Sridhar, Zienab EL-Wasif, Anji Zhang, Daniel Pyda, Joanna Tam, David M. Cunningham, Demosthenes Patramanis, Michael Krause, Andrew Redenti, Daniel Bugas, David Aldous, Jesyin Lai, Shannon Coleman, Mohsen Bahaloo, Jiangnan Xu, Sangwon Lee, Sandy Zhao, Ning Tang, Michael K. Cohen, Micah Carroll, Orr Paradise, Jan Hendrik Kirchner, Stefan Steinerberger, Maksym Ovchynnikov, Jason O. Matos, Adithya Shenoy, Benedito Alves de Oliveira Junior, Michael Wang, Yuzhou Nie, Paolo Giordano, Philipp Petersen, Anna Sztyber-Betley, Priti Shukla, Jonathan Crozier, Antonella Pinto, Shreyas Verma, Prashant Joshi, Zheng-Xin Yong, Allison Tee, Jérémy Andréoletti, Orion Weller, Raghav Singhal, Gang Zhang, Alexander Ivanov, Seri Khoury, Hamid Mostaghimi, Kunvar Thaman, Qijia Chen, Trần Quốc Khánh, Jacob Loader, Stefano Cavalleri, Hannah Szlyk, Zachary Brown, Jonathan Roberts, William Alley, Kunyang Sun, Ryan Stendall, Max Lamparth, Anka Reuel, Ting Wang, Hanmeng Xu, Sreenivas Goud Raparthi, Pablo Hernández-Cámara, Freddie Martin, Dmitry Malishev, Thomas Preu, Tomek Korbak, Marcus Abramovitch, Dominic Williamson, Ziye Chen, Biró Bálint, M Saiful Bari, Peyman Kassani, Zihao Wang, Behzad Ansarinejad, Laxman Prasad Goswami, Yewen Sun, Hossam Elgnainy, Daniel Tordera, George Balabanian, Earth Anderson, Lynna Kvistad, Alejandro José Moyano, Rajat Maheshwari , Ahmad Sakor, Murat Eron, Isaac C. 
McAlister, Javier Gimenez, Innocent Enyekwe, Andrew Favre D.O., Shailesh Shah, Xiaoxiang Zhou, Firuz Kamalov, Ronald Clark, Sherwin Abdoli, Khalida Meer, Harrison K Wang, Evan Chen, Alessandro Tomasiello, Shi-Zhuo Looi, Vinh-Kha Le, Noam Kolt, Niels Mündler, Avi Semler, Emma Rodman, Jacob Drori, Carl J Fossum, Milind Jagota, Ronak Pradeep, Honglu Fan, Tej Shah, Jonathan Eicher , Michael Chen, Kushal Thaman, William Merrill, Carter Harris, Jason Gross, Ilya Gusev, Asankhaya Sharma, Shashank Agnihotri, Pavel Zhelnov, Siranut Usawasutsakorn, Mohammadreza Mofayezi, Sergei Bogdanov, Alexander Piperski, Marc Carauleanu, David K. Zhang, Dylan Ler, Roman Leventov, Ignat Soroko, Thorben Jansen, Pascal Lauer, Joshua Duersch, Vage Taamazyan, Wiktor Morak, Wenjie Ma, William Held, Tran Đuc Huy, Ruicheng Xian, Armel Randy Zebaze, Mohanad Mohamed, Julian Noah Leser, Michelle X Yuan, Laila Yacar, Johannes Lengler, Hossein Shahrtash, Edson Oliveira, Joseph W. Jackson, Daniel Espinosa Gonzalez, Andy Zou, Muthu Chidambaram, Timothy Manik, Hector Haffenden, Dashiell Stander, Ali Dasouqi, Alexander Shen, Emilien Duc, Bita Golshani, David Stap, Mikalai Uzhou, Alina Borisovna Zhidkovskaya, Lukas Lewark, Mátyás Vincze, Dustin Wehr, Colin Tang, Zaki Hossain, Shaun Phillips, Jiang Muzhen, Fredrik Ekström, Angela Hammon, Oam Patel, Nicolas Remy, Faraz Farhidi, George Medley , Forough Mohammadzadeh, Madellene Peñaflor, Haile Kassahun, Alena Friedrich, Claire Sparrow, Taom Sakal, Omkar Dhamane, Ali Khajegili Mirabadi, Eric Hallman, Mike Battaglia, Mohammad Maghsoudimehrabani, Hieu Hoang, Alon Amit, Dave Hulbert, Roberto Pereira, Simon Weber, Stephen Mensah, Nathan Andre, Anton Peristyy, Chris Harjadi, Himanshu Gupta , Stephen Malina, Samuel Albanie, Will Cai, Mustafa Mehkary , Frank Reidegeld, Anna-Katharina Dick, Cary Friday, Jasdeep Sidhu, Wanyoung Kim, Mariana Costa, Hubeyb Gurdogan, Brian Weber, Harsh Kumar , Tong Jiang, Arunim Agarwal, Chiara Ceconello, Warren S. 
Vaz, Chao Zhuang, Haon Park, Andrew R. Tawfeek, Daattavya Aggarwal, Michael Kirchhof, Linjie Dai, Evan Kim, Johan Ferret, Yuzhou Wang, Minghao Yan, Krzysztof Burdzy, Lixin Zhang, Antonio Franca, Diana T. Pham, Kang Yong Loh, Shreen Gul, Gunjan Chhablani, Zhehang Du, Adrian Cosma, Colin White, Robin Riblet, Prajvi Saxena, Jacob Votava, Vladimir Vinnikov, Shiv Halasyamani, Syed M. Shahid, Jean-Christophe Mourrat, Lavr Vetoshkin, Renas Bacho, Vincent Ginis, Aleksandr Maksapetyan, Florencia de la Rosa, Xiuyu Li, Guillaume Malod, Leon Lang, Julien Laurendeau, Fatimah Adesanya , Julien Portier, Lawrence Hollom, Victor Souza, Yuchen Anna Zhou, Yiğit Yalın, Gbenga Daniel Obikoya, Luca Arnaboldi, Rai (Michael Pokorny), Filippo Bigi, Kaniuar Bacho, Pierre Clavier, Gabriel Recchia, Mara Popescu, Nikita Shulga, Ngefor Mildred Tanwie , Thomas C.H. Lux, Ben Rank, Colin Ni, Alesia Yakimchyk, Huanxu (Quinn) Liu , Olle Häggström, Emil Verkama, Himanshu Narayan , Hans Gundlach, Leonor Brito-Santana, Brian Amaro, Vivek Vajipey, Rynaa Grover, Yiyang Fan, Gabriel Poesia Reis e Silva, Linwei Xin, Yosi Kratish, Jakub Łucki, Wen-Ding Li, Justin Xu, Kevin Joseph Scaria, Freddie Vargus, Farzad Habibi, Long (Tony) Lian, Emanuele Rodolà, Jules Robins, Vincent Cheng, Declan Grabb, Ida Bosio, Tony Fruhauff, Ido Akov, Eve J. Y. Lo, Hao Qi, Xi Jiang, Ben Segev, Jingxuan Fan, Sarah Martinson, Erik Y. Wang, Kaylie Hausknecht, Michael P. Brenner, Mao Mao, Yibo Jiang, Xinyu Zhang, David Avagian, Eshawn Jessica Scipio, Muhammad Rehan Siddiqi, Alon Ragoler, Justin Tan, Deepakkumar Patil, Rebeka Plecnik, Aaron Kirtland, Roselynn Grace Montecillo, Stephane Durand, Omer Faruk Bodur, Zahra Adoul, Mohamed Zekry , Guillaume Douville, Ali Karakoc, Tania C. B. 
Santos, Samir Shamseldeen, Loukmane Karim, Anna Liakhovitskaia, Nate Resman , Nicholas Farina, Juan Carlos Gonzalez, Gabe Maayan, Sarah Hoback, Rodrigo De Oliveira Pena, Glen Sherman, Hodjat Mariji, Rasoul Pouriamanesh, Wentao Wu, Gözdenur Demir, Sandra Mendoza, Ismail Alarab, Joshua Cole, Danyelle Ferreira, Bryan Johnson , Hsiaoyun Milliron, Mohammad Safdari, Liangti Dai, Siriphan Arthornthurasuk, Alexey Pronin, Angel Ramirez-Trinidad, Ashley Cartwright, Daphiny Pottmaier, Omid Taheri, David Outevsky, Stanley Stepanic, Samuel Perry, Luke Askew, Raúl Adrián Huerta Rodríguez , Abdelkader Dendane, Ricardo Lorena, Krishnamurthy Iyer, Sk Md Salauddin, Murat Islam, Juan Gonzalez, Josh Ducey, Russell Campbell, Maja Somrak, Vasilios Mavroudis, Eric Vergo, Juehang Qin, Benjámin Borbás, Eric Chu, Jack Lindsey, Anil Radhakrishnan, Antoine Jallon, I.M.J. McInnis, Alex Hoover, Sören Möller, Tejal Patwardhan

Affiliations

3Independent Researcher, 4Texas A&M University, 5McGill University, 6Queen's University, 7Stanford University, 8University of Washington, 9University of California, San Diego, 10RWTH Aachen University, 11Pondicherry Engineering College, 12Institute of Mathematics of NAS of Ukraine, 13ELTE, 14University of Porto, 15University of Cambridge, 16ETH Zürich, 17Nimbus AI, 18Georgia Southern University, 19Durham University, 20University of Minnesota Twin Cities, 21Queen Mary University of London, 22Alberta Health Services, 23Microsoft Research, 24ZG Law, 25Outlier, 26Hereford College of Arts, 27Auckland University of Technology, 28Princeton University, 29Carnegie Mellon University, 30Hemwati Nandan Bahuguna Garhwal University, 31Massachusetts Institute of Technology, 32Accenture Labs, 33Escuela Superior de Medicina- Instituto Politécnico Nacional, 34CICMA, 35University of Canterbury, 36Metropolitan State University of Denver, 37California Institute of Technology, 38Université de Yaoundé I, 39Ecole Nationale Supérieure Polytechnique de Yaoundé, 40Tanta University, 41Tufts University, 42The Jackson Laboratory, 43Inria, 44University of California, Berkeley, 45Columbia University, 46Institute of Science and Technology Austria, 47RUSM, 48University of British Columbia, 49École Polytechnique Fédérale de Lausanne, 50University of Oxford, 51Charité – Universitätsmedizin, 52Humboldt-Universität zu Berlin, 53Northern Illinois University, 54Sapienza University of Rome, 55National University of Singapore, 56The Hartree Centre, 57University of Tübingen, 58University of Sao Paulo, 59Universidade Federal de Juiz de Fora, 60Sorbonne Université, 61École Normale Supérieure, 62C. N. 
Yang institute for Theoretical Physics, 63University of Luxembourg, 64University of Malaya, 65Rockwell Automation, 66Contramont Research, 67Washington University, 68CNRS, 69Université Paris-Saclay, 70University of Toronto, 71Google DeepMind, 72University of North Texas, 73Institut Polytechnique de Paris, 74TRR Designs, 75University of Chicago, 76Maastricht University, 77University of California, Los Angeles, 78Martin-Luther-University Halle-Wittenberg, 79Leibniz University Hannover, 80Indian Institute of Technology Bombay, 81University of Calgary, 82Institute for Molecular Manufacturing, 83University of Wisconsin-Madison, 84University of Michigan, 85Bethune-Cookman University, 86St. Petersburg College, 87La Molina National Agrarian University, 88University of Bath, 89National University Philippines, 90Vrije Universiteit Brussel, 91PeopleTec, Inc., 92New York University, 93Technion – Israel Institute of Technology, 94University of Miami, 95University of Maryland, 96Technische Universität Berlin, 97Arizona State University, 98University of Illinois Urbana-Champaign, 99Harvard University, 100Royal Holloway, University of London, 101Universidad Iberoamericana, 102TU Wien, 103Swinburne University of Technology, 104Yale University, 105University of Edinburgh, 106École Normale Supérieure Paris-Saclay, 107National Information Processing Institute, 108University College London, 109Ecco IT, 110University of Western Australia, 111Snorkel AI, 112Indiana State University, 113Oxford University, 114Mohamed bin Zayed University of Artificial Intelligence, 115University of Waterloo, 116Manhattan School of Music, 117Universiteit Leiden, 118Synbionix, 119Corteva Agriscience, 120Diverging Mathematics, 121Saint Mary's University, 122Emory University, 123Sanford Burnham Preybs, 124Yonsei University, 125Cornell University, 126University of Leeds, 127Politecnico di Milano, 128KU Leuven, 129Brandenburg University of Technology, 130INSAIT, 131Ruhr University Bochum, 132University Mohammed 
I, 133Georgia Institute of Technology, 134Northwestern University, 135University of Arizona, 136Universidade de Lisboa,, 137Mānuka Honey and Beekeeping Consultancy Ltd, 138Charles University, 139Duke University, 140Mila, 141University of Copenhagen, 142The University of Sydney, 143University of Technology Sydney, 144Indian Institute of Technology Delhi, 145University of Buenos Aires, 146University of Amsterdam, 147Ben-Gurion University, 148blurrylogic, 149Donald and Barbara Zucker School of Medicine, 150Cohere, 151Ivy Natal, 152Hebrew University, 153Fraunhofer IMTE, 154University of Pennsylvania, 155National Institute of Laser Enhanced Sciences, 156Drexel University, 157Northeastern University, 158EHC Investments LLC, 159University of Windsor, 160St. Jude Children’s Research Hospital, 161GC, 162Rochester Institute of Technology, 163Anthropic, 164CERN, 165University of California, Santa Barbara, 166University of Vienna, 167Warsaw University of Technology, 168EF Polymers Pvt Ltd, 169North Carolina State University, 170Independent researcher, 171Simplr AI, Asurion, 172All India Institute of Medical Sciences, 173Brown University, 174Johns Hopkins University, 175Ruhr-Universität Bochum, 176Standard Intelligence, 177Posts and Telecommunications Institute of Technology, 178Clearhorse Ltd, 179Cranfield University, 180JNTU, 181Image Processing Lab, Universitat de Valencia, 182Universität Zürich, 183UK AI Safety Institute, 184Boston University, 185SDAIA, 186Children’s Hospital of Orange County, 187The Ohio State University, 188Cairo University Specialized Pediatric Hospital, 189Universidad de Valencia, 190University of Arkansas, 191Monash University, 192OncoPrecision, 193Genomia Diagnostics Research Pvt Ltd, 194IEEE Life Member, 195Larkin Community Hospital, 196The University of Texas at Dallas, 197Canadian University Dubai, 198Università di Milano-Bicocca, 199University of Massachusetts Lowell, 200Virginia Tech, 201University of Geneva, 202Rutgers University, 203MolMind, 
204Cal Poly San Luis Obispo, 205Patched Codes, Inc, 206University of Mannheim, 207Chulalongkorn University, 208Ecole polytechnique, 209Stockholm University, 210AE Studio, 211Gaia Lab, 212Leibniz Institute for Science and Mathematics Education, 213Australian National University, 214Saarland University, 215College of Eastern Idaho, 216Intrinsic Innovation LLC, 217HUTECH, 218INRIA, 219King Saud University, 220Universidad de Buenos Aires, 221Pennsylvania College of Technology, 222CERo Therapeutics Holdings, Inc., 223The Univeirsty of Tennessee, 224Gray Swan AI, 225EleutherAI, 226University of Montpellier, 227HomeEquity Bank, 228Materials Platform for Data Science LLC, 229University of Trento, 230Fondazione Bruno Kessler, 231Cambridge University, 232LGM, 233Georgia State University, 234Polytechnic University of the Philippines, 235University of Oregon, 236University of Mumbai, 237University of Guelph, 238Case Wester Reserve University, 239Intuit, 240CTTC / CERCA, 241National University, 242Talishar, 243Dyno Therapeutics, 244The Hospital for Sick Children, 245Lewis Katz School of Medicine, 246Fyaora Labs, 247Intelligent Geometries, 248Indian Institute of Technology (BHU), 249Center for AI Safety, 250AIM Intelligence, 251Seoul National University, 252The University of Texas at Arlington, 253Missouri University of Science and Technology, 254POLITEHNICA Bucharest National University of Science and Technology, 255Abacus.AI, 256German Research Center for Artificial Intelligence, 257University of Houston, 258Eastern Institute of Technology (EIT), 259ENS Lyon, 260Czech Technical University in Prague, 261CISPA Helmholtz Center for Information Security, 262Universidad de Morón, 263Université Paris Cité and Sorbonne Université, 264Sheffield Hallam University, 265The New School, 266Max Planck Institute for Software Systems, 267OpenAI, 268École Polytechnique, 269Modulo Research, 270Heidelberg University, 271La Trobe University, 272University of Yaoundé I, 273Lux Labs, 274University 
of Innsbruck, 275Nabu Technologies Inc, 276Chalmers University of Technology, 277KTH Royal Institute of Technology, 278Unidade Local de Saúde de Lisboa Ocidental, 279Quotient AI, 280University of California, Irvine, 281University of Padua, 282Aalto University, 283Royal Veterinary College, 284The Future Paralegals of America, 285RMIT University, 286Universal Higher Education, 287Eastlake High School, 288CSMSS Chh. Shahu College of Engineering, 289Central Mindanao University, 290University of Montreal, 291University of Bradford, 292Beni Suef University, 293Bogazici University, 294Mansoura University, 295Univerisity of Bristol, 296University of Oklahoma, 297Jala University, 298Florida Atlantic University, 299CONICET, 300Universidad Tecnológica Nacional, 301Bournemouth University, 302University of Warwick, 303University of Alabama Huntsville, 304Van Andel Institute, 305University of Hertfordshire, 306Central College, 307Sheffield Teaching Hospitals NHS Foundation Trust, 308Nottingham Trent University, 309Max Planck Institute for Intelligent Systems, 310Outevsky Bespoke Dance Education, 311University of Virginia, 312Dartmouth College, 313INESC Microsistemas e Nanotecnologias, 314University of Minnesota, 315Aligarh Muslim University, 316John Crane UK Ltd, 317James Madison University, 318University of the Fraser Valley, 319Alan Turing Institute, 320Rice University, 321HUN-REN, 322Forschungszentrum Jülich

Introduction

Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam, a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. The dataset consists of 3,000 challenging questions across over a hundred subjects. We publicly release these questions, while maintaining a private test set of held-out questions to assess model overfitting.

Difficulty comparison across benchmarks

Compared against the saturation of some existing benchmarks, Humanity's Last Exam accuracy remains low across several frontier models, demonstrating its effectiveness for measuring advanced, closed-ended, academic capabilities.

Dataset

Humanity's Last Exam (HLE) is a global collaborative effort, with questions from nearly 1,000 subject expert contributors affiliated with over 500 institutions across 50 countries. Most contributors are professors, researchers, and graduate degree holders.
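As a minimal sketch of working with the public release, the snippet below wraps the `load_dataset("cais/hle")` call shown at the top of this page and adds two small helpers. The split name (`"test"`) and the field names (`image`, `category`) are assumptions that this page does not state, and the sample records are hypothetical; check the dataset card before relying on them. Access to the dataset may also require a Hugging Face login.

```python
from collections import Counter


def load_hle(split="test"):
    """Load the public HLE questions from the Hugging Face Hub.

    Requires the third-party `datasets` package and network access.
    The split name is an assumption, not something this page specifies.
    """
    # Imported lazily so the helpers below can be used offline.
    from datasets import load_dataset

    return load_dataset("cais/hle", split=split)


def text_only(rows, image_field="image"):
    """Keep records with no image attached (field missing, None, or empty)."""
    return [row for row in rows if not row.get(image_field)]


def count_by(rows, field="category"):
    """Count records per value of `field` (assumed field name)."""
    return Counter(row.get(field, "unknown") for row in rows)


# Hypothetical records that only mimic the assumed schema.
sample = [
    {"question": "Q1", "image": "", "category": "Math"},
    {"question": "Q2", "image": "data:image/png;base64,...", "category": "Classics"},
    {"question": "Q3", "image": None, "category": "Math"},
]

text_rows = text_only(sample)
per_category = count_by(text_rows)
```

A filter like `text_only` is one way to build the text-only subset needed for models that are not multi-modal; the same helpers should work on rows iterated from `load_hle()`, provided the field names match.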


Classics

Question:

Question image

Here is a representation of a Roman inscription, originally found on a tombstone. Provide a translation for the Palmyrene script.
A transliteration of the text is provided: RGYNᵓ BT ḤRY BR ᶜTᵓ ḤBL

Henry T

Merton College, Oxford

Ecology

Question:

Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number.

Edward V

Massachusetts Institute of Technology

Samples of the diverse and challenging questions submitted to Humanity's Last Exam.

Quantitative Results

Accuracy. All frontier models achieve low accuracy on Humanity's Last Exam, highlighting significant room for improvement in narrowing the gap between current LLMs and expert-level academic capabilities on closed-ended questions.

Calibration Error. Given low performance on Humanity's Last Exam, models should be calibrated, recognizing their uncertainty rather than confidently providing incorrect answers, which is indicative of confabulation/hallucination. To measure calibration, we prompt models to provide both an answer and their confidence from 0% to 100%.

| Model | Accuracy (%) ↑ | Calibration Error (%) ↓ |
|---|---|---|
| GPT-4o | 3.3 | 92.5 |
| Grok-2 | 3.8 | 93.2 |
| Claude 3.5 Sonnet | 4.3 | 88.9 |
| Gemini Thinking | 6.2 | 93.9 |
| o1 | 9.1 | 93.4 |
| DeepSeek-R1* | 9.4 | 81.8 |

*Model is not multi-modal, evaluated on text-only subset.
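To make the two reported metrics concrete, here is a minimal sketch of how accuracy and a calibration error could be computed from per-question results, given each model's stated confidence (0% to 100%) and whether its answer was judged correct. This page does not specify the exact estimator, so the binned root-mean-square calibration error below is an assumption, as are the function names and the choice of 10 equal-width bins.

```python
import math


def accuracy(correct):
    """Fraction of questions answered correctly."""
    return sum(1 for ok in correct if ok) / len(correct)


def rms_calibration_error(confidences_pct, correct, n_bins=10):
    """Binned RMS gap between stated confidence and observed accuracy.

    confidences_pct: stated confidences in percent (0 to 100).
    correct: booleans, True where the answer was judged correct.
    Returns a value in [0, 1]; multiply by 100 to express it in percent.
    """
    bin_width = 100.0 / n_bins
    bins = [[] for _ in range(n_bins)]
    for pct, ok in zip(confidences_pct, correct):
        # A confidence of exactly 100% falls into the last bin.
        idx = min(int(pct // bin_width), n_bins - 1)
        bins[idx].append((pct / 100.0, 1.0 if ok else 0.0))

    total = len(confidences_pct)
    weighted_sq_gap = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        bin_acc = sum(o for _, o in members) / len(members)
        weighted_sq_gap += (len(members) / total) * (mean_conf - bin_acc) ** 2
    return math.sqrt(weighted_sq_gap)


# Hypothetical results: a model that is 90% confident but right 1 time in 4.
overconfident_conf = [90, 90, 90, 90]
overconfident_ok = [False, False, False, True]
acc = accuracy(overconfident_ok)
err = rms_calibration_error(overconfident_conf, overconfident_ok)
```

Under this estimator, a model that states high confidence while answering mostly incorrectly gets a large error, which is the pattern the table above reports for current models.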

Discussion

Future Model Performance

While current LLMs achieve very low accuracy on Humanity's Last Exam, recent history shows that benchmarks are quickly saturated, with models progressing dramatically from near-zero to near-perfect performance in a short timeframe. Given the rapid pace of AI development, it is plausible that models could exceed 50% accuracy on HLE by the end of 2025. High accuracy on HLE would demonstrate expert-level performance on closed-ended, verifiable questions and cutting-edge scientific knowledge, but it would not alone suggest autonomous research capabilities or "artificial general intelligence." HLE tests structured academic problems rather than open-ended research or creative problem-solving abilities, making it a focused measure of technical knowledge and reasoning. HLE may be the last academic exam we need to give to models, but it is far from the last benchmark for AI.

Impact

By providing a clear measure of AI progress, Humanity's Last Exam creates a common reference point for scientists and policymakers to assess AI capabilities. This enables more informed discussions about development trajectories, potential risks, and necessary governance measures.

Citation

@article{phan2025hle,
title={Humanity's Last Exam},
author={Phan, Long and Gatti, Alice and Han, Ziwen and Li, Nathaniel and 
Hu, Josephina and Zhang, Hugh and Shi, Sean and Choi, Michael and 
Agrawal, Anish and Chopra, Arnav and Khoja, Adam and Kim, Ryan and 
Hausenloy, Jason and Zhang, Oliver and Mazeika, Mantas and Anderson, Daron and Nguyen, Tung and Mahmood, Mobeen and Feng, Fiona and Y. Feng, Steven and Zhao, Haoran and Yu, Michael and Zou, Chelsea and Wang, Zihan and P. Wang, Jessica and Kumar, Pawan and Pokutnyi, Oleksandr and Gerbicz, Robert and Popov, Serguei and Levin, John-Clark and Schmitt, Johannes and Galgon, Geoff and Sanchez, Alvaro and Lee, Yongki and Yeadon, Will and Sauers, Scott and Roth, Marc and Agu, Chidozie and Riis, Søren and Giska, Fabian and Utpala, Saiteja and Giboney, Zachary and M. Goshu, Gashaw and of Arc Xavier, Joan and Crowson, Sarah-Jane and Maheshbhai Naiya, Mohinder and Burns, Noah and Finke, Lennart and Cheng, Zerui and Park, Hyunwoo and Fournier-Facio, Francesco and Wydallis, John and Nandor, Mark and Singh, Ankit and Gehrunger, Tim and Cai, Jiaqi and McCarty, Ben and Duclosel, Darling and Nam, Jungbae and Zampese, Jennifer and G. Hoerr, Ryan and Bacho, Aras and Abou Loume , Gautier and Galal, Abdallah and Cao, Hangrui and C Garretson, Alexis and Sileo, Damien and Ren, Qiuyu and Cojoc, Doru and Arkhipov, Pavel and Qazi, Usman and Li, Lianghui and Motwani, Sumeet and Schroeder de Witt, Christian and Taylor, Edwin and Veith, Johannes and D. Hartman, Taylor and Rissone, Paolo and Jin, Jaehyeok and Wei Lun Shi, Jack and G. Willcocks, Chris and Robinson, Joshua and Mikov, Aleksandar and Prabhu, Ameya and Tang, Longke and Alapont, Xavier and Zhou, Kevin and de Oliveira Santos, Emily and Pupasov Maksimov, Andrey and Vendrow, Edward and Zenitani, Kengo and Guillod, Julien and Li, Yuqi and Vendrow, Joshua and Kuchkin , Vladyslav and Ze-An, Ng and Marion, Pierre and Efremov, Denis and Lynch, Jayson and Liang, Kaiqu and Gritsevskiy, Andrew and Martinez, Dakotah and Pageler, Ben and Crispino, Nick and Zvonkine, Dmimitri and Wildner Fraga, Natanael and Soori, Saeed and Press, Ori and Tang, Henry and Salazar, Julian and R. 
Green, Sean and Brüssel, Lina and Twayana, Moon and Dieuleveut, Aymeric and Ryan Rogers, T. and Zhang, Wenjin and Li, Bikun and Yang, Jinzhou and Rao, Arun and Loiseau, Gabriel and Kalinin, Mikhail and Lukas, Marco and Manolescu, Ciprian and , Subrata and Ghislain Kemogne Kamdoum, Ariel and Kreiman, Tobias and Hogg, Tad and Jin, Alvin and Bosio, Carlo and Sun, Gongbo and P Coppola, Brian and Tarver, Tim and Heidinger, Haline and Sayous, Rafael and Ivanov, Stefan and M Cavanagh, Joseph and Shen, Jiawei and Marvin Imperial, Joseph and Schwaller, Philippe and Senthilkuma, Shaipranesh and M Bran, Andres and Dehghan, Ali and Algaba, Andres and Verbeken, Brecht and Noever, David and P V, Ragavendran and Schut, Lisa and Sucholutsky, Ilia and Zheltonozhskii, Evgenii and Lim, Derek and Stanley, Richard and Sivarajan , Shankar and Yang, Tong and Maar, John and Wykowski, Julian and Oller, Martí and Sandlin, Jennifer and Sahu, Anmol and Hu, Yuzheng and Fish, Sara and Heydari, Nasser and Apronti, Archimedes and Rawal, Kaivalya and Garcia Vilchis, Tobias and Zu, Yuexuan and Lackner, Martin and Koppel, James and Nguyen, Jeremy and S. 
Antonenko, Daniil and Chern, Steffi and Zhao, Bingchen and Arsene, Pierrot and Goldfarb, Alan and Ivanov, Sergey and Poświata, Rafał and Wang, Chenguang and Li, Daofeng and Crisostomi, Donato and Achilleos, Andrea and Myklebust, Benjamin and Sen, Archan and Perrella, David and Kaparov, Nurdin and H Inlow, Mark and Zang, Allen and Thornley, Elliott and Orel, Daniil and Poritski, Vladislav and Ben-David, Shalev and Berger, Zachary and Whitfill, Parker and Foster, Michael and Munro, Daniel and Ho, Linh and Bar Hava, Dan and Kuchkin, Aleksey and Lauff, Robert and Holmes, David and Sommerhage, Frank and Schneider, Keith and Kazibwe, Zakayo and Stambaugh, Nate and Singh, Mukhwinder and Magoulas, Ilias and Clarke, Don and Hyun Kim, Dae and Meneguitti Dias, Felipe and Elser, Veit and Priya Agarwal, Kanu and Efren Guadarrama Vilchis, Victor and Klose, Immo and Demian, Christoph and Anantheswaran, Ujjwala and Zweiger, Adam and Albani, Guglielmo and Li, Jeffery and Daans, Nicolas and Radionov, Maksim and Rozhoň, Václav and Ma, Ziqiao and Stump, Christian and Berkani, Mohammed and Platnick, Jacob and Nevirkovets, Volodymyr and Basler, Luke and Piccardo, Marco and Jeanplong, Ferenc and Cohen, Niv and Gangal, Varun and Tkadlec, Josef and Rosu, Paul and Padlewski, Piotr and Stanislaw Barzowski,  and Montgomery, Kyle and Menezes, Aline and Patel, Arkil and Wang, Zixuan and Tucker-Foltz, Jamie and Stade, Jack and Goertzen, Tom and Kazemi, Fereshteh and Milbauer, Jeremiah and Arnold Ambay, John and Shukla, Abhishek and Carlos Leyva Labrador, Yan and Givré, Alan and Wolff, Hew and Rossbach , Vivien and Fayez Aziz, Muhammad and Kaddar, Younesse and Chen, Yanxu and Zhang, Robin and Pan, Jiayi and Terpin, Antonio and Muennighoff, Niklas and Schoelkopf, Hailey and Zheng, Eric and Carmi, Avishy and Jones, Adam and Shah, Jainam and D. L. 
Brown, Ethan and Zhu, Kelin and Bartolo, Max and Wheeler, Richard and Ho, Andrew and Barkan, Shaul and Wang, Jiaqi and Stehberger, Martin and Kretov, Egor and Sridhar, Kaustubh and EL-Wasif, Zienab and Zhang, Anji and Pyda, Daniel and Tam, Joanna and M. Cunningham, David and Patramanis, Demosthenes and Krause, Michael and Redenti, Andrew and Bugas, Daniel and Aldous, David and Lai, Jesyin and Coleman, Shannon and Bahaloo, Mohsen and Xu, Jiangnan and Lee, Sangwon and Zhao, Sandy and Tang, Ning and K. Cohen, Michael and Carroll, Micah and Paradise, Orr and Hendrik Kirchner, Jan and Steinerberger, Stefan and Ovchynnikov, Maksym and O. Matos, Jason and Shenoy, Adithya and Alves de Oliveira Junior, Benedito and Wang, Michael and Nie, Yuzhou and Giordano, Paolo and Petersen, Philipp and Sztyber-Betley, Anna and Shukla, Priti and Crozier, Jonathan and Pinto, Antonella and Verma, Shreyas and Joshi, Prashant and Yong, Zheng-Xin and Tee, Allison and Andréoletti, Jérémy and Weller, Orion and Singhal, Raghav and Zhang, Gang and Ivanov, Alexander and Khoury, Seri and Mostaghimi, Hamid and Thaman, Kunvar and Chen, Qijia and Quốc Khánh, Trần and Loader, Jacob and Cavalleri, Stefano and Szlyk, Hannah and Brown, Zachary and Roberts, Jonathan and Alley, William and Sun, Kunyang and Stendall, Ryan and Lamparth, Max and Reuel, Anka and Wang, Ting and Xu, Hanmeng and Goud Raparthi, Sreenivas and Hernández-Cámara, Pablo and Martin, Freddie and Malishev, Dmitry and Preu, Thomas and Korbak, Tomek and Abramovitch, Marcus and Williamson, Dominic and Chen, Ziye and Bálint, Biró and Saiful Bari, M and Kassani, Peyman and Wang, Zihao and Ansarinejad, Behzad and Prasad Goswami, Laxman and Sun, Yewen and Elgnainy, Hossam and Tordera, Daniel and Balabanian, George and Anderson, Earth and Kvistad, Lynna and José Moyano, Alejandro and Maheshwari , Rajat and Sakor, Ahmad and Eron, Murat and C. 
McAlister, Isaac and Gimenez, Javier and Enyekwe, Innocent and Favre D.O., Andrew and Shah, Shailesh and Zhou, Xiaoxiang and Kamalov, Firuz and Clark, Ronald and Abdoli, Sherwin and Meer, Khalida and K Wang, Harrison and Chen, Evan and Tomasiello, Alessandro and Looi, Shi-Zhuo and Le, Vinh-Kha and Kolt, Noam and Mündler, Niels and Semler, Avi and Rodman, Emma and Drori, Jacob and J Fossum, Carl and Jagota, Milind and Pradeep, Ronak and Fan, Honglu and Shah, Tej and Shah, Tej and Eicher , Jonathan and Chen, Michael and Thaman, Kushal and Merrill, William and Harris, Carter and Gross, Jason and Gusev, Ilya and Sharma, Asankhaya and Agnihotri, Shashank and Zhelnov, Pavel and Usawasutsakorn, Siranut and Mofayezi, Mohammadreza and Bogdanov, Sergei and Piperski, Alexander and Carauleanu, Marc and K. Zhang, David and Ler, Dylan and Leventov, Roman and Soroko, Ignat and Jansen, Thorben and Lauer, Pascal and Duersch, Joshua and Taamazyan, Vage and Morak, Wiktor and Ma, Wenjie and Held, William and Đuc Huy, Tran and Xian, Ruicheng and Randy Zebaze, Armel and Mohamed, Mohanad and Noah Leser, Julian and X Yuan, Michelle and Yacar, Laila and Lengler, Johannes and Shahrtash, Hossein and Oliveira, Edson and W. 
Jackson, Joseph and Espinosa Gonzalez, Daniel and Zou, Andy and Chidambaram, Muthu and Manik, Timothy and Haffenden, Hector and Stander, Dashiell and Dasouqi, Ali and Shen, Alexander and Duc, Emilien and Golshani, Bita and Stap, David and Uzhou, Mikalai and Alina Borisovna Zhidkovskaya and Lewark, Lukas and Vincze, Mátyás and Wehr, Dustin and Tang, Colin and Hossain, Zaki and Phillips, Shaun and Muzhen, Jiang and Ekström, Fredrik and Hammon, Angela and Patel, Oam and Remy, Nicolas and Farhidi, Faraz and Medley, George and Mohammadzadeh, Forough and Peñaflor, Madellene and Kassahun, Haile and Friedrich, Alena and Sparrow, Claire and Sakal, Taom and Dhamane, Omkar and Khajegili Mirabadi, Ali and Hallman, Eric and Battaglia, Mike and Maghsoudimehrabani, Mohammad and Hoang, Hieu and Amit, Alon and Hulbert, Dave and Pereira, Roberto and Weber, Simon and Mensah, Stephen and Andre, Nathan and Peristyy, Anton and Harjadi, Chris and Gupta, Himanshu and Malina, Stephen and Albanie, Samuel and Cai, Will and Mehkary, Mustafa and Reidegeld, Frank and Dick, Anna-Katharina and Friday, Cary and Sidhu, Jasdeep and Kim, Wanyoung and Costa, Mariana and Gurdogan, Hubeyb and Weber, Brian and Kumar, Harsh and Jiang, Tong and Agarwal, Arunim and Ceconello, Chiara and S. Vaz, Warren and Zhuang, Chao and Park, Haon and R. Tawfeek, Andrew and Aggarwal, Daattavya and Kirchhof, Michael and Dai, Linjie and Kim, Evan and Ferret, Johan and Wang, Yuzhou and Yan, Minghao and Burdzy, Krzysztof and Zhang, Lixin and Franca, Antonio and T. Pham, Diana and Yong Loh, Kang and Robinson, Joshua and Gul, Shreen and Chhablani, Gunjan and Du, Zhehang and Cosma, Adrian and White, Colin and Riblet, Robin and Saxena, Prajvi and Votava, Jacob and Vinnikov, Vladimir and Halasyamani, Shiv and M.
Shahid, Syed and Mourrat, Jean-Christophe and Vetoshkin, Lavr and Bacho, Renas and Ginis, Vincent and Maksapetyan, Aleksandr and de la Rosa, Florencia and Li, Xiuyu and Malod, Guillaume and Lang, Leon and Laurendeau, Julien and Adesanya, Fatimah and Portier, Julien and Hollom, Lawrence and Souza, Victor and Anna Zhou, Yuchen and Yalın, Yiğit and Daniel Obikoya, Gbenga and Arnaboldi, Luca and (Michael Pokorny), Rai and Bigi, Filippo and Bacho, Kaniuar and Clavier, Pierre and Recchia, Gabriel and Popescu, Mara and Shulga, Nikita and Mildred Tanwie, Ngefor and C.H. Lux, Thomas and Rank, Ben and Ni, Colin and Yakimchyk, Alesia and (Quinn) Liu, Huanxu and Häggström, Olle and Verkama, Emil and Narayan, Himanshu and Gundlach, Hans and Brito-Santana, Leonor and Amaro, Brian and Vajipey, Vivek and Grover, Rynaa and Fan, Yiyang and Poesia Reis e Silva, Gabriel and Xin, Linwei and Kratish, Yosi and Łucki, Jakub and Li, Wen-Ding and Xu, Justin and Joseph Scaria, Kevin and Vargus, Freddie and Habibi, Farzad and (Tony) Lian, Long and Rodolà, Emanuele and Robins, Jules and Cheng, Vincent and Grabb, Declan and Bosio, Ida and Fruhauff, Tony and Akov, Ido and J. Y. Lo, Eve and Qi, Hao and Jiang, Xi and Segev, Ben and Fan, Jingxuan and Martinson, Sarah and Y. Wang, Erik and Hausknecht, Kaylie and P. Brenner, Michael and Mao, Mao and Jiang, Yibo and Zhang, Xinyu and Avagian, David and Jessica Scipio, Eshawn and Rehan Siddiqi, Muhammad and Ragoler, Alon and Tan, Justin and Patil, Deepakkumar and Plecnik, Rebeka and Kirtland, Aaron and Grace Montecillo, Roselynn and Durand, Stephane and Faruk Bodur, Omer and Adoul, Zahra and Zekry, Mohamed and Douville, Guillaume and Karakoc, Ali and C. B.
Santos, Tania and Shamseldeen, Samir and Karim, Loukmane and Liakhovitskaia, Anna and Resman, Nate and Farina, Nicholas and Carlos Gonzalez, Juan and Maayan, Gabe and Hoback, Sarah and De Oliveira Pena, Rodrigo and Sherman, Glen and Mariji, Hodjat and Pouriamanesh, Rasoul and Wu, Wentao and Demir, Gözdenur and Mendoza, Sandra and Alarab, Ismail and Cole, Joshua and Ferreira, Danyelle and Johnson, Bryan and Milliron, Hsiaoyun and Safdari, Mohammad and Dai, Liangti and Arthornthurasuk, Siriphan and Pronin, Alexey and Ramirez-Trinidad, Angel and Cartwright, Ashley and Pottmaier, Daphiny and Taheri, Omid and Outevsky, David and Stepanic, Stanley and Perry, Samuel and Askew, Luke and Adrián Huerta Rodríguez, Raúl and Dendane, Abdelkader and Lorena, Ricardo and Iyer, Krishnamurthy and Md Salauddin, Sk and Islam, Murat and Gonzalez, Juan and Ducey, Josh and Campbell, Russell and Somrak, Maja and Mavroudis, Vasilios and Vergo, Eric and Qin, Juehang and Borbás, Benjámin and Chu, Eric and Lindsey, Jack and Radhakrishnan, Anil and Jallon, Antoine and McInnis, I.M.J. and Hoover, Alex and Möller, Sören and Patwardhan, Tejal and Yue, Summer and Wang, Alexandr and Hendrycks, Dan},
journal={arXiv},
year={2025}
}
For any inquiries or feedback, please contact us at agibenchmark@safe.ai
Submit feedback on questions in the dataset via this form