EloRater: Best AI Research Paper?

Submit your groundbreaking AI research paper for a chance to be recognized as the best in the field. Papers will be judged on innovation, impact, and clarity.

Total Votes: 41
Rank | Elo Rating | Paper Title | Abstract | Author | Link
1 | 1100 | Attention Is All You Need | This paper introduces the Transformer, a novel neural network architecture abandoning recurrence an… | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kai… | Link
2 | 1058 | ImageNet Classification with Deep Convolutional Neural Networks | This paper presents a large, deep convolutional neural network (CNN) that achieved groundbreaking r… | Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton | Link
3 | 1046 | Variational Lossy Autoencoder | This paper introduces the Variational Lossy Autoencoder (VLAE), a generative model combining Variat… | Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskev… | Link
4 | 1031 | The Unreasonable Effectiveness of Recurrent Neural Networks | This article highlights the surprising power and simplicity of Recurrent Neural Networks (RNNs). It… | Andrej Karpathy | Link
5 | 1029 | A Simple Neural Network Module for Relational Reasoning | This paper introduces Relation Networks (RNs), a simple, plug-and-play neural network module design… | Adam Santoro, David Raposo, David G.T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia… | Link
6 | 1017 | The First Law of Complexodynamics | This article explores Sean Carroll's question about why complexity in physical systems seems to ris… | Scott Aaronson | Link
7 | 1016 | Keeping Neural Networks Simple by Minimizing the Description Length of the Weights | This paper proposes a neural network regularization method based on the Minimum Description Length … | Geoffrey E. Hinton, Drew van Camp | Link
8 | 1015 | Variational Lossy Autoencoder | This paper introduces the Variational Lossy Autoencoder (VLAE), a generative model combining Variat… | Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskev… | Link
9 | 1001 | Relational Recurrent Neural Networks | This paper addresses the limitations of standard recurrent architectures like LSTMs in tasks demand… | Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Théophane Weber, Daan Wierst… | Link
10 | 1000 | Generative Ghosts: Anticipating Benefits and Risks of AI Afterlives | As AI systems quickly improve in both breadth and depth of performance, they lend themselves to cre… | Meredith Ringel Morris and Jed R. Brubaker | Link
11 | 1000 | Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA | Large language models (LLMs) are computationally expensive to deploy. Parameter sharing offers a pr… | Sangmin Bae, Adam Fisch, Hrayr Harutyunyan, Ziwei Ji, Seungyeon Kim, Tal Schuster | Link
12 | 1000 | Massively Scalable Inverse Reinforcement Learning for Route Optimization | Globally-scalable route optimization based on human preferences remains an open problem. Although p… | Matt Barnes, Matthew Abueg, Oliver F. Lange, Matt Deeds, Jason Trader, Denali Molitor, Markus Wulfm… | Link
13 | 1000 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Foundation models, now powering most of the exciting applications in deep learning, are almost univ… | Albert Gu, Tri Dao | Link
14 | 1000 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Z… | DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, … | Link
15 | 1000 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | We introduce a new language representation model called BERT, which stands for Bidirectional Encode… | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | Link
16 | 1000 | Bridging Algorithmic Information Theory and Machine Learning, Part II: Clustering, Density Estimati… | Machine Learning (ML) and Algorithmic Information Theory (AIT) offer distinct yet complementary app… | Marcus Hutter | Link
17 | 1000 | MELODI: Exploring Memory Compression for Long Contexts | We present MELODI, a novel memory architecture designed to efficiently process long documents using… | Yinpeng Chen, DeLesley Hutchins, Aren Jansen, Andrey Zhmoginov, David Racz, Jesper Andersen | Link
18 | 1000 | Generative Ghosts: Anticipating Benefits and Risks of AI Afterlives | As AI systems quickly improve in both breadth and depth of performance, they lend themselves to cre… | Meredith Ringel Morris and Jed R. Brubaker | Link
19 | 1000 | Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty | User prompts for generative AI models are often underspecified, leading to sub-optimal responses. T… | Meera Hahn, Wenjun Zeng, Nithish Kannen, Rich Galt, Kartikeya Badola, Been Kim, Zi Wang | Link
20 | 1000 | Generative Ghosts: Anticipating Benefits and Risks of AI Afterlives | As AI systems quickly improve in both breadth and depth of performance, they lend themselves to cre… | Meredith Ringel Morris and Jed R. Brubaker | Link
21 | 1000 | Flow-Lenia: Emergent evolutionary dynamics in mass conservative continuous cellular automata | Central to the artificial life endeavour is the creation of artificial systems spontaneously genera… | Erwan Plantec, Gautier Hamon, Mayalen Etcheverry, Bert Wang-Chak Chan, Pierre-Yves Oudeyer, Clément… | Link
22 | 1000 | Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3 | The introduction of AlphaFold 2 has spurred a revolution in modelling the structure of proteins an… | Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneber… | Link
23 | 1000 | Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA | Large language models (LLMs) are computationally expensive to deploy. Parameter sharing offers a pr… | Sangmin Bae, Adam Fisch, Hrayr Harutyunyan, Ziwei Ji, Seungyeon Kim, Tal Schuster | Link
24 | 998 | Multi-Scale Context Aggregation by Dilated Convolutions | This paper introduces dilated convolutions (also known as atrous convolutions) as a method to aggre… | Fisher Yu, Vladlen Koltun | Link
25 | 988 | Understanding LSTM Networks | This article explains Long Short-Term Memory (LSTM) networks, a type of Recurrent Neural Network (R… | Christopher Olah | Link
26 | 986 | Deep Residual Learning for Image Recognition | This paper introduces Deep Residual Networks (ResNets), an architecture designed to ease the traini… | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | Link
27 | 985 | Identity Mappings in Deep Residual Networks | This paper analyzes the critical role of identity mappings within the shortcut connections of Deep … | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | Link
28 | 971 | Neural Machine Translation by Jointly Learning to Align and Translate | This paper addresses limitations in traditional neural machine translation (NMT) encoder-decoder mo… | Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio | Link
29 | 970 | Order Matters: Sequence to Sequence for Sets | This paper investigates the impact of input and output element order on sequence-to-sequence (seq2s… | Oriol Vinyals, Samy Bengio, Manjunath Kudlur | Link
30 | 968 | GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism | This paper introduces GPipe, a library facilitating the training of extremely large neural networks… | Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Ji… | Link
31 | 954 | Neural Turing Machines | This paper introduces Neural Turing Machines (NTMs), a neural network architecture augmented with a… | Alex Graves, Greg Wayne, Ivo Danihelka | Link
32 | 941 | Pointer Networks | This paper introduces Pointer Networks (Ptr-Nets), a neural architecture designed to address the li… | Oriol Vinyals, Meire Fortunato, Navdeep Jaitly | Link
33 | 926 | Neural Message Passing for Quantum Chemistry | This paper introduces Message Passing Neural Networks (MPNNs) as a framework for predicting quantum… | Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl | Link
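
The page does not say how the ratings above are computed. As a rough illustration only, the sketch below applies the standard Elo update to a single head-to-head vote. The K-factor of 32 and the function names are assumptions made for this example, not details taken from EloRater; the starting rating of 1000 is also a guess, based on the value most entries show.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one vote.

    k = 32 is an assumed K-factor; the site's actual value is not stated.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


if __name__ == "__main__":
    # Two papers that both start at the assumed default rating of 1000.
    new_winner, new_loser = update(1000.0, 1000.0)
    print(new_winner, new_loser)
```

Under these assumptions, a vote between two equally rated papers moves each rating by the same number of points in opposite directions, and a win over a higher-rated paper earns more points than a win over a lower-rated one.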