Awesome AI Papers ⭐️

Description

This repository is an up-to-date list of significant AI papers organized by publication date. It covers five fields: computer vision, natural language processing, audio processing, multimodal learning, and reinforcement learning. Feel free to give this repository a star if you enjoy the work.

Maintainer: Aimerou Ndiaye

Taxonomy

To select the most relevant papers, we set subjective thresholds on citation counts. Each icon below designates a paper type that meets one of these criteria.

🏆 Historical Paper: more than 10k citations and a decisive impact on the evolution of AI.

⭐ Important Paper: more than 50 citations and state-of-the-art results.

⏫ Trend: 1 to 50 citations, a recent and innovative paper with growing adoption.

📰 Important Article: decisive work that was not accompanied by a research paper.
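The citation thresholds above can be sketched as a small function. This is only an illustration: the name `classify` and its `has_paper` parameter are hypothetical, and the qualitative parts of each criterion (decisive impact, state-of-the-art results, growing adoption) still need human judgment and are not captured here.

```python
from typing import Optional


def classify(citations: int, has_paper: bool = True) -> Optional[str]:
    """Map a citation count to a taxonomy icon (thresholds only).

    Hypothetical helper: it covers the numeric limits of the taxonomy,
    not the qualitative judgment that accompanies each category.
    """
    if not has_paper:
        return "📰"  # Important Article: no accompanying research paper
    if citations > 10_000:
        return "🏆"  # Historical Paper: more than 10k citations
    if citations > 50:
        return "⭐"  # Important Paper: more than 50 citations
    if citations >= 1:
        return "⏫"  # Trend: 1 to 50 citations
    return None  # no citations yet: not covered by the taxonomy


if __name__ == "__main__":
    for count in (12_000, 300, 50, 0):
        print(count, classify(count))
```

Note that a paper with exactly 50 citations falls under Trend, since Important Paper requires more than 50.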

2023 Papers

2022 Papers

Computer Vision

⭐ 01/2022: A ConvNet for the 2020s (ConvNeXt)
⭐ 01/2022: Patches Are All You Need (ConvMixer)
⭐ 02/2022: Block-NeRF: Scalable Large Scene Neural View Synthesis (Block-NeRF)
⭐ 03/2022: DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection (DINO)
⭐ 03/2022: Scaling Up Your Kernels to 31×31: Revisiting Large Kernel Design in CNNs (Large Kernel CNN)
⭐ 03/2022: TensoRF: Tensorial Radiance Fields (TensoRF)
⭐ 04/2022: MaxViT: Multi-Axis Vision Transformer (MaxViT)
⭐ 04/2022: Hierarchical Text-Conditional Image Generation with CLIP Latents (DALL-E 2)
⭐ 05/2022: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)
⭐ 05/2022: GIT: A Generative Image-to-text Transformer for Vision and Language (GIT)
⭐ 06/2022: CMT: Convolutional Neural Networks Meet Vision Transformers (CMT)
⭐ 07/2022: Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors... (Swin UNETR)
⭐ 07/2022: Classifier-Free Diffusion Guidance
⭐ 08/2022: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation (DreamBooth)
⭐ 09/2022: DreamFusion: Text-to-3D using 2D Diffusion (DreamFusion)
⭐ 09/2022: Make-A-Video: Text-to-Video Generation without Text-Video Data (Make-A-Video)
⭐ 10/2022: On Distillation of Guided Diffusion Models
⭐ 10/2022: LAION-5B: An open large-scale dataset for training next generation image-text models (LAION-5B)
⭐ 10/2022: Imagic: Text-Based Real Image Editing with Diffusion Models (Imagic)
⭐ 11/2022: Visual Prompt Tuning
⭐ 11/2022: Magic3D: High-Resolution Text-to-3D Content Creation (Magic3D)
⭐ 11/2022: DiffusionDet: Diffusion Model for Object Detection (DiffusionDet)
⭐ 11/2022: InstructPix2Pix: Learning to Follow Image Editing Instructions (InstructPix2Pix)
⭐ 12/2022: Multi-Concept Customization of Text-to-Image Diffusion (Custom Diffusion)
⭐ 12/2022: Scalable Diffusion Models with Transformers (DiT)

NLP

⭐ 01/2022: LaMDA: Language Models for Dialog Applications (LaMDA)
⭐ 01/2022: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (CoT)
⭐ 02/2022: Competition-Level Code Generation with AlphaCode (AlphaCode)
⭐ 02/2022: Finetuned Language Models Are Zero-Shot Learners (FLAN)
⭐ 03/2022: Training language models to follow human instructions with human feedback (InstructGPT)
⭐ 03/2022: Multitask Prompted Training Enables Zero-Shot Task Generalization (T0)
⭐ 03/2022: Training Compute-Optimal Large Language Models (Chinchilla)
⭐ 04/2022: Do As I Can, Not As I Say: Grounding Language in Robotic Affordances (SayCan)
⭐ 04/2022: GPT-NeoX-20B: An Open-Source Autoregressive Language Model (GPT-NeoX)
⭐ 04/2022: PaLM: Scaling Language Modeling with Pathways (PaLM)
⭐ 06/2022: Beyond the Imitation Game: Quantifying and extrapolating the capabilities of lang... (BIG-bench)
⭐ 06/2022: Solving Quantitative Reasoning Problems with Language Models (Minerva)
⭐ 10/2022: ReAct: Synergizing Reasoning and Acting in Language Models (ReAct)
⭐ 11/2022: BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (BLOOM)
📰 11/2022: Optimizing Language Models for Dialogue (ChatGPT)
⭐ 12/2022: Large Language Models Encode Clinical Knowledge (Med-PaLM)

Audio Processing

⭐ 02/2022: mSLAM: Massively multilingual joint pre-training for speech and text (mSLAM)
⭐ 02/2022: ADD 2022: the First Audio Deep Synthesis Detection Challenge (ADD)
⭐ 03/2022: Efficient Training of Audio Transformers with Patchout (PaSST)
⭐ 04/2022: MAESTRO: Matched Speech Text Representations through Modality Matching (MAESTRO)
⭐ 05/2022: SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language... (SpeechT5)
⭐ 06/2022: WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing (WavLM)
⭐ 07/2022: BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for ASR (BigSSL)
⭐ 08/2022: MuLan: A Joint Embedding of Music Audio and Natural Language (MuLan)
⭐ 09/2022: AudioLM: a Language Modeling Approach to Audio Generation (AudioLM)
⭐ 09/2022: AudioGen: Textually Guided Audio Generation (AudioGen)
⭐ 10/2022: High Fidelity Neural Audio Compression (EnCodec)
⭐ 12/2022: Robust Speech Recognition via Large-Scale Weak Supervision (Whisper)

Multimodal Learning

⭐ 01/2022: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language... (BLIP)
⭐ 02/2022: data2vec: A General Framework for Self-supervised Learning in Speech, Vision and... (Data2vec)
⭐ 03/2022: VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks (VL-Adapter)
⭐ 04/2022: Winoground: Probing Vision and Language Models for Visio-Linguistic... (Winoground)
⭐ 04/2022: Flamingo: a Visual Language Model for Few-Shot Learning (Flamingo)
⭐ 05/2022: A Generalist Agent (Gato)
⭐ 05/2022: CoCa: Contrastive Captioners are Image-Text Foundation Models (CoCa)
⭐ 05/2022: VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts (VLMo)
⭐ 08/2022: Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks (BEiT-3)
⭐ 09/2022: PaLI: A Jointly-Scaled Multilingual Language-Image Model (PaLI)

Reinforcement Learning

⭐ 01/2022: Learning robust perceptive locomotion for quadrupedal robots in the wild
⭐ 02/2022: BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning
⭐ 02/2022: Outracing champion Gran Turismo drivers with deep reinforcement learning (Sophy)
⭐ 02/2022: Magnetic control of tokamak plasmas through deep reinforcement learning
⭐ 08/2022: Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning (ANYmal)
⭐ 10/2022: Discovering faster matrix multiplication algorithms with reinforcement learning (AlphaTensor)

Other Papers

⭐ 02/2022: FourCastNet: A Global Data-driven High-resolution Weather Model... (FourCastNet)
⭐ 05/2022: ColabFold: making protein folding accessible to all (ColabFold)
⭐ 06/2022: Measuring and Improving the Use of Graph Information in GNN
⭐ 10/2022: TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis (TimesNet)
⭐ 12/2022: RT-1: Robotics Transformer for Real-World Control at Scale (RT-1)

Historical Papers

🏆 1958: Perceptron: A probabilistic model for information storage and organization in the brain (Perceptron)
🏆 1986: Learning representations by back-propagating errors (Backpropagation)
🏆 1986: Induction of decision trees (ID3)
🏆 1989: A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition (HMM)
🏆 1989: Multilayer feedforward networks are universal approximators
🏆 1992: A training algorithm for optimal margin classifiers (SVM)
🏆 1996: Bagging predictors
🏆 1998: Gradient-based learning applied to document recognition (CNN/GTN)
🏆 2001: Random Forests
🏆 2001: A fast and elitist multiobjective genetic algorithm (NSGA-II)
🏆 2003: Latent Dirichlet Allocation (LDA)
🏆 2006: Reducing the Dimensionality of Data with Neural Networks (Autoencoder)
🏆 2008: Visualizing Data using t-SNE (t-SNE)
🏆 2009: ImageNet: A large-scale hierarchical image database (ImageNet)
🏆 2012: ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)
🏆 2013: Efficient Estimation of Word Representations in Vector Space (Word2vec)
🏆 2013: Auto-Encoding Variational Bayes (VAE)
🏆 2014: Generative Adversarial Networks (GAN)
🏆 2014: Dropout: A Simple Way to Prevent Neural Networks from Overfitting (Dropout)
🏆 2014: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
🏆 2014: Adam: A Method for Stochastic Optimization (Adam)
🏆 2015: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Cov... (BatchNorm)
🏆 2015: Going Deeper With Convolutions (Inception)
🏆 2015: Human-level control through deep reinforcement learning (Deep Q Network)
🏆 2015: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (Faster R-CNN)
🏆 2015: U-Net: Convolutional Networks for Biomedical Image Segmentation (U-Net)
🏆 2015: Deep Residual Learning for Image Recognition (ResNet)
🏆 2016: You Only Look Once: Unified, Real-Time Object Detection (YOLO)
🏆 2017: Attention is All you Need (Transformer)
🏆 2018: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (BERT)
🏆 2020: Language Models are Few-Shot Learners (GPT-3)
🏆 2020: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)
🏆 2021: Highly accurate protein structure prediction with AlphaFold (AlphaFold)
📰 2022: ChatGPT: Optimizing Language Models For Dialogue (ChatGPT)
