
World's Most Comprehensive Curated List of LLM Papers & Repositories [Project Page] [Notion]

Target fields

CoT / VLM / Quantization / Grounding / Text2IMG&VID / Prompt / Reasoning / Robot / Agent / Planning / RL / Feedback / InContextLearning / InstructionTuning / PEFT / RLHF / RAG / Embodied / VQA / Hallucination / Diffusion / Scaling / ContextWindow / WorldModel / Memory / ZeroShot

Mission

Our mission is to provide an organized, wide-ranging selection of the latest high-quality papers on Large Language Models (LLMs), so that researchers can stay abreast of developments in this rapidly evolving field.

Notion Table

[Screenshot of the Notion table]

Awesome repos

Awesome-Multimodal-Large-Language-Models
Awesome-LLM-Robotics
Awesome-Multimodal-Reasoning
LLM-Reasoning-Papers
LLMSurvey
Awesome_Multimodel_LLM

Awesome surveys

A Survey of Large Language Models
A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Towards Reasoning in Large Language Models: A Survey
A Survey on Large Language Model based Autonomous Agents
The Rise and Potential of Large Language Model Based Agents: A Survey
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Language-conditioned Learning for Robotic Manipulation: A Survey
Foundation Models in Robotics: Applications, Challenges, and the Future
The Development of LLMs for Embodied Navigation
LLM Powered Autonomous Agents
Awesome-Embodied-Agent-with-LLMs

LLM

| Subcategory | Models | Linked Title | Publication Date |
| --- | --- | --- | --- |
| Open sourced LLM | BERT | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 11 Oct 2018 |
| | T5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 23 Oct 2019 |
| | LLaMA | LLaMA: Open and Efficient Foundation Language Models | 27 Feb 2023 |
| | OpenFlamingo | OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models | 2 Aug 2023 |
| | InstructBLIP | InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | 11 May 2023 |
| | ChatBridge | ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst | 25 May 2023 |
| Closed sourced LLM | GPT3 | Language Models are Few-Shot Learners | 28 May 2020 |
| | GPT4 | GPT-4 Technical Report | 15 Mar 2023 |
| Instruction Tuning | InstructGPT | Training language models to follow instructions with human feedback | 4 Mar 2022 |
| | LLaVA | Visual Instruction Tuning | 17 Apr 2023 |
| | MiniGPT-4 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | 20 Apr 2023 |
| | FLAN | Finetuned Language Models Are Zero-Shot Learners | 3 Sep 2021 |
| | LLaMA-Adapter | LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention | 28 Mar 2023 |
| | Self-Instruct | Self-Instruct: Aligning Language Models with Self-Generated Instructions | 20 Dec 2022 |
| Vision-LLM | LLaVA | Visual Instruction Tuning | 17 Apr 2023 |
| GPT4-V | OpenFlamingo | OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models | 2 Aug 2023 |
| | InternGPT | InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language | 9 May 2023 |
| | PaLM | PaLM: Scaling Language Modeling with Pathways | 5 Apr 2022 |
| Spatial Understanding | GPT-Driver | GPT-Driver: Learning to Drive with GPT | 2 Oct 2023 |
| | Path planners | Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | 5 Oct 2023 |
| Visual Question Answering | CogVLM | CogVLM: Visual Expert for Pretrained Language Models | 6 Nov 2023 |
| | ViperGPT | ViperGPT: Visual Inference via Python Execution for Reasoning | 14 Mar 2023 |
| | VISPROG | Visual Programming: Compositional visual reasoning without training | 18 Nov 2022 |
| | MM-ReAct | MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action | 20 Mar 2023 |
| | Chameleon | Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | 19 Apr 2023 |
| | Caption Anything | Caption Anything: Interactive Image Description with Diverse Multimodal Controls | 4 May 2023 |
| Temporal Logics | NL2TL | NL2TL: Transforming Natural Languages to Temporal Logics using Large Language Models | 12 May 2023 |
| Quantitative Analysis | GPT4Vis | GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? | 27 Nov 2023 |
| | Gemini vs GPT-4V | Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases | 22 Dec 2023 |
| Survey Papers | A Survey of Large Language Models | A Survey of Large Language Models | 31 Mar 2023 |

In-Context Learning

| Subcategory | Models | Linked Title | Publication Date |
| --- | --- | --- | --- |
| Chain of Thought | Chain of Thought | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | 28 Jan 2022 |
| | Tree of Thought | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | 17 May 2023 |
| | Multimodal-CoT | Multimodal Chain-of-Thought Reasoning in Language Models | 2 Feb 2023 |
| | Auto-CoT | Automatic Chain of Thought Prompting in Large Language Models | 7 Oct 2022 |
| | Verify-and-Edit | Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework | 5 May 2023 |
| | Skeleton-of-Thought | Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding | 28 Jul 2023 |
| | Rethinking with Retrieval | Rethinking with Retrieval: Faithful Large Language Model Inference | 31 Dec 2022 |
| Reasoning | Self-Consistency | Self-Consistency Improves Chain of Thought Reasoning in Language Models | 21 Mar 2022 |
| | ReAct | ReAct: Synergizing Reasoning and Acting in Language Models | 20 Mar 2023 |
| | Self-Refine | Self-Refine: Iterative Refinement with Self-Feedback | 30 Mar 2023 |
| | Plan-and-Solve | Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | 6 May 2023 |
| | PAL | PAL: Program-aided Language Models | 18 Nov 2022 |
| | Reasoning via Planning | Reasoning with Language Model is Planning with World Model | 24 May 2023 |
| | Self-Ask | Measuring and Narrowing the Compositionality Gap in Language Models | 7 Oct 2022 |
| | Least-to-Most Prompting | Least-to-Most Prompting Enables Complex Reasoning in Large Language Models | 21 May 2022 |
| | Self-Polish | Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement | 23 May 2023 |
| | COMPLEXITY-CoT | Complexity-Based Prompting for Multi-Step Reasoning | 3 Oct 2022 |
| | Maieutic Prompting | Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations | 24 May 2022 |
| | Algorithm of Thoughts | Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models | 20 Aug 2023 |
| | SuperICL | Small Models are Valuable Plug-ins for Large Language Models | 15 May 2023 |
| | VisualCOMET | VisualCOMET: Reasoning about the Dynamic Context of a Still Image | 22 Apr 2020 |
| Memory | MemoryBank | MemoryBank: Enhancing Large Language Models with Long-Term Memory | 17 May 2023 |
| | ChatEval | ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate | 14 Aug 2023 |
| | Generative Agents | Generative Agents: Interactive Simulacra of Human Behavior | 7 Apr 2023 |
| Planning | SelfCheck | SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | 1 Aug 2023 |
| Automation | APE | Large Language Models Are Human-Level Prompt Engineers | 3 Nov 2022 |
| Self-supervised | Self-supervised ICL | SINC: Self-Supervised In-Context Learning for Vision-Language Tasks | 15 Jul 2023 |
| Benchmark | BIG-Bench | Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models | 9 Jun 2022 |
| | ARB | ARB: Advanced Reasoning Benchmark for Large Language Models | 25 Jul 2023 |
| | PlanBench | PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change | 21 Jun 2022 |
| | Chain-of-Thought Hub | Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance | 26 May 2023 |
| Survey Paper | A Survey of Chain of Thought Reasoning | A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future | 27 Sep 2023 |
| | Reasoning in Large Language Models | Towards Reasoning in Large Language Models: A Survey | 20 Dec 2022 |

LLM for Agent

| Subcategory | Models | Linked Title | Publication Date |
| --- | --- | --- | --- |
| Planning | Voyager | Voyager: An Open-Ended Embodied Agent with Large Language Models | 25 May 2023 |
| | DEPS | Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents | 3 Feb 2023 |
| | JARVIS-1 | JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models | 10 Nov 2023 |
| | LLM+P | LLM+P: Empowering Large Language Models with Optimal Planning Proficiency | 22 Apr 2023 |
| | Autonomous Agents | A Survey on Large Language Model based Autonomous Agents | 22 Aug 2023 |
| | AgentInstruct | Agent Instructs Large Language Models to be General Zero-Shot Reasoners | 5 Oct 2023 |
| Reinforcement Learning | Eureka | Eureka: Human-Level Reward Design via Coding Large Language Models | 19 Oct 2023 |
| | Language to Rewards | Language to Rewards for Robotic Skill Synthesis | 14 Jun 2023 |
| | Language Instructed Reinforcement Learning | Language Instructed Reinforcement Learning for Human-AI Coordination | 13 Apr 2023 |
| | Lafite-RL | Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models | 4 Nov 2023 |
| | ELLM | Guiding Pretraining in Reinforcement Learning with Large Language Models | 13 Feb 2023 |
| | RLAdapter | RLAdapter: Bridging Large Language Models to Reinforcement Learning in Open Worlds | — |
| | AdaRefiner | AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback | 29 Sep 2023 |
| | Reward Design with Language Models | Reward Design with Language Models | 27 Feb 2023 |
| | EAGER | EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL | 20 Jun 2022 |
| | Text2Reward | Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning | 20 Sep 2023 |
| Open-Source Evaluation | AgentSims | AgentSims: An Open-Source Sandbox for Large Language Model Evaluation | 8 Aug 2023 |
| Survey Paper | Large Language Model Based Agents | The Rise and Potential of Large Language Model Based Agents: A Survey | 14 Sep 2023 |

LLM for Robots

| Subcategory | Models | Linked Title | Publication Date |
| --- | --- | --- | --- |
| Multimodal prompts | VIMA | VIMA: General Robot Manipulation with Multimodal Prompts | 6 Oct 2022 |
| | Instruct2Act | Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | 18 May 2023 |
| | MOMA-Force | MOMA-Force: Visual-Force Imitation for Real-World Mobile Manipulation | 7 Aug 2023 |
| Multimodal LLM | PaLM-E | PaLM-E: An Embodied Multimodal Language Model | 6 Mar 2023 |
| | GATO | A Generalist Agent | 12 May 2022 |
| | Flamingo | Flamingo: a Visual Language Model for Few-Shot Learning | 29 Apr 2022 |
| | Physically Grounded Vision-Language Model | Physically Grounded Vision-Language Models for Robotic Manipulation | 5 Sep 2023 |
| | MOO | Open-World Object Manipulation using Pre-trained Vision-Language Models | 2 Mar 2023 |
| Code generation | Code as Policies | Code as Policies: Language Model Programs for Embodied Control | 16 Sep 2022 |
| | ProgPrompt | ProgPrompt: Generating Situated Robot Task Plans using Large Language Models | 22 Sep 2022 |
| | Socratic | Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language | 1 Apr 2022 |
| | SMART-LLM | SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models | 18 Sep 2023 |
| | Statler | Statler: State-Maintaining Language Models for Embodied Reasoning | 30 Jun 2023 |
| Decomposing task | SayCan | Do As I Can, Not As I Say: Grounding Language in Robotic Affordances | 4 Apr 2022 |
| | Language Models as Zero-Shot Planners | Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents | 18 Jan 2022 |
| | SayPlan | SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning | 12 Jul 2023 |
| | DoReMi | DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment | 1 Jul 2023 |
| Low-level output | SayTap | SayTap: Language to Quadrupedal Locomotion | 13 Jun 2023 |
| | Prompt a Robot to Walk | Prompt a Robot to Walk with Large Language Models | 18 Sep 2023 |
| Multimodal Data injection | 3D-LLM | 3D-LLM: Injecting the 3D World into Large Language Models | 24 Jul 2023 |
| | LiDAR-LLM | LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding | 21 Dec 2023 |
| Data generation | GenSim | GenSim: Generating Robotic Simulation Tasks via Large Language Models | 2 Oct 2023 |
| | RoboGen | RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation | 2 Nov 2023 |
| Planning | Embodied Task Planning | Embodied Task Planning with Large Language Models | 4 Jul 2023 |
| Self-improvement | REFLECT | REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction | 27 Jun 2023 |
| | Reflexion | Reflexion: Language Agents with Verbal Reinforcement Learning | 20 Mar 2023 |
| Chain of Thought | EmbodiedGPT | EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought | 24 May 2023 |
| Brain | Robotic Brain | LLM as A Robotic Brain: Unifying Egocentric Memory and Control | 19 Apr 2023 |
| Survey papers | Toward General-Purpose | Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis | 14 Dec 2023 |
| | Language-conditioned | Language-conditioned Learning for Robotic Manipulation: A Survey | 17 Dec 2023 |
| | Foundation Models | Foundation Models in Robotics: Applications, Challenges, and the Future | 13 Dec 2023 |
| | Robot Learning | Robot Learning in the Era of Foundation Models: A Survey | 24 Nov 2023 |
| | The Development of LLMs | The Development of LLMs for Embodied Navigation | 1 Nov 2023 |

Perception

| Subcategory | Models | Linked Title | Publication Date |
| --- | --- | --- | --- |
| Object Detection | OWL-ViT | Simple Open-Vocabulary Object Detection with Vision Transformers | 12 May 2022 |
| | GLIP | Grounded Language-Image Pre-training | 7 Dec 2021 |
| | Grounding DINO | Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | 9 Mar 2023 |
| | PointCLIP | PointCLIP: Point Cloud Understanding by CLIP | 4 Dec 2021 |
| | Segment Anything | Segment Anything | 5 Apr 2023 |
