Papers

100 research papers on prompt engineering techniques.

Year:

Showing 100 of 100 papers

The Prompt Report: A Systematic Survey of Prompting Techniques

Most comprehensive survey: taxonomy of 58 text and 40 multimodal prompting techniques from 1,500+ papers. Co-authored with OpenAI, Microsoft, Google, Stanford.

2024Major Surveys

A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications

44 techniques across application areas with per-task performance summaries.

2024Major Surveys

A Survey of Prompt Engineering Methods in LLMs for Different NLP Tasks

39 prompting methods across 29 NLP tasks.

2024Major Surveys

A Survey of Automatic Prompt Engineering: An Optimization Perspective

Formalizes auto-PE methods as discrete/continuous/hybrid optimization problems.

2025Major Surveys

Efficient Prompting Methods for Large Language Models: A Survey

Survey of efficiency-oriented prompting (compression, optimization, APE) for reducing compute and latency.

2024Major Surveys

Navigate through Enigmatic Labyrinth: A Survey of Chain of Thought Reasoning

Systematic CoT survey.

ACL 2024Major Surveys

Demystifying Chains, Trees, and Graphs of Thoughts

Unified framework for multi-prompt reasoning topologies.

2024Major Surveys

Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey

oriented Prompt Engineering for Large Language Models: A Survey](https://arxiv.org/abs/2401.14043) [2024] — Focuses on prompts designed around explicit task goals.

2024Major Surveys

Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning LLMs

of-Thought for Reasoning LLMs](https://arxiv.org/abs/2503.09567) [2025] — Distinguishes Long CoT from Short CoT in o1/R1-era models.

2025Major Surveys

OPRO: Large Language Models as Optimizers

Uses LLMs as optimizers via meta-prompts; optimized prompts outperform human-designed ones by up to 50% on BBH.

NeurIPS 2024Prompt Optimization and Automatic Prompting

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Improving Pipelines](https://arxiv.org/abs/2310.03714) [2023, ICLR 2024] — Framework for programming (not prompting) LLMs with automatic prompt optimization.

ICLR 2024Prompt Optimization and Automatic Prompting

MIPRO: Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs

Stage Language Model Programs](https://arxiv.org/abs/2406.11695) [2024, EMNLP 2024] — Bayesian optimization for multi-stage LM programs; up to 13% accuracy gains.

EMNLP 2024Prompt Optimization and Automatic Prompting

TextGrad: Automatic "Differentiation" via Text

Treats compound AI systems as computation graphs with textual feedback as gradients. Published in Nature.

2024Prompt Optimization and Automatic Prompting

EvoPrompt

Evolutionary algorithm approach for automatically optimizing discrete prompts.

ACL 2024Prompt Optimization and Automatic Prompting

Meta Prompting for AI Systems

Example-agnostic structural templates formalized using category theory.

Prompt Optimization and Automatic Prompting

Prompt Engineering a Prompt Engineer (PE²)

Uses LLMs to meta-prompt themselves, refining prompts with step-by-step templates to significantly improve reasoning.

ACL FindingsPrompt Optimization and Automatic Prompting

Large Language Models Are Human-Level Prompt Engineers

Level Prompt Engineers](https://arxiv.org/abs/2211.01910) [2022] — Automatic prompt generation via APE.

2022Prompt Optimization and Automatic Prompting

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning

Based Discrete Optimization for Prompt Tuning](https://arxiv.org/abs/2302.03668) [2023]

2023Prompt Optimization and Automatic Prompting

SPO: Self-Supervised Prompt Optimization

Supervised Prompt Optimization](https://arxiv.org/abs/2502.06855) [2025] — Competitive performance at 1–6% of the cost of prior methods.

2025Prompt Optimization and Automatic Prompting

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression](https://arxiv.org/abs/2403.12968) [2024, ACL 2024] — 3x–6x faster than LLMLingua with GPT-4 data distillation.

ACL 2024Prompt Compression

LongLLMLingua

Question-aware compression for long contexts; 21.4% performance boost with 4x fewer tokens.

ACL 2024Prompt Compression

Prompt Compression for Large Language Models: A Survey

Comprehensive survey of hard and soft prompt compression methods.

2024Prompt Compression

Scaling LLM Test-Time Compute Optimally

Time Compute Optimally](https://arxiv.org/abs/2408.03314) [2024] — Shows optimal test-time compute allocation can outperform 14x larger models.

2024Reasoning Advances

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https://arxiv.org/abs/2501.12948) [2025] — Pure RL-trained reasoning model matching o1; open-source with distilled variants.

2025Reasoning Advances

s1: Simple Test-Time Scaling

Time Scaling](https://arxiv.org/abs/2501.19393) [2025] — SFT on just 1,000 examples creates competitive reasoning model via "budget forcing."

2025Reasoning Advances

Reasoning Language Models: A Blueprint

Systematic framework organizing reasoning LM approaches.

2025Reasoning Advances

Demystifying Long Chain-of-Thought Reasoning in LLMs

of-Thought Reasoning in LLMs](https://arxiv.org/abs/2502.03373) [2025] — Analyzes long CoT behavior in modern reasoning models.

2025Reasoning Advances

Graph of Thoughts: Solving Elaborate Problems with LLMs

Models thoughts as arbitrary graphs; 62% quality improvement over ToT on sorting.

AAAI 2024Reasoning Advances

Tree of Thoughts: Deliberate Problem Solving with LLMs

Tree search over reasoning paths.

NeurIPS 2023Reasoning Advances

Everything of Thoughts

Integrates CoT, ToT, and external solvers via MCTS.

2023Reasoning Advances

Skeleton-of-Thought

of-Thought](https://arxiv.org/abs/2307.15337) [2023] — Parallel decoding via answer skeleton generation for up to 2.69x speedup.

2023Reasoning Advances

Chain of Thought Prompting Elicits Reasoning in Large Language Models

The foundational CoT paper.

2022Reasoning Advances

Self-Consistency Improves Chain of Thought Reasoning

Consistency Improves Chain of Thought Reasoning](https://arxiv.org/abs/2203.11171) [2022] — Aggregating multiple CoT outputs for reliability.

2022Reasoning Advances

Large Language Models are Zero-Shot Reasoners

Shot Reasoners](https://arxiv.org/abs/2205.11916) [2022] — "Let's think step by step" as a zero-shot reasoning trigger.

2022Reasoning Advances

ReAct: Synergizing Reasoning and Acting in Language Models

Interleaving reasoning and tool use.

2022Reasoning Advances

Shot In-Context Learning](https://arxiv.org/abs/2404.11018) [2024, NeurIPS 2024 Spotlight] — Significant gains scaling ICL to hundreds/thousands of examples; introduces Reinforced and Unsupervised ICL.

In-Context Learning

Many-Shot In-Context Learning in Multimodal Foundation Models

Shot In-Context Learning in Multimodal Foundation Models](https://arxiv.org/abs/2405.09798) [2024] — Scales multimodal ICL to ~2,000 examples across 14 datasets.

2024In-Context Learning

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

Context Learning Work?](https://arxiv.org/abs/2202.12837) [2022]

2022In-Context Learning

Fantastically Ordered Prompts and Where to Find Them

Overcoming few-shot prompt order sensitivity.

2021In-Context Learning

Calibrate Before Use: Improving Few-Shot Performance of Language Models

Shot Performance of Language Models](https://arxiv.org/abs/2102.09690) [2021]

2021In-Context Learning

Agentic Large Language Models: A Survey

Comprehensive survey organizing agentic LLMs by reasoning, acting, and interacting capabilities.

2025Agentic Prompting and Multi-Agent Systems

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

Agents: A Survey of Progress and Challenges](https://arxiv.org/abs/2402.01680) [2024] — Covers profiling, communication, and growth mechanisms.

2024Agentic Prompting and Multi-Agent Systems

Multi-Agent Collaboration Mechanisms: A Survey of LLMs

Agent Collaboration Mechanisms: A Survey of LLMs](https://arxiv.org/abs/2501.06322) [2025] — Reviews debate and cooperation strategies in LLM-based multi-agent systems.

2025Agentic Prompting and Multi-Agent Systems

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Gen LLM Applications via Multi-Agent Conversation](https://arxiv.org/abs/2308.08155) [2023] — Microsoft's foundational multi-agent framework paper.

2023Agentic Prompting and Multi-Agent Systems

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-World APIs

World APIs](https://arxiv.org/abs/2307.16789) [2023, ICLR 2024] — Trains LLMs to use massive real-world API collections.

ICLR 2024Agentic Prompting and Multi-Agent Systems

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

bench: Can Language Models Resolve Real-World GitHub Issues?](https://arxiv.org/abs/2310.06770) [2023, ICLR 2024] — The benchmark driving agentic coding progress.

ICLR 2024Agentic Prompting and Multi-Agent Systems

AgentBench: Evaluating LLMs as Agents

Benchmark across 8 environments.

ICLR 2024Agentic Prompting and Multi-Agent Systems

PAL: Program-aided Language Models

aided Language Models](https://arxiv.org/abs/2211.10435) [2023] — Offloading computation to code interpreters.

2023Agentic Prompting and Multi-Agent Systems

Visual Prompting in Multimodal Large Language Models: A Survey

First comprehensive survey on visual prompting methods in MLLMs.

2024Multimodal Prompting

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V

of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V](https://arxiv.org/abs/2310.11441) [2023] — Visual markers dramatically improve visual grounding.

2023Multimodal Prompting

A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks

Language Tasks](https://arxiv.org/abs/2411.06284) [2024] — Covers text, image, video, audio MLLMs.

2024Multimodal Prompting

Multimodal Chain-of-Thought Reasoning in Language Models

of-Thought Reasoning in Language Models](https://arxiv.org/abs/2302.00923) [2023]

2023Multimodal Prompting

From Prompt Engineering to Prompt Craft

Design-research view of prompt "craft" for diffusion models.

2024Multimodal Prompting

Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of LLMs

Examines how constraining outputs to structured formats impacts reasoning performance.

2024Structured Output and Format Control

Batch Prompting: Efficient Inference with LLM APIs

2023Structured Output and Format Control

Structured Prompting: Scaling In-Context Learning to 1,000 Examples

Context Learning to 1,000 Examples](https://arxiv.org/abs/2212.06713) [2022]

2022Structured Output and Format Control

Formalizing and Benchmarking Prompt Injection Attacks and Defenses

Formal framework with systematic evaluation of 5 attacks and 10 defenses across 10 LLMs.

USENIX Security 2024Prompt Injection and Security

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

OpenAI's priority-level training for injection defense.

2024Prompt Injection and Security

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses

Realistic agent scenario benchmark.

2024Prompt Injection and Security

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents

Integrated LLM Agents](https://arxiv.org/abs/2403.02691) [2024]

2024Prompt Injection and Security

SecAlign: Defending Against Prompt Injection with Preference Optimization

DPO-based defense.

2024Prompt Injection and Security

WASP: Benchmarking Web Agent Security Against Prompt Injection

Security benchmark for web/computer-use agents.

2025Prompt Injection and Security

Many-Shot Jailbreaking

Shot Jailbreaking](https://www.anthropic.com/research/many-shot-jailbreaking) [2024] — Scaling harmful examples in long-context windows enables jailbreaking (Anthropic Technical Report).

2024Prompt Injection and Security

Constitutional AI: Harmlessness from AI Feedback

2022Prompt Injection and Security

Ignore Previous Prompt: Attack Techniques For Language Models

2022Prompt Injection and Security

Artificial Intelligence and Cybersecurity: Documented Risks, Enterprise Guardrails, and Emerging Threats in 2024–2025

2025](https://www.ijfmr.com/research-paper.php?id=62200) [2025] — Survey of real prompt-injection incidents with practical governance prompt patterns.

2025Prompt Injection and Security