> ## Documentation Index
> Fetch the complete documentation index at: https://help.pantaos.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Prompting Techniques

> The named methods every practitioner should know: zero shot, few shot, chain of thought, role prompting, prefilling, and prompt chaining.

## Overview

The fundamentals tell you what a good prompt contains. The techniques below tell you how to compose those contents for specific kinds of tasks. Each technique has a specific use case, a specific failure mode, and a specific cost in tokens. Knowing when to reach for which is most of the craft.

<CardGroup cols={2}>
  <Card title="Zero Shot Prompting" icon="zap">
    Ask directly without examples. Best for simple, well known tasks.
  </Card>

  <Card title="Few Shot Prompting" icon="copy">
    Include two to five examples in the prompt. Best for formatting and classification.
  </Card>

  <Card title="Chain of Thought" icon="brain">
    Ask the model to reason step by step. Best for math, logic, and multi step decisions.
  </Card>

  <Card title="Role Prompting" icon="user">
    Assign a persona or expertise. Best for tone control and domain specific tasks.
  </Card>

  <Card title="Prefilling" icon="pencil">
    Start the assistant turn for the model. Best for enforcing format and bypassing preamble.
  </Card>

  <Card title="Prompt Chaining" icon="link">
    Split a hard task across multiple prompts. Best for long workflows and complex pipelines.
  </Card>
</CardGroup>

## Zero Shot Prompting

Zero shot is the default mode: you describe the task and the model performs it without examples. Modern instruction tuned models are remarkably strong at zero shot for tasks that are common in their training data: summarization, translation, sentiment classification, basic extraction, basic reasoning.

```text theme={"theme":"github-dark"}
Translate the following English sentence to formal German:

"The committee will reconvene on Tuesday to review the proposal."
```

When zero shot works, use it. It is cheaper, easier to maintain, and scales better than prompts loaded with examples. Reach for few shot only when zero shot is producing inconsistent or wrongly formatted output.

A useful mental check: if you can describe the task in one sentence and the format in one sentence, try zero shot first. Add examples only when you have evidence that they help.

## Few Shot Prompting

Few shot prompting includes two to five examples of input and output in the prompt itself. The model infers the pattern from the examples and applies it to the new input. Few shot is especially powerful for:

* **Classification with custom labels.** When your taxonomy is not standard, examples are faster than describing the labels in prose.
* **Format enforcement.** When the output must follow a precise structure, an example is worth several paragraphs of instructions.
* **Style transfer.** When you want a specific voice, two well chosen examples capture it better than adjectives.

```text theme={"theme":"github-dark"}
Classify the following customer messages as billing, technical, or feedback.

Message: My invoice for March shows the wrong amount.
Category: billing

Message: The export to PDF crashes on documents over 50 pages.
Category: technical

Message: Love the new dashboard, the speed is night and day.
Category: feedback

Message: I cannot log in after the update.
Category:
```

For most tasks, three to five examples is the right number. Two is sometimes enough; more than five rarely helps and starts to bias the model toward surface patterns.

The examples should be diverse, not redundant. Three examples covering the same easy case teach the model nothing about the hard cases. Pick examples that span the boundaries of the task: an obvious case, a tricky edge case, and one that includes the kind of input that has previously caused failures.

A subtle but important point: for reasoning heavy tasks on the latest models, few shot examples can hurt performance by anchoring the model to surface patterns instead of letting it reason fresh. Use few shot for classification and formatting; prefer zero shot with chain of thought for math and logic.

## Chain of Thought

Chain of thought, often abbreviated CoT, is the technique of asking the model to produce its reasoning before its answer. The effect is large, well documented, and persists across model generations. Even the simple instruction "Let's think step by step" produces a measurable improvement on reasoning tasks.

The simplest form is zero shot CoT:

```text theme={"theme":"github-dark"}
A bag contains 12 red marbles and 8 blue marbles. If I draw 3 marbles without
replacement, what is the probability that all three are red? Think step by step,
then state the final answer on a new line beginning with "Answer:".
```

For harder problems, structured CoT works better. Wrap the reasoning in XML tags so the reasoning and the answer can be separated programmatically:

```text theme={"theme":"github-dark"}
Solve the problem below.

<problem>
{problem}
</problem>

First, work through the solution inside <thinking> tags, showing each step.
Then, give the final answer inside <answer> tags.
```

A few rules of thumb that hold across providers:

* **Use CoT when the task has multiple steps.** Math, multi hop reasoning, planning, debugging.
* **Do not use CoT for retrieval or simple classification.** It adds tokens and rarely helps.
* **Modern reasoning models do CoT internally.** The latest reasoning oriented models reason before responding by default. Adding "think step by step" to a reasoning model is redundant and sometimes counterproductive.
* **Separate reasoning from answer.** Always specify how the final answer is delivered, so you can parse it without parsing the reasoning.

## Role Prompting

Setting a role at the start of the prompt, usually in the system message, focuses the model's voice, vocabulary, and depth of knowledge for the entire conversation. Role prompting is one of the highest leverage techniques for steering model behavior, which is why every PANTA OS assistant starts with a clearly defined role in its system prompt.

```text theme={"theme":"github-dark"}
You are a senior contract lawyer specializing in software licensing. Your job is to
flag risks in agreements, not to give legal advice. You write in plain English for
a non lawyer audience and you cite the specific clause number for every risk you raise.
```

A few principles for effective role prompting:

* **Be specific about expertise.** "You are an expert" is generic. "You are a clinical pharmacist with twenty years of experience in oncology" is steerable.
* **Include the audience.** The role should specify who the output is for, not just who is producing it.
* **State the boundary.** "Your job is X, not Y" prevents the most common drift, where the model takes on adjacent responsibilities you did not want.
* **Keep it short.** A two to four sentence role is enough. Longer roles tend to introduce contradictions.

Role prompting and few shot examples compose well: the role sets the voice, the examples set the format. Use both when you need consistent output in a specific persona.

## Prefilling

Prefilling is the technique of starting the assistant's response yourself, so the model continues from where you stopped. It is a powerful but underused method for enforcing format and skipping unwanted preamble.

When the API supports it, send an assistant message with the desired beginning. The model continues from there.

```text theme={"theme":"github-dark"}
User: Extract the company name, total amount, and due date from the invoice below.
Return the result as a JSON object.

Invoice:
{invoice}
```

Then send an assistant message that begins with `{` so the model continues with the JSON object instead of writing a preamble like "Here is the extracted information:".

The most common uses of prefilling:

* **Force JSON output.** Prefill with `{` so the model is committed to a JSON response.
* **Force a specific structure.** Prefill with `<analysis>` so the model continues inside the tag.
* **Force a refusal pattern.** Prefill with "I cannot help with" to bias toward declining a borderline request.
* **Speak in character.** Prefill with the character's name to keep a roleplay assistant in voice.

Prefilling support varies by provider. Some APIs allow it natively through the messages parameter; others require a workaround.

## Prompt Chaining

When a task is too complex for a single prompt, split it into a chain of prompts where the output of one becomes the input of the next. Use chaining whenever a single prompt has more than two distinct stages, since each stage gets the model's full attention rather than competing with the others for it.

A typical chain has three stages:

1. **Extraction.** Pull the relevant information out of the source.
2. **Transformation.** Apply the logic, classification, or analysis to the extracted information.
3. **Composition.** Format the result for the destination.

```text theme={"theme":"github-dark"}
Stage 1
Read the meeting transcript and produce a JSON list of decisions, action items, and
open questions.

Stage 2
For each action item from stage 1, identify the owner, the due date, and any
dependencies on other action items.

Stage 3
Compose a follow up email summarizing the meeting, organized by owner, with all
action items and their due dates.
```

Chaining is the design principle behind PANTA OS Apps: every App breaks a complex output into a sequence of well defined steps, each with its own prompt, often with a Human in the Loop checkpoint between stages. The advantages over a single mega prompt are real. Each stage is easier to debug, easier to evaluate, and easier to swap out. The cost is operational complexity: you now have several prompts to maintain instead of one.

A useful heuristic: if a single prompt is more than 1500 tokens of instructions and the model is producing inconsistent output, try chaining before adding more instructions. Smaller, focused prompts almost always outperform sprawling ones.

## Combining Techniques

The techniques are not mutually exclusive. Most production prompts use three or four of them at once. A canonical structure that combines role prompting, structure, examples, and chain of thought looks like this:

```text theme={"theme":"github-dark"}
You are a senior policy analyst evaluating regulatory filings. [role]

Your task is to identify whether the filing below contains any unsupported claims
about safety. [task]

Examples of supported and unsupported claims: [few shot]
<example>
<claim>The device reduces complications by 30 percent (cite: Smith et al. 2021).</claim>
<verdict>supported</verdict>
</example>
<example>
<claim>The device is the safest on the market.</claim>
<verdict>unsupported</verdict>
</example>

Filing: [input]
<filing>{filing}</filing>

Process: [chain of thought + format]
First, list each safety claim inside <claims> tags. Then, for each claim, evaluate
it inside <evaluation> tags. Finally, return your conclusion inside <verdict> tags
as a single JSON object with the keys "supported" and "unsupported", each an array
of verbatim claims.
```

This is roughly the shape of a strong production prompt. Role at the top, task next, examples in tagged blocks, input clearly delimited, and an explicit reasoning and output structure at the end.

## What Comes Next

The next page, Structured Prompts, goes deeper on the structural side: XML tags, JSON contracts, and the anatomy of a system prompt. The techniques on this page assume you can write a clean structure; the next page shows how.