
Chain of Thought

A prompting technique that instructs a language model to output intermediate reasoning steps before producing a final answer, dramatically improving performance on multi-step problems.

Definition

Chain of Thought (CoT) is a prompting strategy where the model is encouraged — either by example or by explicit instruction ("think step by step") — to produce a visible sequence of reasoning steps before committing to an answer. Instead of jumping directly from input to output, the model externalizes its "scratchpad."
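A minimal sketch of the mechanics: wrap a question with a zero-shot CoT trigger, then pull the final answer out of the model's reasoning trace. The helper names (`build_cot_prompt`, `extract_final_answer`) and the canned reply are illustrative assumptions, not part of any particular framework.

```python
def build_cot_prompt(question: str) -> str:
    """Append the canonical zero-shot CoT trigger to a question."""
    return f"{question}\nLet's think step by step."

def extract_final_answer(reply: str, marker: str = "Answer:") -> str:
    """Return everything after the last answer marker in the reasoning trace."""
    idx = reply.rfind(marker)
    return reply[idx + len(marker):].strip() if idx != -1 else reply.strip()

# Canned reply standing in for a real model response:
reply = (
    "A bat and ball cost $1.10; the bat costs $1.00 more than the ball.\n"
    "If the ball costs x, the bat costs x + 1.00, so 2x + 1.00 = 1.10.\n"
    "Then x = 0.05.\n"
    "Answer: $0.05"
)
print(extract_final_answer(reply))  # $0.05
```

In practice the answer marker is enforced via the prompt ("End with 'Answer:' followed by the result") so the trace can be kept for inspection while only the final line is consumed downstream.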

Why It Works

Large language models tend to produce better answers when they reason aloud. The intermediate tokens serve as working memory: each step conditions the next, reducing the chance of compounding errors that occur when the model tries to solve a complex problem in a single forward pass.

Research (Wei et al., 2022) showed that CoT delivers the largest gains on tasks requiring arithmetic, symbolic reasoning, and multi-step logic — exactly the kinds of tasks where standard prompting fails most badly.

Variants

  • Zero-shot CoT — append "Let's think step by step." to the prompt; no examples required.
  • Few-shot CoT — provide 3–8 worked examples in the prompt, each showing the full reasoning trace before the answer.
  • Self-consistency — sample multiple CoT traces, then majority-vote among the final answers to reduce variance.
  • Tree of Thoughts (ToT) — extend CoT to explore multiple reasoning branches as a search tree, backtracking when a branch is unproductive.
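The self-consistency variant reduces to a majority vote over the final answers of independently sampled traces. A sketch, assuming the traces have already been sampled (at temperature > 0) and their final answers extracted:

```python
from collections import Counter

def self_consistency(answers: list[str]) -> str:
    """Majority-vote over final answers from several sampled CoT traces."""
    return Counter(answers).most_common(1)[0][0]

# Five sampled traces for the same question might end in these answers;
# the reasoning paths differ, but the correct answer tends to recur.
sampled = ["17", "17", "15", "17", "21"]
print(self_consistency(sampled))  # 17
```

Note that the vote is over answers, not traces: two traces with different intermediate steps still count toward the same answer.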

CoT in Agentic Systems

In agent frameworks, CoT is often embedded in the system prompt or surfaced as "thinking" tokens that the orchestrator can inspect. The ReAct pattern augments CoT by interleaving tool calls with reasoning steps, grounding the chain in real-world observations rather than letting it hallucinate facts.
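The Thought/Action/Observation rhythm of ReAct can be sketched as a loop. Here the "model" is a scripted list of turns and the only tool is a toy calculator; both are stand-ins for a real LLM client and tool registry, not an actual agent framework's API.

```python
def calculator(expression: str) -> str:
    # Toy tool: evaluate simple arithmetic with builtins disabled.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

# Scripted "model" turns standing in for real LLM completions:
# each turn is a reasoning step plus an optional (tool, argument) action.
scripted_turns = [
    ("Thought: I need to compute 12 * 7 before answering.", ("calculator", "12 * 7")),
    ("Thought: 84 plus 16 gives the total.", ("calculator", "84 + 16")),
    ("Thought: I have the result.", None),
]

trace = []
for thought, action in scripted_turns:
    trace.append(thought)
    if action is None:
        trace.append("Final Answer: 100")
        break
    tool, arg = action
    trace.append(f"Action: {tool}({arg!r})")
    # The tool's output is fed back in as the next observation,
    # grounding the chain in computed facts rather than guesses.
    trace.append(f"Observation: {TOOLS[tool](arg)}")

print("\n".join(trace))
```

The key difference from plain CoT is that each Observation line comes from outside the model, so later reasoning steps condition on verified results instead of the model's own (possibly hallucinated) arithmetic.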

Limitations

  • CoT increases output token count, raising latency and cost.
  • The reasoning trace may look plausible yet be unfaithful: the stated steps need not reflect how the model actually arrived at its answer, and a coherent-looking chain can still end in a wrong one.
  • Very small models do not benefit from CoT and can be made worse by it.