Extended thinking is a built-in model capability where the model reasons internally before responding. It is enabled by default on the most capable and standard models. You do not need to prompt for it — you control it through effort levels.
How effort levels work
Effort levels control how much thinking the model allocates to each response. Lower effort is faster and cheaper. Higher effort provides deeper reasoning at the cost of speed and tokens.
| Level | Behavior |
|---|---|
| low | Minimal thinking. Fast responses for straightforward tasks. |
| medium | Default. Balanced reasoning depth for most coding work. |
| high | Deep reasoning. Best for complex debugging and architecture. |
| max | Deepest reasoning with no token constraint. Most capable model only. |
Medium is the default on both the most capable and standard models. This is the recommended level for most coding tasks — higher levels can cause the model to overthink routine work.
Setting effort
Interactive command:
/effort high
Run /effort low, /effort medium, /effort high, or /effort max to change the level for the current session. Run /effort auto to reset to the model default.
Keyboard shortcut:
Press Option+T (macOS) or Alt+T (other platforms) to toggle extended thinking on or off. This may require terminal configuration to enable Option key shortcuts (iTerm2: Profiles → Keys → set Option key to “Esc+”).
At startup:
agent --effort high
Pass the effort level when launching your AI coding agent. This sets it for that session only.
Environment variable:
export AGENT_EFFORT_LEVEL=high
This takes precedence over all other methods. Valid values: low, medium, high, max, auto.
In the model picker:
Use left/right arrow keys to adjust the effort slider when selecting a model with /model. The current effort level is displayed next to the logo and spinner so you can confirm the active setting.
One-off deep reasoning with “ultrathink”
Include the word “ultrathink” in your prompt to trigger high effort for a single turn without changing your session setting:
ultrathink about the race condition in the connection pool.
What happens when two goroutines call Close() simultaneously?
This is useful when you are generally working at medium effort but hit a problem that needs deeper reasoning.
Viewing thinking output
By default, the model’s internal thinking is not displayed. Press Ctrl+O to toggle verbose mode, which shows detailed tool usage and thinking output. You can also launch with --verbose for the same effect.
When higher effort helps
- Complex architecture decisions: Evaluating tradeoffs across multiple components and their interactions
- Multi-step refactoring: Changes that ripple across many files where ordering and side effects matter
- Tricky debugging: Tracing race conditions, concurrency bugs, or subtle logic errors through multiple layers
- Security review: Reasoning about attack vectors, edge cases, and interactions between trust boundaries
- Algorithm design: Problems where the first intuitive approach is likely wrong or suboptimal
When it is overkill
- Simple boilerplate and CRUD operations
- Renaming variables, fixing typos, adding log lines
- Tasks with one obvious approach
- Quick lookups, explanations, or codebase questions
For these, medium or low effort produces the same result faster and cheaper.
Common misconception: prompt-based thinking
Phrases like “think step by step” or “think carefully” do not allocate additional thinking tokens. They are regular prompt text. To actually increase reasoning depth, use /effort or include “ultrathink” in your prompt.
Programmatic control
For CI/CD pipelines and scripts, two environment variables give direct control:
AGENT_EFFORT_LEVEL— Sets the effort level. Takes precedence over/effortand theeffortLevelsetting.MAX_THINKING_TOKENS— Overrides the thinking token budget directly. On models with adaptive reasoning, this is ignored unless you also setAGENT_DISABLE_ADAPTIVE_THINKING=1to revert to fixed budgets. Set to0to disable thinking entirely.
Practical examples
Default workflow — leave it at medium:
Most of the time, you do not need to change anything. Medium effort handles everyday coding, test writing, and code review well.
Bump effort for a hard problem:
/effort high
The payment webhook handler sometimes processes the same event twice.
Trace through WebhookController, PaymentService, and IdempotencyStore
to find where the deduplication logic fails under concurrent requests.
Drop effort for bulk work:
/effort low
Add JSDoc comments to every exported function in src/utils/.
One-off deep dive without changing session effort:
ultrathink about whether this migration can run without downtime.
Consider what happens to in-flight queries during the ALTER TABLE.
Tips
- Start at medium. Only increase effort when you notice the agent missing subtleties or making shallow analysis.
- Use
/effort lowfor repetitive, mechanical tasks to save tokens and time. - The
maxlevel is reserved for the most capable model and does not persist across sessions unless set viaAGENT_EFFORT_LEVEL. - Effort can be set per-skill or per-subagent using the
effortfield in frontmatter, useful for giving a security review skill higher effort than a formatting skill.