Prompt engineering Skill for Codex / Claude Code

Crafts prompts with explicit task framing, role definition, output constraints, citation requirements, and few-shot examples so model responses are consistent, grounded in evidence, and actionable for downstream tasks. Prompt engineering reduces the variability and hallucination risk that come from under-specified prompts.

Category Research

Platform Codex / Claude Code

Published 2026-04-18

prompts, optimization, research

Use cases

Building an API integration that calls an LLM, where the calling code must be able to parse the output reliably
Designing a system prompt for an AI agent where consistent behavior across different conversation states is required
Creating an evaluation harness where you need model responses to be consistent enough to compare across runs
Engineering prompts for a RAG pipeline where the model must cite sources rather than answer from memory
Optimizing prompts for a high-volume use case where token efficiency directly affects cost

Key features

State the task and persona explicitly at the start of the prompt: who the model is acting as, what its expertise level is, and what goal it should pursue
Add specific output format constraints: JSON schema, markdown structure, maximum length, and the fields the output must include
Include few-shot examples for edge cases where the desired output is non-obvious or requires a specific reasoning pattern
Specify citation or grounding requirements: 'Only answer based on the provided context' or 'Cite the source document for each claim'
Iterate on the prompt by running it against a diverse test set and measuring consistency and accuracy before treating it as production-ready
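
The features above can be combined into a single prompt template. The following is a minimal sketch, not a definitive implementation: `PromptSpec`, `FewShotExample`, and `build_prompt` are hypothetical names introduced here for illustration, and the persona, field names, and file name in the usage lines are made-up sample values. It only assembles the prompt text and does not call any model API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FewShotExample:
    """One worked input/output pair shown to the model (hypothetical type)."""
    user_input: str
    expected_output: dict


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt parts named in the feature list."""
    persona: str
    task: str
    required_fields: list[str]
    max_words: int
    grounding_rule: str
    examples: list[FewShotExample] = field(default_factory=list)


def build_prompt(spec: PromptSpec, context: str, question: str) -> str:
    """Assemble prompt text: persona and task first, then format, grounding, examples, context."""
    fields = ", ".join(spec.required_fields)
    lines = [
        f"You are {spec.persona}.",
        f"Task: {spec.task}",
        "",
        "Output format:",
        f"- Respond with a single JSON object containing exactly these fields: {fields}.",
        f"- Keep every string value under {spec.max_words} words.",
        "",
        f"Grounding: {spec.grounding_rule}",
    ]
    # Few-shot examples are serialized with json.dumps so they match the requested format.
    for i, ex in enumerate(spec.examples, start=1):
        lines += [
            "",
            f"Example {i} input: {ex.user_input}",
            f"Example {i} output: {json.dumps(ex.expected_output)}",
        ]
    lines += ["", "Context:", context, "", f"Input: {question}"]
    return "\n".join(lines)


# Usage with made-up sample values (assumptions, not from any real system).
spec = PromptSpec(
    persona="a support analyst with expert knowledge of the refund policy",
    task="Answer the customer's question.",
    required_fields=["answer", "source"],
    max_words=50,
    grounding_rule="Only answer based on the provided context.",
    examples=[
        FewShotExample(
            "What is the refund window?",
            {"answer": "30 days", "source": "policy.md"},
        )
    ],
)
print(build_prompt(spec, "Refunds are accepted within 30 days.", "How long do I have?"))
```

Keeping the parts in a structured object rather than one hand-edited string makes it easier to change a single constraint and rerun the same test set against the new version.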

When to Use This Skill

When building production systems that depend on LLM outputs being consistent and parseable
When prompt outputs feed into downstream automated processes where format errors cause failures
When evaluating different models or configurations and needing a stable, measurable prompt to compare against

Expected Output

A production-ready prompt with explicit task framing, output format specification, few-shot examples, and a measured consistency score on a test set.

Frequently Asked Questions

How do I know if a prompt is good enough for production? Run it against a test set of at least 50 diverse inputs and measure: (1) format compliance rate, (2) factual accuracy against ground truth, (3) consistency when the same input is run multiple times. A prompt is production-ready when all three meet your application's tolerance.
What is the difference between prompt engineering and fine-tuning? Prompt engineering shapes behavior through input text without changing the model's weights. Fine-tuning changes the model's weights through training. Start with prompt engineering, which is faster and cheaper. Move to fine-tuning when you need consistent behavior that prompt engineering cannot achieve.
How do I handle prompts that work for one model but not another? Model-specific prompting is expected: each model has different strengths and failure modes. Maintain separate optimized prompts per model rather than assuming a prompt is portable. What works on GPT-4 may not work on Claude or Llama.
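
The three measurements named in the first answer above can be sketched in a few lines. This is one possible way to compute them under stated assumptions: model outputs are JSON strings, "consistency" is taken to mean the share of repeated runs that exactly match the most common output, and accuracy is exact match on one field. The function names and sample data are hypothetical, and a real harness may need looser matching.

```python
import json
from collections import Counter


def is_compliant(raw: str, required_fields: set[str]) -> bool:
    """True if the output parses as a JSON object containing every required field."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_fields.issubset(parsed)


def format_compliance_rate(outputs: list[str], required_fields: set[str]) -> float:
    """Share of outputs that satisfy the format check."""
    if not outputs:
        return 0.0
    return sum(is_compliant(o, required_fields) for o in outputs) / len(outputs)


def accuracy(outputs: list[str], truths: list[str], answer_field: str) -> float:
    """Share of outputs whose answer field exactly equals the ground truth (assumed metric)."""
    if not truths:
        return 0.0
    correct = 0
    for raw, truth in zip(outputs, truths):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unparseable output counts as incorrect
        if isinstance(parsed, dict) and parsed.get(answer_field) == truth:
            correct += 1
    return correct / len(truths)


def consistency_rate(runs: list[str]) -> float:
    """Share of repeated runs of one input that match the most common output."""
    if not runs:
        return 0.0
    most_common_count = Counter(runs).most_common(1)[0][1]
    return most_common_count / len(runs)


# Made-up sample outputs for illustration only.
outputs = ['{"answer": "4", "source": "doc1"}', "not json", '{"answer": "5"}']
print(format_compliance_rate(outputs, {"answer", "source"}))
print(accuracy(outputs, ["4", "4", "5"], "answer"))
print(consistency_rate(["a", "a", "b", "a"]))
```

Each rate can then be compared against the tolerance the application sets for it.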

3 Indexed items

Brainstorming before build

Research

Explores goals, constraints, risks, and design options before committing to a specific implementation path. This technique is most valuable when facing product or UX decisions where the wrong choice is expensive to reverse—new features with uncertain user value, architectural pivots, or cross-functional dependencies where each team has a different mental model of the problem.

AI memory and HBM supply-chain claims due diligence

Research

Structures verification of public claims about AI-driven memory shortages, high-bandwidth memory (HBM) demand, and trillion-dollar memory-chip valuations into an evidence checklist for finance, procurement, and platform teams. The workflow separates analyst price-target moves, year-to-date equity rallies, and vendor statements about agentic-AI workloads from independently observable supply signals (long-term agreements, stated capacity constraints, peer pricing power). It cites CNBC reporting that Micron crossed a $1 trillion market cap on May 26, 2026 after UBS raised its price target from $535 to $1,625, and that SK Hynix joined the trillion-dollar club on May 27, 2026 with shares up roughly 250% year to date amid AI chip demand lifting South Korea's Kospi—without endorsing any single stock call.

OpenAI documentation lookup

Research

Prioritizes official OpenAI documentation, model cards, and API references when researching integration details, model capabilities, or API behavior changes. This avoids the noise and staleness of third-party blog posts that may summarize older model versions or incomplete information.
