Stage 3: Draft Generation With Voice Constraints
Session 8.4 · ~5 min read
Where AI Earns Its Keep
You have research. You have an outline. You have a voice fingerprint from Module 6. You have a system prompt. Now you generate the first draft.
This is not "Hey AI, write me an article." This is a constrained generation with four inputs converging on a single output. The AI fills in the prose between your structural decisions. It is a drafter, not an author. The distinction matters because it determines what you evaluate in the output.
The Four Inputs to Draft Generation
Every draft generation call should include exactly four elements. Missing any one of them degrades the output in predictable ways.
| Input | Goes Where | What Happens Without It |
|---|---|---|
| Research brief | Context / user message | AI invents facts from training data. Hallucination rate spikes. |
| Outline | User message (structural instructions) | AI uses its default structure. Your argument gets replaced by its average. |
| Voice fingerprint | System prompt | Output sounds like generic AI. Hedging, filler, and false enthusiasm appear. |
| Content spec | User message (format + constraints) | Wrong word count, wrong format, wrong audience targeting. |
Structuring the API Call
The system prompt contains your voice fingerprint and any persistent instructions (forbidden words, sentence length constraints, tone markers). The user message contains the outline, research brief, and content specification.
A typical content spec includes:
- Target word count (not "about 1000 words" but "900 to 1100 words")
- Target audience (specific: "mid-career marketing managers at B2B SaaS companies," not "professionals")
- Required elements (e.g., "include at least 2 specific data points from the research brief")
- Forbidden elements (e.g., "no bullet lists, no rhetorical questions in headings, no em dashes")
- Format (markdown, HTML, plain text, with or without headings)
The more specific the spec, the less work you do in Stages 4 and 5. Every ambiguity in your input becomes a coin flip in the output.
Section-by-Section vs. Full-Document Generation
You have two approaches to draft generation, and the choice depends on content length.
Full-document generation sends the entire outline and research brief in one API call and asks for the complete draft. This works for content under 2,000 words. The AI maintains context across the whole piece, transitions are natural, and the argument flows.
Section-by-section generation sends one outline section at a time, along with the relevant research and a summary of preceding sections. This works for longer content (3,000+ words, chapters, reports). Each section gets full attention from the context window, but you need to manage transitions manually.
| Approach | Best For | Advantage | Disadvantage |
|---|---|---|---|
| Full-document | Under 2,000 words | Natural transitions, consistent voice | Context dilution with large inputs |
| Section-by-section | Over 3,000 words | Full context attention per section | Transition management, potential voice drift |
Evaluating the Draft Against Inputs
The first draft comes back. Before it moves to Review (Stage 4), run a quick self-check against the three primary inputs.
Outline compliance: Does the draft follow your outline? Are sections in the right order? Did the AI skip any sections or add sections you did not request? AI models sometimes "helpfully" add introductions and conclusions you did not ask for, or reorder sections to match their default structure.
Research usage: Does the draft reference the sources and data points from your research brief? Or did it ignore your research and generate from training data instead? Search for specific numbers and facts from your brief in the output. If they are missing, the AI bypassed your research.
Voice consistency: Does the output match the voice fingerprint? Read the first three paragraphs aloud. If you stumble on phrasing that does not sound like the voice you specified, mark it. Voice breaks are the most common failure mode in constrained generation.
The quality gate for Stage 3: the draft follows the outline, uses the research, and approximates the voice fingerprint. It does not need to be perfect. It needs to be a solid foundation for human review and editing.
Temperature Settings for Drafting
For production drafting, temperature between 0.3 and 0.5 is the typical sweet spot. Lower than 0.3 produces rigid, repetitive prose. Higher than 0.7 introduces unpredictability that creates more editing work than it saves in creativity. The optimal temperature varies by content type, and you should have found your preferred setting during Module 5.
One parameter people overlook: max tokens. Set it slightly above your target word count (tokens are roughly 0.75 words in English). This prevents truncation without inviting bloat.
Further Reading
- Prompt Engineering Overview, Anthropic Documentation
- AI Agent Content Writing System: Complete Guide, Sight AI
- Content Creation Workflow, Activepieces
Assignment
Generate a first draft using everything you have built so far:
- Research brief from Session 8.2 as context
- Outline from Session 8.3 as structural instructions
- Voice fingerprint from Module 6 as your system prompt
- A content specification with word count, audience, required elements, and forbidden elements
Evaluate the output against all three inputs. Mark every deviation:
- Sections that do not match the outline
- Claims not supported by the research brief
- Sentences that break the voice fingerprint
Count the deviations. This number is your baseline for drafting quality. You will improve it by refining your inputs.