Session 8.7: Stage 6: Formatting and Export


Session 7 of 10

Formatting Is Automation's Best Use Case

You have an approved piece of content. It passed review. It passed editing. Now it needs to exist in every format your distribution channels require. HTML for the website. PDF for downloads. EPUB for e-readers. WordPress XML for import. Markdown for archives.

This is mechanical work. It requires no creativity, no judgment, no taste. It requires exact, repeatable conversion from one format to another. That makes it the perfect job for automation.

The Single-Source Principle

The cardinal rule of formatting: one source file produces all output formats. You do not maintain separate versions for web, print, and email. You maintain one canonical version (typically Markdown or clean HTML) and convert it automatically.

Figure (flowchart): Approved Content (Markdown) → HTML, PDF, EPUB, WordPress XML, Email HTML, Plain Text

If you edit the PDF version separately from the HTML version, they will diverge. Within three edits, you have two different pieces of content that are supposed to be identical. This is how errors multiply. One source, many outputs. Always.

Pandoc: The Universal Converter

Pandoc is a free, open-source document converter that handles over 40 formats. It converts Markdown to HTML, HTML to PDF (via LaTeX by default), Markdown to EPUB, Markdown to DOCX, and most other pairings of the formats it reads and writes. It runs from the command line, which means it can be scripted and automated.

| Conversion | Command | Notes |
| --- | --- | --- |
| Markdown to HTML | `pandoc input.md -o output.html` | Add `--standalone` for complete HTML with head/body |
| Markdown to PDF | `pandoc input.md -o output.pdf` | Requires LaTeX (install TeX Live or MiKTeX) |
| Markdown to EPUB | `pandoc input.md -o output.epub` | Add metadata with `--metadata title="Title"` |
| Markdown to DOCX | `pandoc input.md -o output.docx` | Use `--reference-doc` for branded templates |
| HTML to Markdown | `pandoc input.html -o output.md` | Useful for importing legacy content into your pipeline |

Your AI coding assistant can write a batch conversion script in minutes. The script reads every file in your "approved" folder, converts each to all required formats, and saves the outputs in format-specific subfolders. Run it once after every editing pass.
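As a rough idea of what such a script might look like, here is a minimal Python sketch. It assumes Pandoc is on your PATH, that approved files are Markdown with a `.md` extension, and that outputs go to one subfolder per format. The folder names, the `FORMATS` mapping, and the function names are illustrative choices, not part of Pandoc.

```python
import subprocess
from pathlib import Path

# Target formats mapped to file extensions. This list is an assumption;
# trim or extend it to match your own distribution channels.
FORMATS = {"html": ".html", "pdf": ".pdf", "epub": ".epub", "docx": ".docx"}


def build_command(source: Path, out_root: Path, fmt: str) -> list[str]:
    """Return the pandoc command that converts one source file to one format."""
    target = out_root / fmt / (source.stem + FORMATS[fmt])
    cmd = ["pandoc", str(source), "-o", str(target)]
    if fmt == "html":
        cmd.append("--standalone")  # complete document with head and body
    return cmd


def convert_all(approved: Path, out_root: Path) -> list[str]:
    """Convert every .md file in `approved`; return a description of each failure."""
    failures = []
    for source in sorted(approved.glob("*.md")):
        for fmt in FORMATS:
            # One subfolder per format, created on demand.
            (out_root / fmt).mkdir(parents=True, exist_ok=True)
            result = subprocess.run(
                build_command(source, out_root, fmt),
                capture_output=True,
                text=True,
            )
            if result.returncode != 0:
                failures.append(f"{source.name} -> {fmt}: {result.stderr.strip()}")
    return failures
```

Calling `convert_all(Path("approved"), Path("output"))` after an editing pass would regenerate every format from the single source and return a list of any conversions Pandoc rejected. An empty list means every command exited cleanly.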

Metadata Injection

Formatting is not just about the content body. Every output format needs metadata: titles, descriptions, author names, publication dates, keywords, and Open Graph tags for social sharing.

Store metadata in a structured file (YAML front matter in your Markdown source, or a separate JSON file per piece). Your conversion script reads the metadata and injects it into the correct location for each format:

HTML: `<title>`, `<meta>` tags, Open Graph properties
PDF: document properties (title, author, subject)
EPUB: OPF metadata (`dc:title`, `dc:creator`, `dc:description`)
WordPress: post title, excerpt, categories, tags

Manual metadata entry is a common source of errors. Automate it. The metadata exists in one place and propagates to all formats automatically.
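To make the idea concrete for the HTML case, the sketch below reads flat `key: value` front matter and renders the matching head tags. It is deliberately minimal: the parser is a hand-rolled stand-in that handles only simple one-line fields (use a real YAML library for lists or nested values), and the field names `title`, `description`, and `author` are assumptions about your metadata. Pandoc's default templates already pick up fields such as the title and author from front matter, so a helper like this mainly matters for tags you control yourself, such as Open Graph properties.

```python
from html import escape


def read_front_matter(text: str) -> dict[str, str]:
    """Parse flat `key: value` front matter delimited by `---` lines.

    Minimal on purpose: nested values and lists need a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() in ("---", "..."):
            break  # end of the front matter block
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip().strip("\"'")
    return meta


def html_head_tags(meta: dict[str, str]) -> str:
    """Render title, description, author, and Open Graph tags for an HTML head."""
    title = escape(meta.get("title", ""))
    description = escape(meta.get("description", ""))
    tags = [
        f"<title>{title}</title>",
        f'<meta name="description" content="{description}">',
        f'<meta property="og:title" content="{title}">',
        f'<meta property="og:description" content="{description}">',
    ]
    if "author" in meta:
        tags.insert(2, f'<meta name="author" content="{escape(meta["author"])}">')
    return "\n".join(tags)
```

Because every value passes through `escape`, a title containing `&` or quotes cannot break the generated markup. The same `meta` dictionary could feed the PDF, EPUB, and WordPress steps, which keeps the metadata in one place.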

Visual Consistency Across Formats

Each output format has its own rendering path. HTML renders in browsers. PDF is typeset by LaTeX or another PDF engine and then displayed in a PDF reader. EPUB renders in e-reader software. The same content can look different in each format, and "different" sometimes means "broken."

Build a format test checklist:

| Check | HTML | PDF | EPUB |
| --- | --- | --- | --- |
| Headings render correctly | Verify in browser | Verify in PDF reader | Verify in Calibre or e-reader |
| Tables are legible | Check responsive behavior | Check column widths | Tables may not render; use alternatives |
| Images display | Check paths | Check embedding | Check file inclusion |
| Links work | Click each link | Verify clickable | Verify clickable |
| Code blocks formatted | Check syntax highlighting | Check monospace font | Check line wrapping |

Run this checklist on your first batch. Once your conversion pipeline is stable, spot-check rather than full-check. But the first time, verify everything.

The quality gate for Stage 6: all target formats generated without errors, metadata correct in every format, and visual spot-checks pass. This is the last automated stage before publishing.

Assignment

Take your finished piece from Session 8.6 and convert it to at least 3 different formats:

Install Pandoc if you have not already (pandoc.org/installing.html).
Save your approved content as a Markdown file with YAML front matter for metadata.
Convert to HTML, PDF, and one additional format of your choice.
Run the format test checklist on each output.

If you are comfortable scripting: ask your AI coding assistant to create a batch conversion script that takes a Markdown file as input and produces all three formats. One command, three outputs.
