LLM + Puppeteer wrapper vs makesPDF — render structure, not HTML

Don't ask the model to print HTML. Ask it to render structure.

For: Teams piping LLM output into a headless browser to generate PDFs.

The one-line difference

The LLM-plus-Puppeteer pattern works like this: prompt a model to emit HTML and CSS, hand the string to headless Chromium, and capture the PDF. makesPDF is the agent-native alternative: the model emits markdown or a compact DSL string of a few dozen tokens, and our hosted API renders a deterministic, tagged, archival PDF. No browser in the loop, no CSS roulette, no layout drift between model versions.

Feature matrix

| | LLM + Puppeteer wrapper | makesPDF |
| --- | --- | --- |
| What the LLM emits | A long HTML+CSS string | Markdown or a compact DSL (doc(page(...))) |
| Token cost per render | High (full HTML scaffolding every call) | Low (DSL is ~75–95% fewer tokens than JSON or HTML) |
| Determinism | None: same prompt → different layouts | Byte-stable for a given input |
| Tagged PDF (PDF/UA-1) | No, unless the LLM gets it right by accident | Yes, by construction |
| PDF/A archival | No | Yes (PDF/A-2A) |
| Validation surface | "Looks right in the screenshot" | veraPDF + catalog validator + structure tree |
| Cold start / runtime | Browser launch each render | <100 ms Worker |
| Agent integration | Custom glue per project | Public skill file + MCP server + x402-payable |
| Failure mode | LLM emits broken CSS → silent layout drift | DSL parse error → 400 with the line number |
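
The "byte-stable" row is something a team can check for itself rather than take on trust. A minimal sketch, assuming you have already rendered the same input twice and hold each response body as a Uint8Array (for example via new Uint8Array(await res.arrayBuffer())). sameBytes is a hypothetical helper written for this page, not part of any makesPDF SDK, and the sample bytes are stand-ins for real PDF bodies:

```javascript
// Hypothetical helper: true only when two renders are byte-for-byte identical.
function sameBytes(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Stand-in bytes; in practice these would be two PDF response bodies.
const first = new Uint8Array([0x25, 0x50, 0x44, 0x46]); // "%PDF"
const second = new Uint8Array([0x25, 0x50, 0x44, 0x46]); // identical render
const drifted = new Uint8Array([0x25, 0x50, 0x44, 0x47]); // last byte differs

console.log(sameBytes(first, second)); // true
console.log(sameBytes(first, drifted)); // false
```

Running a check like this in CI against a fixed template and data set would flag any render that stops being reproducible.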

Same input, two outputs

A "Hello, Ada" report generated by an agent.

LLM + Puppeteer — model emits HTML, browser prints:

import puppeteer from "puppeteer";

// `llm` stands in for whatever model client you use.
const html = await llm.generate({
  prompt:
    "Generate a one-page A4 PDF report titled 'Hello, Ada' with the body 'Welcome to the report.' Return only the HTML.",
});
// html ≈ 800–2000 tokens of <!doctype html><html>… on every call

const browser = await puppeteer.launch();
try {
  const page = await browser.newPage();
  await page.setContent(html);
  await page.pdf({ path: "hello.pdf", format: "A4" });
} finally {
  // Close Chromium even if rendering throws, so no browser process leaks.
  await browser.close();
}

makesPDF — model emits a tiny DSL string, we render it:

const dsl = await llm.generate({
  // Model has been given the public skill file at
  // https://makespdf.com/skills/pdf-template-author.md
  prompt:
    "Emit a makesPDF DSL template for a report titled 'Hello, {{name}}' with body 'Welcome to the report.'",
});
// dsl ≈ 30–60 tokens: `const template = doc(page(col(h1("Hello, {{name}}"), p("Welcome..."))));`

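const res = await fetch("https://makespdf.com/api/v1/preview", {
  method: "POST",
  headers: { Authorization: `Bearer ${MAKESPDF_API_KEY}`, "Content-Type": "application/json" },
  body: JSON.stringify({ dsl, data: { name: "Ada" } }),
});
// A malformed DSL string is rejected with a 400, so fail loudly instead of continuing.
if (!res.ok) throw new Error(`makesPDF render failed: HTTP ${res.status}`);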

The DSL form gives the model a typed surface with a small, well-documented vocabulary. The HTML form gives it a 1996-era serialization format with thousands of CSS edge cases — and asks the model to get them all right, every call, with no validator in the loop.
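
On the DSL path, a rejected template comes back as an HTTP 400 (the failure-mode row in the matrix), which makes a simple check-and-report step possible. A minimal sketch that relies only on the status code and the raw response text. The error body format is not documented on this page, so describeRenderFailure is a hypothetical helper that passes the text through untouched, and the sample message is invented for illustration:

```javascript
// Hypothetical helper: turn a render response into a short failure message,
// or null when the render succeeded.
function describeRenderFailure(status, bodyText) {
  if (status >= 200 && status < 300) return null; // rendered fine
  const kind = status === 400 ? "DSL rejected" : "Render failed";
  return `${kind} (HTTP ${status}): ${bodyText.trim()}`;
}

console.log(describeRenderFailure(200, "")); // null
// The body below is an invented example, not a real API response.
console.log(describeRenderFailure(400, "parse error on line 3\n"));
// DSL rejected (HTTP 400): parse error on line 3
```

The returned string can be logged as-is or fed back to the model as a repair prompt, which is the "validator in the loop" that the HTML path lacks.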

Why people switch

  • HTML is the wrong serialization for an agent. It's verbose, full of legacy quirks, and gives the model a thousand ways to be subtly wrong (CSS specificity, box-sizing, font fallbacks, page-break rules). The DSL is a few dozen functions with explicit semantics. The model gets it right more often, in fewer tokens.
  • Tokens are cost. A 1500-token HTML scaffold per render is real money at scale. The DSL pays the structural cost once (in the skill file, which is cached) and emits ~30–100 tokens per actual document.
  • Determinism is a feature, not a luxury. When the same prompt produces different layouts on Tuesday and Thursday, you can't ship that to customers. The DSL renders to the same bytes every time.
  • Compliance can't be vibes-tested. PDF/UA-1 is a structure-tree spec. An LLM can emit HTML that looks accessible in a screenshot and still produces a PDF that fails veraPDF. We tag by construction and validate every render with veraPDF, the open-source, industry-supported validator.
  • The skill is portable. Ship https://makespdf.com/skills/pdf-template-author.md to any agent — Claude, ChatGPT, Cursor, an MCP-aware framework — and it knows the DSL. No per-project prompt engineering.
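
The token argument is simple arithmetic. A rough sketch using the figures quoted above (about 1,500 tokens of HTML scaffold versus at most about 100 tokens of DSL per document). The render volume and the per-token price are made-up inputs, and tokenSavings is a hypothetical helper, so substitute your own numbers:

```javascript
// Hypothetical helper: compare per-render output tokens for HTML vs. DSL.
function tokenSavings({ htmlTokens, dslTokens, renders, usdPerMillionTokens }) {
  const savedTokens = (htmlTokens - dslTokens) * renders;
  return {
    savedTokens,
    reductionPct: Math.round((1 - dslTokens / htmlTokens) * 100),
    savedUsd: (savedTokens / 1_000_000) * usdPerMillionTokens,
  };
}

const estimate = tokenSavings({
  htmlTokens: 1500, // HTML scaffold per render, from the text above
  dslTokens: 100, // upper end of the DSL range, from the text above
  renders: 1_000_000, // assumed volume
  usdPerMillionTokens: 10, // assumed output-token price
});

console.log(estimate);
// { savedTokens: 1400000000, reductionPct: 93, savedUsd: 14000 }
```

The estimate ignores the one-time cost of including the skill file in the prompt, which the text above notes is cached.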

When the LLM + Puppeteer pattern is still the right call

  • You genuinely need the full power of a browser in the loop — JS-driven charts, web fonts loaded via CSS, an existing HTML page you're snapshotting.
  • The PDF is a one-off, not a product surface. A throwaway script doesn't need a hosted API.
  • You're already standardised on Puppeteer for screenshots and don't want a second dependency.

For agent-driven document workloads — invoices, receipts, reports, decks generated on demand — the structured-DSL path is faster, cheaper, more compliant, and far easier to debug.

Migration

If you have an existing LLM-plus-Puppeteer pipeline, the migration is mostly a prompt swap: feed your agent the skill file, change the system prompt from "emit HTML" to "emit a makesPDF DSL template," and replace the puppeteer.launch() block with a fetch to /api/v1/preview. The agent guide at /docs/ai-setup walks through the wiring.
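
The fetch half of that swap can be isolated in one small function. A sketch that reuses the endpoint, headers, and { dsl, data } body from the example earlier on this page. buildPreviewRequest is a hypothetical name, the DSL string is an illustrative one modeled on that example, and what to do with the response depends on the API reference:

```javascript
const PREVIEW_URL = "https://makespdf.com/api/v1/preview";

// Hypothetical helper: build the arguments for fetch() without sending anything.
function buildPreviewRequest(dsl, data, apiKey) {
  return {
    url: PREVIEW_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ dsl, data }),
    },
  };
}

const { url, init } = buildPreviewRequest(
  'const template = doc(page(col(h1("Hello, {{name}}"))));', // illustrative DSL
  { name: "Ada" },
  "test-key", // placeholder, not a real key
);

// In the pipeline this replaces the puppeteer.launch() block:
//   const res = await fetch(url, init);
console.log(url, init.method);
```

Keeping request construction separate from the network call makes the migrated path easy to unit-test without hitting the hosted API.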