/methodology

How we audit.
Public criteria.

Six pillars. Concrete criteria. A scoring rubric that produces a number you can argue with. Every starter in the catalog is held against this — and every audit is published on the per-starter page.

Public · permanent · free to apply to your own projects
/why-these-six

Six because fewer would lie.

Production-grade isn't a vibe. It's the set of disciplines that, when each is wired in by default, makes a codebase shippable on day one rather than after a six-week audit. Each pillar resolves a real failure mode whose cost is otherwise paid by every team, in every project, that doesn't pre-pay it.

Six is the floor for honest coverage. Drop any one and the others can't compensate. Add a seventh and the criteria stop being load-bearing — they become decoration. The catalog earns its name only when every pillar is documented, audited, and visible to the buyer before they pay.

/pillars

The six pillars, in full.

Each pillar has its principle, its concrete criteria, and its scoring rubric. Must criteria block shipping when they fail; should criteria are strongly recommended but allow soft-fails.

i. security

Locked by default.

CSP · CSRF · CVE < 24h

A breach in a starter is a breach in every product that ships on top of it. Every defensive default is paid once in the catalog and recovered every time a subscriber ships.

Criteria
  • Content-Security-Policy must

    CSP header set by the app's Worker entry; `default-src 'self'`; no `unsafe-inline` scripts in shipping config (sketched after this list).

  • HSTS must

    Strict-Transport-Security set with `max-age=31536000; includeSubDomains` on every response.

  • No client-side secrets must

    Build output scanned for secret-shaped tokens (`*_SECRET`, `*_KEY`, provider keys). Server-only env vars never reach the bundle.

  • Validated boundaries must

    Every API request and response is Zod-validated; no `JSON.parse` of remote data without a schema.

  • Safe rich-text rendering must

    No `dangerouslySetInnerHTML` without a sanitizer (DOMPurify or equivalent); user-supplied markdown rendered through a vetted pipeline.

  • Hardened auth cookies must

    Auth cookies set `httpOnly`, `Secure`, `SameSite=Lax` (or stricter); never readable from JS.

  • Deferred Stripe.js should

    Stripe.js (or equivalent payments SDK) is loaded only when checkout initiates, not on every page.

  • No high/critical CVEs at ship should

    Dependabot or `npm audit` reports zero high or critical advisories at release.

  • CSP violation reporting should

    CSP `report-to` directive points at a collector so violations in the wild are visible, not silent.
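
A minimal sketch of the CSP and HSTS musts as a Workers-style fetch handler. `renderApp` and the exact directive list are illustrative stand-ins, not the catalog's shipping config:

```ts
// Illustrative header set; the real policy is tuned per starter.
const SECURITY_HEADERS: Record<string, string> = {
  "Content-Security-Policy":
    "default-src 'self'; object-src 'none'; base-uri 'self'",
  "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
};

// Hypothetical app entry; stands in for the starter's real renderer.
declare function renderApp(request: Request): Promise<Response>;

export default {
  async fetch(request: Request): Promise<Response> {
    const response = await renderApp(request);
    // Clone so headers are mutable, then harden every response on the way out.
    const hardened = new Response(response.body, response);
    for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
      hardened.headers.set(name, value);
    }
    return hardened;
  },
};
```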

ii. reliability

Errors are events.

unhandled < 0.1%

Failures happen. The question is whether your users feel them and whether you can see them after the fact. Reliability is staying coherent under partial failure.

Criteria
  • Error boundaries at the route level must

    Every route is wrapped in a React error boundary; the fallback gives the user something to do.

  • Typed API errors must

    `ApiError` is a discriminated union with a known set of codes; no `throw new Error('boom')` in shipping code (sketched after this list).

  • Retry-on-idempotent only must

    Automatic network retries are limited to idempotent verbs (`GET`, `PUT`, `DELETE`); `POST` retries require an explicit idempotency key.

  • Idempotent billing mutations must

    Every billing-side mutation carries a client-generated idempotency key; the same key returns the same result.

  • Empty / loading / error states must

    Every async path has explicit empty, loading, and error UI. No state implied by absence.

  • Timed-out external calls should

    Every fetch to an external service has a timeout and a fail-closed default; no unbounded waits.

  • DLQ-paired queues (workers) should

    If the starter ships background workers, every queue is paired with a dead-letter queue and a documented poison-message policy.
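
A minimal sketch of the typed-error criterion; the code names are illustrative. The point is the closed union: an error code the UI doesn't handle fails to compile.

```ts
// Illustrative codes; each starter declares its own closed set.
type ApiError =
  | { code: "unauthorized"; requestId: string }
  | { code: "rate_limited"; requestId: string; retryAfterMs: number }
  | { code: "validation_failed"; requestId: string; issues: string[] }
  | { code: "upstream_timeout"; requestId: string };

function messageFor(error: ApiError): string {
  // Exhaustive switch: adding a code without handling it is a compile error.
  switch (error.code) {
    case "unauthorized":
      return "Please sign in again.";
    case "rate_limited":
      return `Too many requests; retry in ${error.retryAfterMs}ms.`;
    case "validation_failed":
      return `Check your input: ${error.issues.join(", ")}`;
    case "upstream_timeout":
      return "A dependency timed out; nothing was charged.";
  }
}
```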

iii. observability

Things you can see.

trace prop. 100%

A bug you can't observe is a bug you can't fix. Observability is the work that turns 'something is wrong' into 'this is what's wrong and here is when it started.'

Criteria
  • Propagated request ID must

    `X-Request-Id` issued at the edge, threaded through handlers and workers, and surfaced in every log line and error.

  • Structured logs must

    Logs are JSON with a stable schema; not free-form `console.log` strings in shipping code (sketched after this list).

  • Predeclared event taxonomy must

    PostHog (or equivalent) event names are declared in a `taxonomy.ts`; new events require a taxonomy update reviewed in the PR.

  • Errors captured with context must

    Server and client errors captured with trace ID, breadcrumbs, and user context. Sourcemaps uploaded on deploy.

  • Sampled session replay should

    Replay sampled on anonymous and first-session traffic for funnel debugging; full-capture only during active investigation windows.

  • Named alert policy should

    Documented list of conditions that page a human (e.g. queue lag, webhook signing failures, audit batch failures). Not 'we'll add alerts later.'
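
A minimal sketch of the log shape these criteria imply; field and event names are illustrative:

```ts
interface LogLine {
  ts: string;
  level: "info" | "warn" | "error";
  event: string;           // drawn from the predeclared taxonomy
  requestId: string;       // the X-Request-Id issued at the edge
  data?: Record<string, unknown>;
}

function log(line: Omit<LogLine, "ts">): void {
  const entry: LogLine = { ts: new Date().toISOString(), ...line };
  console.log(JSON.stringify(entry)); // one line, machine-parseable
}

// Every call carries the request ID threaded from the edge.
log({
  level: "info",
  event: "checkout.started",
  requestId: "req_abc123",
  data: { plan: "pro" },
});
```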

iv. performance

Time to interactive.

LCP < 2.0s p95

Performance is a feature your users feel before they read your copy. The catalog ships under measured budgets, not aspirational ones.

Criteria
  • LCP budget must

    Largest Contentful Paint ≤ 2.5s on simulated 4G across hot routes; ≤ 2.0s on landing surfaces.

  • TBT budget must

    Total Blocking Time < 200ms on simulated 4G.

  • Per-route bundle budget must

    Per-route JavaScript ≤ 200 kB gzipped, enforced in CI by a build-time check that fails the build on regression.

  • Lighthouse Performance ≥ 90 must

    Lighthouse Performance score ≥ 90 on every hot route. Audited automatically in CI on PRs that touch app code.

  • React Compiler enabled should

    React Compiler is on; the codebase relies on it for memoization rather than pre-optimizing with manual `useMemo` / `useCallback`.

  • Route-level code splitting should

    Routes are lazy-loaded with skeleton fallbacks (sketched after this list). No route hauls in another route's bundle.

  • Optimized raster images should

    Raster images served via a responsive image pipeline (Cloudflare Images, Imgix, or equivalent).
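
A minimal sketch of route-level code splitting with a skeleton fallback; the route path and component names are hypothetical:

```tsx
import { lazy, Suspense } from "react";

// The pricing route's bundle is fetched only when this route renders.
// Assumes ./routes/pricing has a default-exported component.
const PricingRoute = lazy(() => import("./routes/pricing"));

export function PricingPage() {
  return (
    <Suspense fallback={<div aria-busy="true">Loading…</div>}>
      <PricingRoute />
    </Suspense>
  );
}
```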

v. accessibility

For everyone.

axe issues = 0

Inaccessible software excludes users by accident. WCAG 2.2 AA is the floor, not optional; everything beyond it is taste.

Criteria
  • axe-clean in CI must

    axe-core reports zero issues on every shipped route; CI fails on regression (sketched after this list).

  • WCAG 2.2 AA contrast must

    Body text ≥ 4.5:1; large text ≥ 3:1. Verified on every surface before ship, including focus and hover states.

  • Keyboard reachability must

    Every interactive element reachable via Tab in DOM order; no `tabindex` hacks.

  • Visible focus indicator must

    1px page-color gap + 3px accent ring on every focused control. Never `outline: none` without a replacement.

  • Modal focus trap must

    Modals trap focus on entry; Esc closes; focus returns to the opener on close.

  • Status with text labels must

    Status badges, errors, and signals always pair with text or an aria-label. Color alone is never a signal.

  • prefers-reduced-motion respected must

    Pulse, marquee, fade-up, and hover transforms disabled under `prefers-reduced-motion: reduce`. Static color and state changes still work.

  • 44 × 44 touch targets must

    Interactive elements ≥ 44 × 44 CSS pixels on touch viewports.
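
A minimal sketch of the axe-clean gate as a Playwright suite built on `@axe-core/playwright`; the route list is illustrative:

```ts
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

// Illustrative routes; the real audit covers every shipped route.
for (const route of ["/", "/pricing", "/methodology"]) {
  test(`axe reports zero issues on ${route}`, async ({ page }) => {
    await page.goto(route);
    const results = await new AxeBuilder({ page }).analyze();
    expect(results.violations).toEqual([]); // any violation fails CI
  });
}
```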

vi. testability

Tests that mean it.

boundary cov > 90%

Tests aren't a coverage number — they're the artifact that lets a subscriber change the code with confidence. Boundary tests beat exhaustive line coverage.

Criteria
  • Vitest at every package must

    Every package, app, and worker ships unit tests adjacent to the code they cover.

  • Playwright on critical paths must

    Playwright covers the happy path and the conversion-critical error paths. Auto-starts dev servers with MSW on.

  • MSW shared with dev must

    The same MSW handlers run in Vitest (in-memory db), Playwright (auto-started servers), and `pnpm dev` (JSON-persistent db); sketched after this list.

  • Strict TypeScript must

    TS strict mode on; no `any` without an inline justification; no `@ts-ignore` without a linked issue.

  • Boundary coverage threshold must

    Coverage thresholds set on package exports (boundaries), not per-line. CI fails on regression, not on aspirational targets.

  • Producer ↔ consumer integration tests should

    For starters that ship background workers, the queue producer/consumer contract is tested against synthesized batches.
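
A minimal sketch of one handler module shared across all three consumers, assuming the MSW v2 API; the endpoint and payload are illustrative:

```ts
import { http, HttpResponse } from "msw";

// One source of truth for the mocked API surface.
export const handlers = [
  http.get("/api/plans", () =>
    HttpResponse.json([{ id: "pro", priceUsd: 29 }]),
  ),
];

// In Vitest (node):
//   import { setupServer } from "msw/node";
//   const server = setupServer(...handlers);
// In the browser during `pnpm dev`:
//   import { setupWorker } from "msw/browser";
//   const worker = setupWorker(...handlers);
```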

/scoring

How a number gets attached.

Audits produce a per-pillar score from 0–100 and a verdict per criterion. The numbers are arguable, public, and dated.

Per-criterion verdict
  • Pass
    The criterion is met. Evidence linkable from the audit.
  • Soft-fail
    Partial coverage; known gap with a documented plan. Counts as half-credit toward the pillar score.
  • Fail
    The criterion is not met. Counts as zero. A Fail on a must criterion blocks shipping.
Pillar score formula
Pillar score = 100 × (count(Pass) + 0.5 × count(Soft-fail)) / count(criteria)
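
The same formula as a small function, with a worked example against the 90 bar:

```ts
type Verdict = "pass" | "soft-fail" | "fail";

function pillarScore(verdicts: Verdict[]): number {
  const pass = verdicts.filter((v) => v === "pass").length;
  const soft = verdicts.filter((v) => v === "soft-fail").length;
  return ((pass + 0.5 * soft) / verdicts.length) * 100;
}

// 7 passes + 2 soft-fails over 9 criteria → (7 + 1) / 9 × 100 ≈ 88.9,
// under the 90 ship bar even though nothing hard-failed.
const score = pillarScore([
  "pass", "pass", "pass", "pass", "pass", "pass", "pass",
  "soft-fail", "soft-fail",
]);
```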

A starter earns the shipping status badge only when every pillar's score clears 90 and no must criterion has failed.

Ship bar
  • Every pillar scores ≥ 90.
  • No `must` criterion is Fail.
  • Soft-fail on a `must` criterion is tolerable; Fail blocks shipping regardless of overall score.
  • Audits are timestamped and re-run on every shipping release.
When an audit re-runs
  • On every shipping release of the starter.
  • On a security advisory affecting any dependency.
  • On a major framework upgrade.
  • Out-of-band when a subscriber reports a regression.
/spec-template

The spec is the template.

Every starter ships with the same plans/<slug>.md contract used internally to build it — public before download, identical in shape across the catalog. The same template is downloadable here, free to use on any project.

What's in it
  • One-liner
  • The problem
  • The product (in box / not in box)
  • Who it's for
  • Stack (base + per-starter)
  • Architecture decisions
  • Data model
  • External dependencies
  • Six-pillar audit
  • What you'll still need to build
  • Open questions
  • Roadmap
Download

A plain markdown file. Adapt it for your project, your team, your domain.

spec-template.md
Citation convention

The methodology is permanently public. If you adopt the template or the pillars in your own work, the convention is a footer link back to forstarters.io/methodology so other engineers can find the source.

*Adapted from the ForStarters.io specification template — forstarters.io/methodology.*
/live-audits

Live audits, dated and public.

Every shipping starter publishes its audit on the per-starter page. Open one, read the per-pillar scores, click through to the criteria they came from.

Audits go live as starters ship.

Pillars and criteria are public now; per-starter scores publish on each starter's first shipping release. The per-starter pages below currently show "—" for pillars under audit.

/closing

Methodology over individual starters.

The catalog is the canonical implementation of this methodology. The methodology is what makes the catalog the catalog — not the inverse. Subscriptions buy you the implementations; the rubric itself is yours to use.