Playwright (1.x) rules for Cursor. Teaches semantic locators (getByRole/getByLabel over CSS), web-first assertions (toBeVisible over isVisible), the test.extend fixture model, storageState + setup-project auth, page.route mocking with proper continue/fallback, the Playwright POM-via-fixture pattern, ARIA snapshot a11y, sharded CI with merge-reports, and the macOS-vs-Linux baseline trap for visual regression. Catches 34 LLM regressions: page.click('text='), page.$ / page.$$ ElementHandles, waitForTimeout, isVisible booleanised, missing await on assertions, page.route after goto, hardcoded credentials, deprecated playwright-github-action, and more.
Scaffold a new Playwright test the modern way: semantic locators (getByRole over CSS), web-first assertions (await expect(loc).toBeVisible()), test.extend fixtures (not beforeEach), POM-via-fixture for repeated screens, async-arrow signature with destructured fixtures, and the canonical test file layout.
# Scaffold a New Playwright Test
## When to Use
Use when generating a new `*.spec.ts` file or adding a new test to an existing spec. The output should be CI-grade from line one - no `page.click("text=...")`, no `waitForTimeout`, no `beforeEach` for what should be a fixture.
## Output
For a screen with no shared setup:
```typescript
// tests/checkout.spec.ts
import { test, expect } from "@playwright/test";

test.describe("checkout flow", () => {
  test("user can place an order", async ({ page }) => {
    await page.goto("/products/widget");
    await page.getByRole("button", { name: "Add to cart" }).click();
    await page.getByRole("link", { name: "Cart" }).click();
    await page.getByRole("button", { name: "Checkout" }).click();

    // Card number from env (every PSP has its own test cards). Expiry/CVC
    // are non-secret test-fixture values - inline literals are fine.
    await page.getByLabel("Card number").fill(process.env.E2E_TEST_CARD!);
    await page.getByLabel("Expiry").fill("12/30");
    await page.getByLabel("CVC").fill("123");
    await page.getByRole("button", { name: "Pay" }).click();

    await expect(page.getByRole("heading", { name: "Thank you" })).toBeVisible();
    await expect(page).toHaveURL(/\/orders\/\w+/);
  });
});
```
For a screen with repeated setup, use a fixture:
```typescript
// tests/fixtures.ts
import { test as base, expect } from "@playwright/test";
import { CheckoutPage } from "./pages/checkout.page";

type Fixtures = {
  checkoutPage: CheckoutPage;
};

export const test = base.extend<Fixtures>({
  // Construct the page object and navigate once per test, then hand it over.
  checkoutPage: async ({ page }, use) => {
    const checkoutPage = new CheckoutPage(page);
    await checkoutPage.goto();
    await use(checkoutPage);
  },
});

export { expect };
```
```typescript
// tests/pages/checkout.page.ts
import type { Page, Locator } from "@playwright/test";

export class CheckoutPage {
  readonly cardNumber: Locator;
  readonly expiry: Locator;
  readonly cvc: Locator;
  readonly payButton: Locator;

  constructor(private readonly page: Page) {
    this.cardNumber = page.getByLabel("Card number");
    this.expiry = page.getByLabel("Expiry");
    this.cvc = page.getByLabel("CVC");
    this.payButton = page.getByRole("button", { name: "Pay" });
  }

  async goto() {
    await this.page.goto("/checkout");
  }

  async pay(card: string, expiry: string, cvc: string) {
    await this.cardNumber.fill(card);
    await this.expiry.fill(expiry);
    await this.cvc.fill(cvc);
    await this.payButton.click();
  }
}
```
```typescript
// tests/checkout.spec.ts
import { test, expect } from "./fixtures";

test("user can pay", async ({ checkoutPage, page }) => {
  await checkoutPage.pay(process.env.E2E_TEST_CARD!, "12/30", "123");
  await expect(page.getByRole("heading", { name: "Thank you" })).toBeVisible();
});
```
## Rules baked into the scaffold
1. **Imports**: `test`, `expect`, `Page`, `Locator` from `@playwright/test` only. Never from `playwright` or `playwright-core`.
2. **Locators**: `getByRole` first; `getByLabel`/`getByText`/`getByPlaceholder` next; `getByTestId` last. No CSS, no XPath, no `text=`.
3. **Assertions**: `await expect(locator).toBe...()`. Never `expect(await locator.isVisible()).toBeTruthy()`. Always `await` the assertion.
4. **No timing helpers**: no `page.waitForTimeout`, no `page.waitForSelector` followed by an action. Locator actions auto-wait.
5. **Fixtures over `beforeEach`**: if you'd write `beforeEach` to set up a value, write a fixture instead.
6. **POM as locator getters**: page objects expose `Locator` properties, not `ElementHandle`, and never call `expect` themselves.
7. **Secrets via env**: `process.env.E2E_USER!` not hardcoded strings.
8. **No `test.only`**: ever. With `forbidOnly: !!process.env.CI` in the config, a stray `test.only` fails the CI run instead of silently skipping the rest of the suite.
9. **`test.describe` for grouping** when the file has more than one logical scenario; otherwise top-level `test()` is fine.
## Workflow
1. Identify the screen / flow under test.
2. Decide: standalone test, or POM-backed fixture? Use POM when the same locators appear in 2+ tests.
3. Pick semantic locators based on the rendered DOM (use `npx playwright codegen <url>` to bootstrap, then refactor away the `getByText` overuse).
4. Add web-first assertions at every meaningful state transition.
5. Run `npx playwright test --ui` locally before committing.
6. Confirm `npx playwright test --reporter=list path/to/spec.ts` passes headlessly.
## Common mistakes to refuse
- A `beforeEach` that constructs a POM and stores it in module-level state. Use a fixture.
- An `await page.waitForTimeout(N)` "just to be safe." Replace with the assertion that justifies the wait.
- A `try/catch` around an assertion to "make tests more robust." Web-first assertions handle their own retry. Catching the failure hides bugs.
- A locator like `page.locator("body > div:nth-child(3) > .foo")`. Refactor to a semantic locator.
- A hardcoded `"alice@example.com"` and `"hunter2"` in the test body. Move to env vars.

Set up Playwright authentication via the setup-project + storageState pattern. Covers single shared session (read-only tests), per-worker storage state (mutating tests), multi-role projects (admin/user/guest), and JWT bypass tokens for the fastest deterministic auth.
# Set up Playwright Authentication
## When to Use
When the application under test requires login, and tests beyond a smoke check need an authenticated session. The naive approach is to log in via the UI in `beforeEach` - this is slow and flaky. Use one of the four patterns below depending on what your tests do.
## Decision tree
- **Read-only tests, single user is fine** → Pattern 1 (single shared `storageState`).
- **Tests that mutate user state** → Pattern 2 (per-worker `storageState`).
- **Tests across multiple roles (admin/user/guest)** → Pattern 3 (multi-role projects).
- **Want the fastest, most deterministic auth and the app can mint test JWTs** → Pattern 4 (JWT bypass).
## Pattern 1: Single shared storageState
### Step 1: write the setup file
```typescript
// tests/auth.setup.ts
import { test as setup, expect } from "@playwright/test";
import * as fs from "node:fs";
import * as path from "node:path";

const authFile = "playwright/.auth/user.json";

setup("authenticate", async ({ page }) => {
  fs.mkdirSync(path.dirname(authFile), { recursive: true });

  await page.goto("/login");
  await page.getByLabel("Email").fill(process.env.E2E_USER!);
  await page.getByLabel("Password").fill(process.env.E2E_PASS!);
  await page.getByRole("button", { name: "Sign in" }).click();

  // Wait for a real post-login indicator BEFORE saving state.
  await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
  await page.context().storageState({ path: authFile });
});
```
### Step 2: wire into config
```typescript
// playwright.config.ts (excerpt)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    { name: "setup", testMatch: /.*\.setup\.ts/ },
    {
      name: "chromium",
      use: {
        ...devices["Desktop Chrome"],
        storageState: "playwright/.auth/user.json",
      },
      dependencies: ["setup"],
    },
  ],
});
```
### Step 3: gitignore the auth dir
```gitignore
# .gitignore
playwright/.auth/
```
### Step 4: set env vars
Local: `.env` (also gitignored), loaded via `dotenv` at the top of `playwright.config.ts`. CI: repo secrets.
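
A minimal sketch of the local loading step, assuming the `dotenv` package is installed and the config file is compiled as CommonJS (so `__dirname` is available). The fail-fast loop is an addition for illustration, not something Playwright requires; the variable names match the setup file above.

```typescript
// playwright.config.ts (excerpt)
import { defineConfig } from "@playwright/test";
import dotenv from "dotenv";
import path from "node:path";

// Load .env from the directory that holds this config. On CI the file is
// absent, dotenv does nothing, and the runner-provided secrets are used.
dotenv.config({ path: path.resolve(__dirname, ".env") });

// Illustrative guard: fail before any browser starts rather than typing
// "undefined" into the login form.
for (const name of ["E2E_USER", "E2E_PASS"]) {
  if (!process.env[name]) {
    throw new Error(`Missing required env var: ${name}`);
  }
}

export default defineConfig({
  // ...projects as in Step 2...
});
```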
## Pattern 2: Per-worker storageState
Use when tests create/update/delete data tied to the logged-in user. Sharing one user across workers causes data races.
```typescript
// tests/fixtures.ts
import { test as base, expect } from "@playwright/test";
import * as fs from "node:fs";
import * as path from "node:path";

type WorkerFixtures = { workerStorageState: string };

export const test = base.extend<{}, WorkerFixtures>({
  // Override the built-in storageState with the worker-scoped one.
  storageState: ({ workerStorageState }, use) => use(workerStorageState),

  workerStorageState: [
    async ({ browser }, use, workerInfo) => {
      const file = path.resolve(
        `playwright/.auth/${workerInfo.workerIndex}.json`,
      );

      // Log in once per worker; later tests in the same worker reuse the file.
      if (!fs.existsSync(file)) {
        fs.mkdirSync(path.dirname(file), { recursive: true });
        const ctx = await browser.newContext({ storageState: undefined });
        const page = await ctx.newPage();
        const email = `e2e-worker-${workerInfo.workerIndex}@example.com`;
        const password = process.env.E2E_PASS!;

        await page.goto("/login");
        await page.getByLabel("Email").fill(email);
        await page.getByLabel("Password").fill(password);
        await page.getByRole("button", { name: "Sign in" }).click();
        await expect(
          page.getByRole("heading", { name: "Dashboard" }),
        ).toBeVisible();

        await ctx.storageState({ path: file });
        await ctx.close();
      }

      await use(file);
    },
    { scope: "worker" },
  ],
});

export { expect };
```
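The accounts (`e2e-worker-0@example.com`, `e2e-worker-1@example.com`, ...) must already exist in your test database. A migration or a seed script that runs before the tests is the simplest approach. Seed more accounts than you have workers: `workerIndex` is unique per worker process, and a worker restarted after a test failure gets a new, higher index. If you want indices bounded by the worker count, key on `workerInfo.parallelIndex` instead.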
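
A seed script needs to produce the same email scheme the fixture logs in with. The following is a minimal sketch: `workerAccounts` and `SeedAccount` are hypothetical names, and how the accounts are actually written to the database is left to your own seeding code.

```typescript
// Hypothetical helper for a seed script. It only builds the account list;
// inserting the rows is up to the project's own migration or seed tooling.
type SeedAccount = { email: string; password: string };

function workerAccounts(count: number, password: string): SeedAccount[] {
  if (!Number.isInteger(count) || count < 1) {
    throw new RangeError("count must be a positive integer");
  }
  // Index i maps to e2e-worker-<i>@example.com, matching the fixture.
  return Array.from({ length: count }, (_, index) => ({
    email: `e2e-worker-${index}@example.com`,
    password,
  }));
}
```

Calling `workerAccounts(8, process.env.E2E_PASS!)` for a four-worker run would leave headroom for worker restarts.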
## Pattern 3: Multi-role projects
```typescript
// tests/auth.setup.ts
import { test as setup, expect, type Page } from "@playwright/test";
import * as fs from "node:fs";
import * as path from "node:path";

async function authenticate(
  page: Page,
  email: string,
  password: string,
  postLoginHeading: string,
  file: string,
) {
  fs.mkdirSync(path.dirname(file), { recursive: true });

  await page.goto("/login");
  await page.getByLabel("Email").fill(email);
  await page.getByLabel("Password").fill(password);
  await page.getByRole("button", { name: "Sign in" }).click();

  // Always scope to a known post-login heading. A bare getByRole("heading")
  // matches the /login page's own heading and saves pre-login cookies.
  await expect(page.getByRole("heading", { name: postLoginHeading })).toBeVisible();
  await page.context().storageState({ path: file });
}

setup("authenticate as admin", async ({ page }) => {
  await authenticate(
    page,
    process.env.E2E_ADMIN_USER!,
    process.env.E2E_ADMIN_PASS!,
    "Admin dashboard",
    "playwright/.auth/admin.json",
  );
});

setup("authenticate as user", async ({ page }) => {
  await authenticate(
    page,
    process.env.E2E_USER!,
    process.env.E2E_PASS!,
    "Dashboard",
    "playwright/.auth/user.json",
  );
});
```
```typescript
// playwright.config.ts (excerpt, inside defineConfig({ ... }))
projects: [
  { name: "setup", testMatch: /.*\.setup\.ts/ },
  {
    name: "chromium-admin",
    testMatch: /.*\.admin\.spec\.ts/,
    use: { ...devices["Desktop Chrome"], storageState: "playwright/.auth/admin.json" },
    dependencies: ["setup"],
  },
  {
    name: "chromium-user",
    testMatch: /.*\.user\.spec\.ts/,
    use: { ...devices["Desktop Chrome"], storageState: "playwright/.auth/user.json" },
    dependencies: ["setup"],
  },
],
```
For one-off role flips, override per-test instead:
```typescript
test.use({ storageState: "playwright/.auth/admin.json" });
test("admin can delete users", async ({ page }) => {
/* ... */
});
```
## Pattern 4: JWT bypass tokens
The fastest auth pattern - no UI, no flake. Requires app cooperation.
### App side (Node/TS example)
```typescript
// In the app, only mounted when E2E_TOKEN_SECRET is set.
if (process.env.E2E_TOKEN_SECRET) {
  app.post("/test/token", (req, res) => {
    // Refuse callers that do not present the shared secret.
    if (req.headers["x-e2e-secret"] !== process.env.E2E_TOKEN_SECRET) {
      return res.status(403).end();
    }
    const token = signSessionToken({ sub: req.body.sub, role: req.body.role });
    res.json({ token });
  });
}
```
### Test side
```typescript
// tests/fixtures.ts
import { test as base, expect, type Page } from "@playwright/test";

type Fixtures = { authedPage: Page };

export const test = base.extend<Fixtures>({
  authedPage: async ({ browser }, use) => {
    const ctx = await browser.newContext();

    // Ask the app for a signed session token instead of driving the login UI.
    const res = await ctx.request.post("/test/token", {
      data: { sub: "alice", role: "admin" },
      headers: { "x-e2e-secret": process.env.E2E_TOKEN_SECRET! },
    });
    const { token } = await res.json();

    await ctx.addCookies([
      {
        name: "session",
        value: token,
        url: process.env.E2E_BASE_URL!,
        httpOnly: true,
        secure: true,
        sameSite: "Lax",
      },
    ]);

    const page = await ctx.newPage();
    await use(page);
    await ctx.close();
  },
});

export { expect };
```
Constraints:
- The `/test/token` endpoint must be gated by an env var that is unset in production.
- It must refuse without the matching secret header.
- It must sign with the same key as the real auth flow.
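
The app-side snippet calls `signSessionToken` without showing it. Below is one possible sketch using only `node:crypto` and an HS256-style token. It is an assumption about what such a helper could look like, not the app's real implementation: the explicit `secret` parameter, the `ttlSeconds` default, and `verifySessionToken` are all hypothetical, and a real app would more likely reuse its existing JWT library and signing key.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type SessionClaims = { sub: string; role: string };
type SignedClaims = SessionClaims & { iat: number; exp: number };

const b64url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

// Hypothetical signer: header.payload.signature, HMAC-SHA256 over the first two parts.
function signSessionToken(claims: SessionClaims, secret: string, ttlSeconds = 3600): string {
  const now = Math.floor(Date.now() / 1000);
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify({ ...claims, iat: now, exp: now + ttlSeconds }));
  const signature = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  return `${header}.${payload}.${signature}`;
}

// Hypothetical verifier: returns the claims, or null if the signature or expiry is bad.
function verifySessionToken(token: string, secret: string): SignedClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = createHmac("sha256", secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, "base64url");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as SignedClaims;
  return claims.exp > Math.floor(Date.now() / 1000) ? claims : null;
}
```

Whatever the real helper looks like, the third constraint is the one to check: a token minted by `/test/token` should pass the same verification path as a token from the normal login flow.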
## Common mistakes
- **Saving `storageState` before login completes** - file has only the pre-login cookies. Always assert a post-login indicator before saving.
- **Setup project failing silently** - dependents are skipped (not failed), and if your reporter only shows non-skipped, you see green CI with no tests run. Always inspect the setup project's report when downstream tests are mass-skipped.
- **Per-worker auth file collision** - two workers writing to the same path. Use `workerInfo.workerIndex` in the file path.
- **`storageState: "..."` pointing at a non-existent file** - context creation fails with a file-not-found error, so every test in the project errors before it starts. Add `dependencies: ["setup"]` so the file is guaranteed to exist.
- **Committing `playwright/.auth/`** - leaks live session cookies. Always gitignore.
- **Logging the password** - it ends up in `trace.zip` and HTML reports. Never `console.log(process.env.E2E_PASS)`.

Scan a Playwright codebase for tracked anti-patterns. Most are reliably grep-detectable: text/CSS engine selectors, page.$/$$ ElementHandles, page.waitForTimeout, isVisible/textContent booleanised, page.route after goto, hardcoded credentials, missing forbidOnly, deprecated playwright-github-action, headless: false committed, mode 'serial' default. A few (assertion missing await, untyped test.extend) need manual review.
# Validate a Playwright Codebase
## When to Use
When auditing an existing Playwright suite, when reviewing a PR that touches `*.spec.ts` files, or before promoting a suite from "smoke" to "blocking CI." The grep patterns below catch the regressions documented in the `playwright-anti-patterns` rule.
## Scope
Run from the repo root. The patterns assume tests live under `tests/` or `e2e/`. Adjust the path if your project differs.
```bash
TEST_DIR=tests # or e2e, or wherever your *.spec.ts live
```
## CRITICAL: hardcoded credentials in test files
```bash
# Heuristic - email-like or password-like string literals on any fill() line.
# The two-step pipe handles both forms:
# page.fill("#email", "alice@acme.com") // email is the SECOND arg
# page.getByLabel("Email").fill("alice@acme.com") // email is the FIRST arg
grep -rnE '(\.fill\(|page\.fill\()' --include='*.ts' "$TEST_DIR" | grep -E '"[^"]+@[^"]+\.[^"]+"'
grep -rnE '(\.fill\(|page\.fill\()' --include='*.ts' "$TEST_DIR" | grep -iE '"(p[a@]ssw[o0]rd|hunter2|admin|secret)"'
```
Manual triage. The password grep will false-positive on `getByLabel("Password").fill(process.env.E2E_PASS!)` because `"Password"` matches the pattern and `.fill(` is on the same line. Ignore matches where the `.fill(` argument is a `process.env.*` reference. The email grep does not have this problem in practice. If the email literal is a fixture domain (`@example.com`), it may be intentional - but the password should still come from env.
## CRITICAL: missing await on web-first assertion
```bash
# Lines starting with `expect(` (no await) followed by a chained matcher.
# This is a heuristic - manual triage required because `expect(value).toBe(...)`
# on synchronous values is legitimate.
grep -rnE '^\s*expect\(.*\)\.(toBeVisible|toBeHidden|toHaveText|toHaveURL|toHaveTitle|toContainText|toHaveCount|toBeAttached|toBeEnabled|toBeDisabled|toBeFocused)' --include='*.ts' "$TEST_DIR"
```
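Every line returned is missing `await` before `expect(`, because the pattern only matches lines that start with `expect(`. An un-awaited web-first assertion returns a promise that nobody waits on, so the test can finish (and pass) before the check resolves. Add `await` to each hit.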
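
The same heuristic can be sketched in TypeScript for use inside a custom audit script. This is an illustration only: `findUnawaitedAssertions` and `ASYNC_MATCHERS` are hypothetical names, the check is line-based like the grep, and unlike the grep it also accepts a `.not.` before the matcher.

```typescript
// Web-first matchers that return a promise and therefore need `await`.
const ASYNC_MATCHERS = [
  "toBeVisible", "toBeHidden", "toHaveText", "toHaveURL", "toHaveTitle",
  "toContainText", "toHaveCount", "toBeAttached", "toBeEnabled",
  "toBeDisabled", "toBeFocused",
];

// Returns the 1-based line numbers where a line starts with `expect(`
// (no leading `await`) and chains one of the async matchers.
function findUnawaitedAssertions(source: string): number[] {
  const pattern = new RegExp(
    `^\\s*expect\\(.*\\)\\.(not\\.)?(${ASYNC_MATCHERS.join("|")})\\(`,
  );
  return source
    .split("\n")
    .flatMap((line, index) => (pattern.test(line) ? [index + 1] : []));
}
```

Like the grep, this misses assertions that are split across lines, so it complements manual review rather than replacing it.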
## ERROR: `page.waitForTimeout`
```bash
grep -rnE '\bpage\.waitForTimeout\(' --include='*.ts' "$TEST_DIR"
```
Every hit is wrong in a test body. Replace with the web-first assertion that justifies the wait.
## ERROR: text/CSS/XPath engine selectors
```bash
# `text=`, `css=`, `xpath=` engine selectors in click/fill/locator calls.
# The bracket expression covers double-quote, single-quote, and backtick.
# (`\x27` is not a portable ERE escape, so the quote characters are listed literally.)
grep -rnE "(click|fill|locator)\([\"'\`](text=|css=|xpath=)" --include='*.ts' "$TEST_DIR"
```
Replace with `page.getByRole / getByLabel / getByText / getByPlaceholder`.
## ERROR: `page.$()` / `page.$$()` ElementHandles
```bash
grep -rnE '\bpage\.\$\$?\(' --include='*.ts' "$TEST_DIR"
```
Both return `ElementHandle` (snapshot, racy). Replace with `page.locator(...)` or, better, a `getBy*` call.
## ERROR: `expect(await locator.isVisible()).toBeTruthy()`
```bash
grep -rnE 'expect\(await\s+\S+\.(isVisible|isHidden|isEnabled|isDisabled|isChecked|isEditable|textContent|innerText|inputValue)\(\)\)' --include='*.ts' "$TEST_DIR"
```
Every hit loses auto-retry. Replace with the web-first equivalent: `await expect(loc).toBeVisible()`, `toHaveText()`, etc.
## ERROR: `assert` from `node:assert`
```bash
# Quote characters are listed literally; `\x27` is not a portable ERE escape.
grep -rnE "from\s+[\"'](node:)?assert[\"']" --include='*.ts' "$TEST_DIR"
grep -rnE "require\([\"'](node:)?assert[\"']\)" --include='*.ts' "$TEST_DIR"
```
Use Playwright's `expect` so failures show up in trace and reporter.
## ERROR: missing `forbidOnly`
```bash
# In playwright.config.ts (or .js/.mts) - confirm forbidOnly is set.
# -r is required: BSD grep on macOS does NOT recurse without it, even with --include.
grep -rlE 'defineConfig\(' --include='playwright.config.*' . 2>/dev/null | while read -r f; do
  grep -q 'forbidOnly' "$f" || echo "$f: forbidOnly missing"
done
```
`forbidOnly: !!process.env.CI` should be in the top-level `defineConfig({...})`.
## ERROR: missing `webServer` block
```bash
grep -rlE 'defineConfig\(' --include='playwright.config.*' . 2>/dev/null | while read -r f; do
  grep -q 'webServer' "$f" || echo "$f: webServer missing - tests may race the dev server"
done
```
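
For reference, a config that passes both checks could look like the sketch below. The `command`, `url`, and `baseURL` values are placeholders for illustration; substitute whatever starts and serves your app.

```typescript
// playwright.config.ts (excerpt)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // A stray test.only fails the run on CI instead of narrowing the suite.
  forbidOnly: !!process.env.CI,

  // Playwright starts the server and waits for the URL before running tests.
  webServer: {
    command: "npm run dev", // placeholder: your dev/start script
    url: "http://localhost:3000", // placeholder: where the app listens
    reuseExistingServer: !process.env.CI,
  },

  use: {
    baseURL: "http://localhost:3000", // placeholder, matches webServer.url
  },
});
```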
## ERROR: `Page` / `Locator` imported from `playwright` (not `@playwright/test`)
```bash
grep -rnE "from\s+[\"']playwright(-core)?[\"']" --include='*.ts' "$TEST_DIR"
```
Use `import type { Page, Locator } from "@playwright/test"`.
## WARN: `page.waitForSelector` followed by an action
```bash
grep -rn -A 2 'page\.waitForSelector' --include='*.ts' "$TEST_DIR"
```
Manual triage. If the next line is a click/fill on the same selector, the wait is redundant.
## WARN: `page.route` after `page.goto` (race condition)
```bash
# Heuristic two-step: list files with both, then manually inspect order.
for f in $(grep -rl 'page\.route' --include='*.ts' "$TEST_DIR"); do
  if grep -q 'page\.goto' "$f"; then
    echo "=== $f ==="
    grep -nE 'page\.(route|goto)' "$f"
  fi
done
```
Manual triage. `page.route(...)` should appear before `page.goto(...)` for the relevant route.
## WARN: route handler with no `continue` or `fallback`
```bash
# Files with page.route / context.route but no continue/fallback at all.
# Handlers that always fulfill or abort are legitimate; triage those out by hand.
for f in $(grep -rlE '(page|context)\.route' --include='*.ts' "$TEST_DIR"); do
  grep -qE 'route\.(continue|fallback)' "$f" || echo "$f: route handler with no continue/fallback - non-matching requests may hang"
done
```
## WARN: `headless: false` in config
```bash
grep -rnE 'headless\s*:\s*false' --include='playwright.config.*' --include='*.ts' .
```
Headless is the default. Override ad-hoc with `--headed`. Don't commit it.
## WARN: `trace: "on"` in config
```bash
grep -rnE "trace\s*:\s*[\"']on[\"']" --include='playwright.config.*' .
```
Use `trace: "on-first-retry"` for CI. `"on"` produces large artifacts every run.
## WARN: untyped `test.extend`
```bash
# test.extend without a type parameter
grep -rnE '\bextend\(\s*\{' --include='*.ts' "$TEST_DIR" | grep -v '<'
```
Manual triage. Should be `base.extend<Fixtures>({...})`.
## WARN: `data-testid` overuse when role exists
This needs manual review - grep cannot tell whether a role-based locator would work. Spot-check `getByTestId(...)` calls; if the underlying element is a `<button>`, `<a>`, `<input>`, or has a heading role, prefer `getByRole`.
```bash
grep -rnE '\bgetByTestId\(' --include='*.ts' "$TEST_DIR"
```
## WARN: deprecated `microsoft/playwright-github-action`
```bash
grep -rnE 'microsoft/playwright-github-action' --include='*.yml' --include='*.yaml' .github/
```
Replace with raw `npx playwright install --with-deps` + `npx playwright test`.
## WARN: `test.describe.configure({ mode: 'serial' })` as default
```bash
grep -rnE "describe\.configure\(\s*\{\s*mode\s*:\s*('|\")serial" --include='*.ts' "$TEST_DIR"
```
Serial mode disables isolation and skips remaining tests on first failure. Use only when shared session is truly required - independent of any other check above.
## SUGGESTION: hardcoded viewport over `devices`
```bash
grep -rnE 'viewport\s*:\s*\{' --include='*.ts' "$TEST_DIR"
```
Manual triage. If the dimensions match a known device profile, use `...devices['iPhone 14']` or similar.
## Output format
Tag each finding with severity (CRITICAL / ERROR / WARN / SUGGESTION) and emit `file:line - one-line problem - one-line fix`. Group by file. End with `N critical, N errors, N warnings, N suggestions`.
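
One way to produce that format from collected findings is sketched below. The `Finding` shape and the `formatReport` name are assumptions for illustration; the skill itself only specifies the text layout.

```typescript
type Severity = "CRITICAL" | "ERROR" | "WARN" | "SUGGESTION";

type Finding = {
  severity: Severity;
  file: string;
  line: number;
  problem: string;
  fix: string;
};

// Groups findings by file (in first-seen order), sorts each group by line,
// and appends the summary line.
function formatReport(findings: Finding[]): string {
  const byFile = new Map<string, Finding[]>();
  for (const finding of findings) {
    const group = byFile.get(finding.file) ?? [];
    group.push(finding);
    byFile.set(finding.file, group);
  }

  const lines: string[] = [];
  for (const [file, group] of byFile) {
    lines.push(file);
    for (const f of [...group].sort((a, b) => a.line - b.line)) {
      lines.push(`  [${f.severity}] ${f.file}:${f.line} - ${f.problem} - ${f.fix}`);
    }
  }

  const count = (severity: Severity): number =>
    findings.filter((f) => f.severity === severity).length;
  lines.push(
    `${count("CRITICAL")} critical, ${count("ERROR")} errors, ` +
      `${count("WARN")} warnings, ${count("SUGGESTION")} suggestions`,
  );
  return lines.join("\n");
}
```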
## What this skill does NOT catch
- `data-testid` overuse where role is appropriate (manual review).
- Logic bugs in fixture lifecycle.
- POM design quality.
- Visual regression baseline OS mismatch (`__screenshots__/` exists for the wrong OS).
- Real flake from network timing.

Set up Playwright visual regression with toHaveScreenshot: mask volatile regions, animations: 'disabled', maxDiffPixels, the macOS-vs-Linux baseline trap (the #1 visual-regression gotcha), the CI workflow for diff review, and when to reach for Argos / Chromatic / Percy instead of the built-in.
# Set up Playwright Visual Regression
## When to Use
When you want to catch unintended UI changes - layout shifts, color regressions, missing elements - that pass functional tests. Visual regression is a complement to web-first assertions, not a replacement.
## Decision tree
- **Single-browser project, small surface, can pin CI to one OS** → Built-in `toHaveScreenshot()` (this skill).
- **PR-attached visual diff approval flow, OSS-friendly** → Argos (`@argos-ci/playwright`).
- **Storybook design system** → Chromatic.
- **Need cross-OS rendering in the cloud (sidesteps macOS-vs-Linux)** → Percy.
This skill covers the built-in flow. For the third-party services, follow their docs.
## Step 1: write the visual test
```typescript
// tests/visual/home.visual.spec.ts
import { test, expect } from "@playwright/test";

test("home page", async ({ page }) => {
  await page.goto("/");

  // Always wait for the page to be visually stable BEFORE the screenshot.
  await expect(page.getByRole("heading", { name: "Welcome" })).toBeVisible();

  await expect(page).toHaveScreenshot("home.png", {
    fullPage: true,
    mask: [
      page.locator("[data-testid='timestamp']"),
      page.locator(".live-counter"),
      page.locator(".user-avatar"),
    ],
    animations: "disabled",
    maxDiffPixels: 100,
  });
});
```
Key options:
- **`mask`** - locators whose pixels are replaced with magenta blocks before comparison. Use for volatile regions (timestamps, counters, avatars, ads).
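- **`animations: "disabled"`** - stops CSS animations, CSS transitions, and Web Animations before capture. Already the default for `toHaveScreenshot()`; explicit is fine. The text-input caret is handled by the separate `caret` option, which defaults to `"hide"`.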
- **`maxDiffPixels`** - absolute pixel-count tolerance. Unset by default.
- **`threshold`** - 0 to 1, YIQ pixelmatch sensitivity. Default 0.2. Lower = stricter.
- **`stylePath`** - inject CSS before capture (useful to hide volatile elements without DOM-level masking).
- **`fullPage`** - capture beyond the viewport.
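
If most visual tests share the same tolerances, they can be set once in the config instead of on every call. A minimal sketch, assuming the values from the example above are the ones you want project-wide; per-call options still override these defaults.

```typescript
// playwright.config.ts (excerpt)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  expect: {
    // Defaults applied to every toHaveScreenshot() call.
    toHaveScreenshot: {
      animations: "disabled",
      maxDiffPixels: 100, // illustrative value, tune per project
      threshold: 0.2, // Playwright's default, shown for visibility
    },
  },
});
```

`mask` and `fullPage` stay at the call site because they depend on the page under test.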
## Step 2: a dedicated `visual` project (load-bearing)
This is where the macOS-vs-Linux gotcha lives. **Snapshot baselines from your dev macOS will not match CI Linux.** Solve it with a dedicated project that runs only on Linux, with baselines committed only from CI artifacts.
```typescript
// playwright.config.ts (excerpt)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  projects: [
    // ... existing functional projects ...
    {
      name: "visual",
      testMatch: /.*\.visual\.spec\.ts/,
      use: { ...devices["Desktop Chrome"] },
      // Only run when we explicitly opt in - avoids cross-OS baseline drift in dev.
      // Two equivalent ways to express "skip unless RUN_VISUAL is set":
      //   testIgnore: process.env.RUN_VISUAL ? undefined : ['**'],  // official option
      //   grep: process.env.RUN_VISUAL ? undefined : /__never__/,   // sentinel pattern
      testIgnore: process.env.RUN_VISUAL ? undefined : ["**"],
    },
  ],
});
```
Locally: `RUN_VISUAL=1 npx playwright test --project=visual`. CI: set `RUN_VISUAL=1` only on Linux runners.
## Step 3: the baseline workflow
1. Developer pushes a UI change.
2. CI runs `RUN_VISUAL=1 npx playwright test --project=visual` and fails on diff.
3. Developer opens the HTML report (`playwright-report/index.html`) or the trace, reviews the diff (expected / actual / diff side-by-side).
4. If the diff is intentional, developer runs `RUN_VISUAL=1 npx playwright test --project=visual --update-snapshots` **on a Linux machine** (or downloads the CI artifact and commits the new PNG).
5. PR approver re-runs CI; baselines now match.
**Never auto-update snapshots in CI.** That defeats the purpose.
## Step 4: snapshot file layout
Snapshots go to `<testfile>-snapshots/<name>-<projectName>-<platform>.png`. Example: `tests/visual/home.visual.spec.ts-snapshots/home-visual-linux.png`.
Commit the `linux` PNGs only. Add to `.gitignore`:
```gitignore
# Reject non-Linux snapshots (commit only what CI produces)
**/*-snapshots/*-darwin.png
**/*-snapshots/*-darwin-arm64.png
**/*-snapshots/*-win32.png
```
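
The naming scheme can be expressed as a small helper, which is handy in an audit script that checks whether a Linux baseline exists for each screenshot. This is a simplified sketch of the default layout described above: `defaultSnapshotPath` is a hypothetical name, and it ignores custom `snapshotDir` or `snapshotPathTemplate` settings and any filename sanitising Playwright may apply.

```typescript
import * as path from "node:path";

// Builds <testfile>-snapshots/<name>-<projectName>-<platform><ext>.
function defaultSnapshotPath(
  testFile: string,
  screenshotName: string,
  projectName: string,
  platform: string = process.platform,
): string {
  const { name, ext } = path.posix.parse(screenshotName);
  return `${testFile}-snapshots/${name}-${projectName}-${platform}${ext}`;
}

// Example from the text: the Linux baseline for the home page test.
const linuxBaseline = defaultSnapshotPath(
  "tests/visual/home.visual.spec.ts",
  "home.png",
  "visual",
  "linux",
);
```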
## Step 5: CI workflow snippet
```yaml
# .github/workflows/visual.yml
name: visual
on:
  pull_request:
    paths:
      - "src/**"
      - "tests/visual/**"
      - "playwright.config.ts"
jobs:
  visual:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # pnpm must be installed before setup-node can cache its store.
      # Assumes package.json has a "packageManager" field that pins the pnpm version.
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: pnpm }
      - run: pnpm install --frozen-lockfile
      - run: npx playwright install --with-deps chromium
      - run: npx playwright test --project=visual --reporter=html
        env:
          RUN_VISUAL: "1"
      - if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-visual-report
          path: playwright-report/
          retention-days: 14
```
## Common mistakes to refuse
- **Committing macOS baselines** - they will not match CI Linux. Always regenerate from a Linux runner.
- **Forgetting `mask:` for volatile regions** - timestamps, counters, avatars cause every run to fail. Mask them out.
- **Screenshot taken before the page is stable** - flake. Always assert the page reached its final state first (`await expect(...).toBeVisible()`).
- **`maxDiffPixels: 1000`** to "stop the flake" - hides real regressions. Lower the tolerance, fix the source of variance.
- **`threshold: 1`** ditto - that's "anything passes."
- **`fullPage: true` on a long scrolling page with lazy-loaded content** - bottom of the page is unloaded when the screenshot is taken. Either scroll-and-wait first, or screenshot the visible region only.
- **Auto-update in CI** - defeats the purpose. Only update locally (or by downloading the CI artifact) after a human eyeballs the diff.
## When to reach for a third-party service
The built-in is good when:
- You have one OS (Linux CI) and one browser project for visuals.
- The team can review diffs in the HTML report.
- You don't need a PR-attached approval flow.
Reach for **Argos** when you want a polished review UI for free; **Chromatic** when you have a Storybook design system; **Percy** when you need cross-OS rendering in the cloud (sidesteps the baseline-OS problem entirely).
## Reference
- [Visual comparisons](https://playwright.dev/docs/test-snapshots)
- [SnapshotAssertions API](https://playwright.dev/docs/api/class-snapshotassertions)