---
name: browser
description: Use cmux browser automation for browser access. Validate real user-facing behavior in browser surfaces, including navigation, DOM interaction, inspection, storage, tabs, dialogs, frames, downloads, and browser logs.
---
# Browser automation with cmux
Use this skill for browser-driven validation through `cmux browser`.
## Goals
- Verify real user-facing behavior.
- Access and inspect web pages.
## When to use
Use this skill when the task involves any of these:
- Opening or navigating a page
- Testing a UI flow end to end
- Filling forms or clicking controls
- Verifying visible text, URL, title, values, attributes, or counts
- Capturing snapshots or screenshots
- Checking console logs or browser errors
- Working with cookies, storage, or saved browser state
- Handling tabs, dialogs, iframes, or downloads
Do not use this skill for unit tests, static analysis, or API-only checks unless browser behavior is part of the task.
## Operating rules
1. Start by identifying or opening the browser surface you will use.
2. Wait for a stable state before interacting.
3. Prefer stable selectors and structured getters over visual guesswork.
4. After each mutating action, verify the result with URL, text, visibility, or value checks.
5. Treat `console` and `errors` output as evidence, not optional noise.
6. On failure, collect artifacts before concluding root cause.
7. Use `eval` only when the browser commands cannot express the check directly.
8. Keep credentials and secrets out of logs, commands, screenshots, and saved state when possible.
## Verified command surface
The commands below are verified against the official cmux browser automation docs.
### Targeting a browser surface
```bash
cmux browser open https://example.com
cmux browser open-split https://example.com
cmux browser identify
cmux browser identify --surface surface:2
cmux browser surface:2 url
cmux browser --surface surface:2 url
```
### Navigation and focus
```bash
cmux browser surface:2 navigate https://example.com/docs --snapshot-after
cmux browser surface:2 back
cmux browser surface:2 forward
cmux browser surface:2 reload --snapshot-after
cmux browser surface:2 focus-webview
cmux browser surface:2 is-webview-focused
```
### Waiting
```bash
cmux browser surface:2 wait --load-state complete --timeout-ms 15000
cmux browser surface:2 wait --selector "#checkout" --timeout-ms 10000
cmux browser surface:2 wait --text "Order confirmed"
cmux browser surface:2 wait --url-contains "/dashboard"
cmux browser surface:2 wait --function "window.__appReady === true"
```
Prefer this order of readiness checks:
1. `--selector` when a specific element gates the next action
2. `--text` for user-visible confirmation
3. `--url-contains` for navigation assertions
4. `--function` only when the app exposes a reliable readiness flag
5. `--load-state complete` for initial page load or simple pages
Avoid fixed sleeps unless there is no reliable signal.
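One way to follow this guidance is to wrap the gating wait in a small helper that fails loudly instead of falling back to a sleep. This is a minimal sketch: `wait_for_gate` is a hypothetical name, and the sketch assumes `cmux browser ... wait` exits non-zero when the timeout elapses, which is not confirmed here.

```bash
# wait_for_gate SURFACE SELECTOR [TIMEOUT_MS]
# Hypothetical helper: block on the element that gates the next action
# instead of using a fixed sleep.
# Assumption: `wait` exits non-zero when the timeout elapses.
wait_for_gate() {
  local surface="$1" selector="$2" timeout_ms="${3:-10000}"
  if cmux browser "$surface" wait --selector "$selector" --timeout-ms "$timeout_ms"; then
    return 0
  fi
  echo "gate not ready: $selector on $surface after ${timeout_ms}ms" >&2
  return 1
}

# Example:
#   wait_for_gate surface:2 "#checkout" 10000 && cmux browser surface:2 click "#checkout"
```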
### DOM interaction
```bash
cmux browser surface:2 click "button[type='submit']" --snapshot-after
cmux browser surface:2 dblclick ".item-row"
cmux browser surface:2 hover "#menu"
cmux browser surface:2 focus "#email"
cmux browser surface:2 check "#terms"
cmux browser surface:2 uncheck "#newsletter"
cmux browser surface:2 scroll-into-view "#pricing"
cmux browser surface:2 type "#search" "cmux"
cmux browser surface:2 fill "#email" --text "ops@example.com"
cmux browser surface:2 press Enter
cmux browser surface:2 keydown Shift
cmux browser surface:2 keyup Shift
cmux browser surface:2 select "#region" "us-east"
cmux browser surface:2 scroll --dy 800 --snapshot-after
cmux browser surface:2 scroll --selector "#log-view" --dx 0 --dy 400
```
Prefer `fill` when setting a known final value.
Prefer `type` when testing keystroke-driven behavior such as debouncing, masking, shortcuts, or suggestions.
### Inspection and assertions
```bash
cmux browser surface:2 snapshot --interactive --compact
cmux browser surface:2 snapshot --selector "main" --max-depth 5
cmux browser surface:2 screenshot --out /tmp/cmux-page.png
cmux browser surface:2 get title
cmux browser surface:2 get url
cmux browser surface:2 get text "h1"
cmux browser surface:2 get html "main"
cmux browser surface:2 get value "#email"
cmux browser surface:2 get attr "a.primary" --attr href
cmux browser surface:2 get count ".row"
cmux browser surface:2 get box "#checkout"
cmux browser surface:2 get styles "#total" --property color
cmux browser surface:2 is visible "#checkout"
cmux browser surface:2 is enabled "button[type='submit']"
cmux browser surface:2 is checked "#terms"
cmux browser surface:2 find role button --name "Continue"
cmux browser surface:2 find text "Order confirmed"
cmux browser surface:2 find label "Email"
cmux browser surface:2 find placeholder "Search"
cmux browser surface:2 find alt "Product image"
cmux browser surface:2 find title "Open settings"
cmux browser surface:2 find testid "save-btn"
cmux browser surface:2 find first ".row"
cmux browser surface:2 find last ".row"
cmux browser surface:2 find nth 2 ".row"
cmux browser surface:2 highlight "#checkout"
```
Preferred selector order:
1. Stable test IDs
2. Accessible role and name
3. Labels and placeholders
4. Semantic CSS selectors
5. Text selectors
Avoid brittle selectors like deeply nested `nth-child` chains unless no better option exists.
### JavaScript and injection
```bash
cmux browser surface:2 eval "document.title"
cmux browser surface:2 eval --script "window.location.href"
cmux browser surface:2 addinitscript "window.__cmuxReady = true;"
cmux browser surface:2 addscript "document.querySelector('#name')?.focus()"
cmux browser surface:2 addstyle "#debug-banner { display: none !important; }"
```
Use `eval` sparingly. Do not bypass the UI path if the task is to validate user behavior.
### State and session data
```bash
cmux browser surface:2 cookies get
cmux browser surface:2 cookies get --name session_id
cmux browser surface:2 cookies set session_id abc123 --domain example.com --path /
cmux browser surface:2 cookies clear --name session_id
cmux browser surface:2 cookies clear --all
cmux browser surface:2 storage local set theme dark
cmux browser surface:2 storage local get theme
cmux browser surface:2 storage local clear
cmux browser surface:2 storage session set flow onboarding
cmux browser surface:2 storage session get flow
cmux browser surface:2 state save /tmp/cmux-browser-state.json
cmux browser surface:2 state load /tmp/cmux-browser-state.json
```
Use saved browser state for authenticated sessions, long setup flows, and repeatable bug repros.
Do not use saved state when validating fresh-session behavior like login, onboarding, logout, or first-run UX.
### Tabs, logs, dialogs, frames, downloads
```bash
cmux browser surface:2 tab list
cmux browser surface:2 tab new https://example.com/pricing
cmux browser surface:2 tab switch 1
cmux browser surface:2 tab switch surface:7
cmux browser surface:2 tab close
cmux browser surface:2 tab close surface:7
cmux browser surface:2 console list
cmux browser surface:2 console clear
cmux browser surface:2 errors list
cmux browser surface:2 errors clear
cmux browser surface:2 dialog accept
cmux browser surface:2 dialog accept "Confirmed by automation"
cmux browser surface:2 dialog dismiss
cmux browser surface:2 frame "iframe[name='checkout']"
cmux browser surface:2 frame main
cmux browser surface:2 download --path /tmp/report.csv --timeout-ms 30000
```
## Execution playbooks
### Default flow
Use this decision sequence unless the task clearly needs a different one.
1. Acquire a surface.
```bash
cmux browser open <URL>
cmux browser identify
```
2. Establish readiness.
```bash
cmux browser surface:<ID> wait --load-state complete --timeout-ms 15000
cmux browser surface:<ID> snapshot --interactive --compact
```
3. Choose the next action type.
- Navigation task: use `navigate`, `back`, `forward`, or `reload`, then wait again.
- Form task: use the form playbook below.
- Inspection task: use `get`, `is`, `find`, `snapshot`, or `screenshot`.
- State setup task: use `cookies`, `storage`, or `state` before continuing.
- Download, dialog, tab, or frame task: switch to the relevant specialized playbook.
4. After every mutating action, verify with at least one explicit assertion.
   Preferred assertion order:
   1. `wait --url-contains`
   2. `wait --text`
   3. `is visible`
   4. `get value`
   5. `get count`
   6. `errors list` when debugging or validating stability
5. If the assertion fails, run the failure protocol before making claims.
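The act-then-assert step can be sketched as a single helper that pairs one mutating action with one explicit assertion. `click_and_expect_url` is a hypothetical name, and the sketch assumes `click` and `wait` exit non-zero on failure or timeout.

```bash
# click_and_expect_url SURFACE SELECTOR URL_FRAGMENT [TIMEOUT_MS]
# Hypothetical helper: perform one mutating action, then assert on the URL.
# Assumption: `click` and `wait` exit non-zero on failure or timeout.
# Returns 1 if the click failed, 2 if the URL assertion failed.
click_and_expect_url() {
  local surface="$1" selector="$2" fragment="$3" timeout_ms="${4:-10000}"
  cmux browser "$surface" click "$selector" --snapshot-after || {
    echo "FAIL: click $selector" >&2
    return 1
  }
  cmux browser "$surface" wait --url-contains "$fragment" --timeout-ms "$timeout_ms" || {
    echo "FAIL: URL never contained $fragment; run the failure protocol" >&2
    return 2
  }
  echo "PASS: URL contains $fragment"
}

# Example:
#   click_and_expect_url surface:2 "button[type='submit']" "/dashboard"
```

Distinct return codes let the caller tell an action failure apart from an assertion failure before deciding what to report.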
### Form playbook
Use this for login, signup, checkout, search, and settings forms.
1. Wait for the first required field.
2. Populate fields with `fill` unless the task is specifically about typing behavior.
3. Submit with `click`, `press Enter`, or the control the user would actually use.
4. Verify success using URL, text, and visibility checks.
```bash
cmux browser surface:<ID> wait --selector "#email" --timeout-ms 10000
cmux browser surface:<ID> fill "#email" --text "$TEST_EMAIL"
cmux browser surface:<ID> fill "#password" --text "$TEST_PASSWORD"
cmux browser surface:<ID> click "button[type='submit']" --snapshot-after
cmux browser surface:<ID> wait --url-contains "/dashboard" --timeout-ms 10000
cmux browser surface:<ID> is visible "#dashboard"
```
If submission should fail, verify the expected error text or blocked state instead of forcing success assertions.
### Navigation playbook
Use this when the task is about routing, links, history, or page transitions.
1. Trigger navigation.
2. Wait for URL or page content to settle.
3. Verify the destination.
```bash
cmux browser surface:<ID> navigate https://example.com/docs --snapshot-after
cmux browser surface:<ID> wait --url-contains "/docs" --timeout-ms 10000
cmux browser surface:<ID> get title
```
### Inspection playbook
Use this when the task is observational rather than interactive.
1. Prefer `find` to discover the right selector.
2. Use `get` or `is` for structured assertions.
3. Use `snapshot` or `screenshot` only when human review is useful.
### State playbook
Use this when the task depends on auth, persisted preferences, or reproducible setup.
1. Decide whether the task should start fresh or with persisted state.
2. If fresh-session behavior matters, do not load saved state.
3. If setup reuse is justified, use `cookies`, `storage`, or `state load`.
4. After state changes, reload or navigate as needed and verify the expected state is visible.
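The fresh-versus-saved decision above can be made explicit in a wrapper so that saved state is never loaded by accident. This is only a sketch: `prepare_session` and its `fresh`/`saved` modes are hypothetical names, and it assumes `state load`, `reload`, and `wait` exit non-zero on failure.

```bash
# prepare_session SURFACE MODE [STATE_FILE]
# MODE is "fresh" or "saved" (hypothetical labels for the two cases above).
# Assumption: `state load`, `reload`, and `wait` exit non-zero on failure.
prepare_session() {
  local surface="$1" mode="$2" state_file="${3:-}"
  case "$mode" in
    fresh)
      # Fresh-session behavior matters: deliberately load nothing.
      echo "fresh session: no saved state loaded"
      ;;
    saved)
      if [ ! -f "$state_file" ]; then
        echo "state file not found: $state_file" >&2
        return 1
      fi
      cmux browser "$surface" state load "$state_file" || return 1
      # Reload so the page reflects the restored state.
      cmux browser "$surface" reload || return 1
      cmux browser "$surface" wait --load-state complete --timeout-ms 15000 || return 1
      echo "saved state loaded from $state_file"
      ;;
    *)
      echo "unknown mode: $mode (expected fresh or saved)" >&2
      return 2
      ;;
  esac
}

# Example:
#   prepare_session surface:2 saved /tmp/cmux-browser-state.json
```

The caller still needs to verify that the expected state is visible, for example with `is visible` or `get text`.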
### Tabs, dialogs, frames, and downloads playbook
- Tabs: `tab list`, `tab new`, `tab switch`, then verify with `get url` or visible text.
- Dialogs: trigger the dialog, then immediately `dialog accept` or `dialog dismiss`.
- Frames: enter with `frame <selector>`, complete the work, then return with `frame main`.
- Downloads: trigger the download, run `download --path ...`, then verify the file with shell tools if needed.
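For the download case, the "verify the file with shell tools" step might look like the sketch below. `download_and_check` is a hypothetical helper; it assumes the download has already been triggered in the page, that `download` exits non-zero on timeout, and that the file is written to the given path.

```bash
# download_and_check SURFACE DEST [TIMEOUT_MS]
# Hypothetical helper: run after the download has been triggered in the page.
# Assumption: `download` exits non-zero on timeout and writes the file to DEST.
download_and_check() {
  local surface="$1" dest="$2" timeout_ms="${3:-30000}"
  cmux browser "$surface" download --path "$dest" --timeout-ms "$timeout_ms" || {
    echo "download did not complete: $dest" >&2
    return 1
  }
  # -s is true only when the file exists and is non-empty.
  if [ ! -s "$dest" ]; then
    echo "download missing or empty: $dest" >&2
    return 1
  fi
  echo "downloaded $(wc -c < "$dest" | tr -d ' ') bytes to $dest"
}

# Example:
#   cmux browser surface:2 click "#export-csv"
#   download_and_check surface:2 /tmp/report.csv
```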
## Failure protocol
Run this whenever a flow fails or results are ambiguous:
```bash
cmux browser surface:<ID> console list
cmux browser surface:<ID> errors list
cmux browser surface:<ID> screenshot --out /tmp/cmux-failure.png
cmux browser surface:<ID> snapshot --interactive --compact
cmux browser surface:<ID> get url
cmux browser surface:<ID> get title
```
Report all of these if available:
- Failed action
- Expected behavior
- Actual behavior
- Current URL
- Current title
- Relevant visible text
- Console findings
- Browser error findings
- Artifact paths
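To make artifact paths easy to report, the failure protocol can be wrapped so that every output lands in one directory. This is a sketch under assumptions: `collect_failure_artifacts` and the file names are hypothetical, and it assumes each command prints its result to stdout.

```bash
# collect_failure_artifacts SURFACE [OUT_DIR]
# Hypothetical helper: run the failure protocol and keep every output on disk.
# Assumption: each command prints its result to stdout.
# Prints the artifact directory so it can go straight into the report.
collect_failure_artifacts() {
  local surface="$1" out_dir="${2:-/tmp/cmux-failure-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$out_dir" || return 1
  # Keep going when one command fails so the remaining evidence is still captured.
  cmux browser "$surface" console list > "$out_dir/console.txt" 2>&1 || true
  cmux browser "$surface" errors list > "$out_dir/errors.txt" 2>&1 || true
  cmux browser "$surface" screenshot --out "$out_dir/page.png" > /dev/null 2>&1 || true
  cmux browser "$surface" snapshot --interactive --compact > "$out_dir/snapshot.txt" 2>&1 || true
  cmux browser "$surface" get url > "$out_dir/url.txt" 2>&1 || true
  cmux browser "$surface" get title > "$out_dir/title.txt" 2>&1 || true
  echo "$out_dir"
}

# Example:
#   artifacts=$(collect_failure_artifacts surface:2)
```

Review the captured files before sharing them, since console output and screenshots can contain credentials.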
## Flaky triage
If an interaction fails, use this exact order:
1. Check existence: `cmux browser surface:<ID> get count "<selector>"`
2. Check visibility: `cmux browser surface:<ID> is visible "<selector>"`
3. Check enabled state: `cmux browser surface:<ID> is enabled "<selector>"`
4. Scroll into view: `cmux browser surface:<ID> scroll-into-view "<selector>"`
5. Retry the action once
6. If it still fails, stop retrying and run the failure protocol
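The triage order can be encoded so that the retry happens exactly once. The sketch below is hypothetical in several respects: `cmux_true` and `triage_and_retry` are invented names, `get count` is assumed to print an integer, and an `is` check is assumed to pass when cmux exits 0 and does not print `false`. It also stops at the first failing check, on the reasoning that retrying against a missing, hidden, or disabled element would not succeed.

```bash
# cmux_true ARGS...
# Assumption: a check passes when cmux exits 0 and does not print "false".
cmux_true() {
  local out
  out=$(cmux browser "$@") || return 1
  [ "$out" != "false" ]
}

# triage_and_retry SURFACE SELECTOR ACTION [ARGS...]
# Hypothetical helper: run the triage checks in order, then retry ACTION once.
# Assumption: `get count` prints an integer.
# A non-zero return means: stop retrying and run the failure protocol.
triage_and_retry() {
  if [ $# -lt 3 ]; then
    echo "usage: triage_and_retry SURFACE SELECTOR ACTION [ARGS...]" >&2
    return 2
  fi
  local surface="$1" selector="$2" count
  shift 2
  count=$(cmux browser "$surface" get count "$selector") || count=0
  case "$count" in
    ''|*[!0-9]*) count=0 ;;  # treat non-numeric output as "no match"
  esac
  if [ "$count" -eq 0 ]; then
    echo "triage: no element matches $selector" >&2
    return 1
  fi
  cmux_true "$surface" is visible "$selector" || {
    echo "triage: $selector is not visible" >&2
    return 1
  }
  cmux_true "$surface" is enabled "$selector" || {
    echo "triage: $selector is not enabled" >&2
    return 1
  }
  cmux browser "$surface" scroll-into-view "$selector" || return 1
  # Single retry only.
  cmux browser "$surface" "$@"
}

# Example:
#   triage_and_retry surface:2 "#save" click "#save"
```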
## Reporting format
Use this concise result format:
```md
## Browser Test Result
Status: PASS | FAIL | BLOCKED
Tested URL: <url>
Scenario: <what was tested>
Result: <one-sentence outcome>
Evidence:
- <assertion or command result>
- <assertion or command result>
Artifacts:
- Screenshot: <path or none>
- Snapshot: <brief summary or none>
- Console/errors: <brief summary>
Notes:
<caveats, blockers, or likely cause if supported by evidence>
```
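If reports are produced from a script, a small function can keep them in this format. The sketch below is illustrative only: `print_browser_report` and the `REPORT_*` environment variables are hypothetical names, and unset artifact fields default to `none`.

```bash
# print_browser_report STATUS URL SCENARIO RESULT [EVIDENCE...]
# Hypothetical helper that prints the result format shown above.
# Optional environment variables (also hypothetical): REPORT_SCREENSHOT,
# REPORT_SNAPSHOT, REPORT_LOGS, REPORT_NOTES. Unset values print as "none".
print_browser_report() {
  if [ $# -lt 4 ]; then
    echo "usage: print_browser_report STATUS URL SCENARIO RESULT [EVIDENCE...]" >&2
    return 2
  fi
  local status="$1" url="$2" scenario="$3" result="$4" item
  shift 4
  case "$status" in
    PASS|FAIL|BLOCKED) ;;
    *) echo "invalid status: $status" >&2; return 1 ;;
  esac
  printf '%s\n' "## Browser Test Result"
  printf 'Status: %s\n' "$status"
  printf 'Tested URL: %s\n' "$url"
  printf 'Scenario: %s\n' "$scenario"
  printf 'Result: %s\n' "$result"
  printf 'Evidence:\n'
  for item in "$@"; do
    printf '%s\n' "- $item"
  done
  printf 'Artifacts:\n'
  printf '%s\n' "- Screenshot: ${REPORT_SCREENSHOT:-none}"
  printf '%s\n' "- Snapshot: ${REPORT_SNAPSHOT:-none}"
  printf '%s\n' "- Console/errors: ${REPORT_LOGS:-none}"
  printf 'Notes:\n'
  printf '%s\n' "${REPORT_NOTES:-none}"
}

# Example:
#   print_browser_report PASS "https://example.com/login" "Login flow" \
#     "User reached the dashboard." "URL contains /dashboard"
```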
## Common mistakes
- Interacting before the page or element is ready
- Assuming navigation succeeded without verifying URL or text
- Using brittle selectors when stable ones exist
- Treating `eval` as a substitute for user behavior
- Ignoring `console` or `errors` output during failures
- Reusing saved state in tests that should start fresh
- Reporting success without at least one explicit assertion
## Gotcha: implicit Enter form submission
Some forms rely on the browser's default behavior, where pressing Enter in an input submits the form. In `cmux`, `fill`, `focus`, and `press Enter` may all appear to succeed without actually triggering submission.
Rules:
- Never assume Enter submitted the form just because `press Enter` returned `OK`.
- Always verify submission through URL change, visible success state, or expected content.
- If Enter does not submit and there are no console/browser errors, suspect an automation limitation.
- Retry once with `focus`, `focus-webview`, and `type "...\\n"`.
- If still unsuccessful, validate the feature through the expected destination state and report that implicit Enter submission could not be reliably proven in `cmux`.
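These rules can be sketched as a helper that treats only the destination state as proof of submission. Everything named here is hypothetical: `submit_with_enter` is an invented helper, the field is assumed to be filled already, `wait` is assumed to exit non-zero on timeout, and the retry assumes that `type` interprets a literal `\n` as a newline keystroke, as the `type "...\\n"` hint above suggests.

```bash
# submit_with_enter SURFACE SELECTOR URL_FRAGMENT [TIMEOUT_MS]
# Hypothetical helper: submit an already-filled field with Enter and prove it.
# Assumptions: `wait` exits non-zero on timeout; `type` treats a literal \n
# as a newline keystroke.
# Returns 0 when submission is proven, 3 when it could not be proven.
submit_with_enter() {
  local surface="$1" selector="$2" fragment="$3" timeout_ms="${4:-5000}"
  cmux browser "$surface" focus "$selector" || return 1
  cmux browser "$surface" press Enter || return 1
  # `press Enter` returning OK proves nothing; only the destination state does.
  if cmux browser "$surface" wait --url-contains "$fragment" --timeout-ms "$timeout_ms"; then
    echo "submitted via press Enter"
    return 0
  fi
  # Retry exactly once with focus, focus-webview, and a typed newline.
  cmux browser "$surface" focus "$selector" || return 1
  cmux browser "$surface" focus-webview || return 1
  cmux browser "$surface" type "$selector" '\n' || return 1
  if cmux browser "$surface" wait --url-contains "$fragment" --timeout-ms "$timeout_ms"; then
    echo "submitted via typed newline after retry"
    return 0
  fi
  echo "implicit Enter submission could not be proven in cmux" >&2
  return 3
}

# Example:
#   cmux browser surface:2 fill "#search" --text "cmux"
#   submit_with_enter surface:2 "#search" "/results"
```

A return code of 3 maps to the last rule: validate the feature through its destination state by another path and report the limitation.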