diff --git a/browser-automation/SKILL.md b/browser-automation/SKILL.md new file mode 100644 index 0000000..1984f00 --- /dev/null +++ b/browser-automation/SKILL.md @@ -0,0 +1,394 @@ +--- +name: browser-automation +description: Use cmux browser automation to validate real user-facing behavior in browser surfaces, including navigation, DOM interaction, inspection, storage, tabs, dialogs, frames, downloads, and browser logs. +--- + +# Browser Testing with cmux + +Use this skill for browser-driven validation through `cmux browser`. + +Goal: verify real user-facing behavior, not implementation details. + +## When to use + +Use this skill when the task involves any of these: + +- Opening or navigating a page +- Testing a UI flow end to end +- Filling forms or clicking controls +- Verifying visible text, URL, title, values, attributes, or counts +- Capturing snapshots or screenshots +- Checking console logs or browser errors +- Working with cookies, storage, or saved browser state +- Handling tabs, dialogs, iframes, or downloads + +Do not use this skill for unit tests, static analysis, or API-only checks unless browser behavior is part of the task. + +## Operating rules + +1. Start by identifying or opening the browser surface you will use. +2. Wait for a stable state before interacting. +3. Prefer stable selectors and structured getters over visual guesswork. +4. After each mutating action, verify the result with URL, text, visibility, or value checks. +5. Treat `console` and `errors` output as evidence, not optional noise. +6. On failure, collect artifacts before concluding root cause. +7. Use `eval` only when the browser commands cannot express the check directly. +8. Keep credentials and secrets out of logs, commands, screenshots, and saved state when possible. + +## Verified command surface + +The commands below are verified against the official cmux browser automation docs. + +### Targeting a browser surface + +```bash +cmux browser open https://example.com +cmux browser open-split https://example.com +cmux browser identify +cmux browser identify --surface surface:2 +cmux browser surface:2 url +cmux browser --surface surface:2 url +``` + +### Navigation and focus + +```bash +cmux browser surface:2 navigate https://example.com/docs --snapshot-after +cmux browser surface:2 back +cmux browser surface:2 forward +cmux browser surface:2 reload --snapshot-after +cmux browser surface:2 focus-webview +cmux browser surface:2 is-webview-focused +``` + +### Waiting + +```bash +cmux browser surface:2 wait --load-state complete --timeout-ms 15000 +cmux browser surface:2 wait --selector "#checkout" --timeout-ms 10000 +cmux browser surface:2 wait --text "Order confirmed" +cmux browser surface:2 wait --url-contains "/dashboard" +cmux browser surface:2 wait --function "window.__appReady === true" +``` + +Prefer this order of readiness checks: + +1. `--selector` when a specific element gates the next action +2. `--text` for user-visible confirmation +3. `--url-contains` for navigation assertions +4. `--function` only when the app exposes a reliable readiness flag +5. `--load-state complete` for initial page load or simple pages + +Avoid fixed sleeps unless there is no reliable signal. + +### DOM interaction + +```bash +cmux browser surface:2 click "button[type='submit']" --snapshot-after +cmux browser surface:2 dblclick ".item-row" +cmux browser surface:2 hover "#menu" +cmux browser surface:2 focus "#email" +cmux browser surface:2 check "#terms" +cmux browser surface:2 uncheck "#newsletter" +cmux browser surface:2 scroll-into-view "#pricing" +cmux browser surface:2 type "#search" "cmux" +cmux browser surface:2 fill "#email" --text "ops@example.com" +cmux browser surface:2 press Enter +cmux browser surface:2 keydown Shift +cmux browser surface:2 keyup Shift +cmux browser surface:2 select "#region" "us-east" +cmux browser surface:2 scroll --dy 800 --snapshot-after +cmux browser surface:2 scroll --selector "#log-view" --dx 0 --dy 400 +``` + +Prefer `fill` when setting a known final value. + +Prefer `type` when testing keystroke-driven behavior such as debouncing, masking, shortcuts, or suggestions. + +### Inspection and assertions + +```bash +cmux browser surface:2 snapshot --interactive --compact +cmux browser surface:2 snapshot --selector "main" --max-depth 5 +cmux browser surface:2 screenshot --out /tmp/cmux-page.png + +cmux browser surface:2 get title +cmux browser surface:2 get url +cmux browser surface:2 get text "h1" +cmux browser surface:2 get html "main" +cmux browser surface:2 get value "#email" +cmux browser surface:2 get attr "a.primary" --attr href +cmux browser surface:2 get count ".row" +cmux browser surface:2 get box "#checkout" +cmux browser surface:2 get styles "#total" --property color + +cmux browser surface:2 is visible "#checkout" +cmux browser surface:2 is enabled "button[type='submit']" +cmux browser surface:2 is checked "#terms" + +cmux browser surface:2 find role button --name "Continue" +cmux browser surface:2 find text "Order confirmed" +cmux browser surface:2 find label "Email" +cmux browser surface:2 find placeholder "Search" +cmux browser surface:2 find alt "Product image" +cmux browser surface:2 find title "Open settings" +cmux browser surface:2 find testid "save-btn" +cmux browser surface:2 find first ".row" +cmux browser surface:2 find last ".row" +cmux browser surface:2 find nth 2 ".row" + +cmux browser surface:2 highlight "#checkout" +``` + +Preferred selector order: + +1. Stable test IDs +2. Accessible role and name +3. Labels and placeholders +4. Semantic CSS selectors +5. Text selectors + +Avoid brittle selectors like deeply nested `nth-child` chains unless no better option exists. + +### JavaScript and injection + +```bash +cmux browser surface:2 eval "document.title" +cmux browser surface:2 eval --script "window.location.href" +cmux browser surface:2 addinitscript "window.__cmuxReady = true;" +cmux browser surface:2 addscript "document.querySelector('#name')?.focus()" +cmux browser surface:2 addstyle "#debug-banner { display: none !important; }" +``` + +Use `eval` sparingly. Do not bypass the UI path if the task is to validate user behavior. + +### State and session data + +```bash +cmux browser surface:2 cookies get +cmux browser surface:2 cookies get --name session_id +cmux browser surface:2 cookies set session_id abc123 --domain example.com --path / +cmux browser surface:2 cookies clear --name session_id +cmux browser surface:2 cookies clear --all + +cmux browser surface:2 storage local set theme dark +cmux browser surface:2 storage local get theme +cmux browser surface:2 storage local clear +cmux browser surface:2 storage session set flow onboarding +cmux browser surface:2 storage session get flow + +cmux browser surface:2 state save /tmp/cmux-browser-state.json +cmux browser surface:2 state load /tmp/cmux-browser-state.json +``` + +Use saved browser state for authenticated sessions, long setup flows, and repeatable bug repros. + +Do not use saved state when validating fresh-session behavior like login, onboarding, logout, or first-run UX. + +### Tabs, logs, dialogs, frames, downloads + +```bash +cmux browser surface:2 tab list +cmux browser surface:2 tab new https://example.com/pricing +cmux browser surface:2 tab switch 1 +cmux browser surface:2 tab switch surface:7 +cmux browser surface:2 tab close +cmux browser surface:2 tab close surface:7 + +cmux browser surface:2 console list +cmux browser surface:2 console clear +cmux browser surface:2 errors list +cmux browser surface:2 errors clear + +cmux browser surface:2 dialog accept +cmux browser surface:2 dialog accept "Confirmed by automation" +cmux browser surface:2 dialog dismiss + +cmux browser surface:2 frame "iframe[name='checkout']" +cmux browser surface:2 frame main + +cmux browser surface:2 download --path /tmp/report.csv --timeout-ms 30000 +``` + +## Execution playbooks + +### Default flow + +Use this decision sequence unless the task clearly needs a different one. + +1. Acquire a surface. + +```bash +cmux browser open +cmux browser identify +``` + +2. Establish readiness. + +```bash +cmux browser surface: wait --load-state complete --timeout-ms 15000 +cmux browser surface: snapshot --interactive --compact +``` + +3. Choose the next action type. + +- Navigation task: use `navigate`, `back`, `forward`, or `reload`, then wait again. +- Form task: use the form playbook below. +- Inspection task: use `get`, `is`, `find`, `snapshot`, or `screenshot`. +- State setup task: use `cookies`, `storage`, or `state` before continuing. +- Download, dialog, tab, or frame task: switch to the relevant specialized playbook. + +4. After every mutating action, verify with at least one explicit assertion. + +Preferred assertion order: + +1. `wait --url-contains` +2. `wait --text` +3. `is visible` +4. `get value` +5. `get count` +6. `errors list` when debugging or validating stability + +7. If the assertion fails, run the failure protocol before making claims. + +### Form playbook + +Use this for login, signup, checkout, search, and settings forms. + +1. Wait for the first required field. +2. Populate fields with `fill` unless the task is specifically about typing behavior. +3. Submit with `click`, `press Enter`, or the control the user would actually use. +4. Verify success using URL, text, and visibility checks. + +```bash +cmux browser surface: wait --selector "#email" --timeout-ms 10000 +cmux browser surface: fill "#email" --text "$TEST_EMAIL" +cmux browser surface: fill "#password" --text "$TEST_PASSWORD" +cmux browser surface: click "button[type='submit']" --snapshot-after +cmux browser surface: wait --url-contains "/dashboard" --timeout-ms 10000 +cmux browser surface: is visible "#dashboard" +``` + +If submission should fail, verify the expected error text or blocked state instead of forcing success assertions. + +### Navigation playbook + +Use this when the task is about routing, links, history, or page transitions. + +1. Trigger navigation. +2. Wait for URL or page content to settle. +3. Verify the destination. + +```bash +cmux browser surface: navigate https://example.com/docs --snapshot-after +cmux browser surface: wait --url-contains "/docs" --timeout-ms 10000 +cmux browser surface: get title +``` + +### Inspection playbook + +Use this when the task is observational rather than interactive. + +1. Prefer `find` to discover the right selector. +2. Use `get` or `is` for structured assertions. +3. Use `snapshot` or `screenshot` only when human review is useful. + +### State playbook + +Use this when the task depends on auth, persisted preferences, or reproducible setup. + +1. Decide whether the task should start fresh or with persisted state. +2. If fresh-session behavior matters, do not load saved state. +3. If setup reuse is justified, use `cookies`, `storage`, or `state load`. +4. After state changes, reload or navigate as needed and verify the expected state is visible. + +### Tabs, dialogs, frames, and downloads playbook + +- Tabs: `tab list`, `tab new`, `tab switch`, then verify with `get url` or visible text. +- Dialogs: trigger the dialog, then immediately `dialog accept` or `dialog dismiss`. +- Frames: enter with `frame `, complete the work, then return with `frame main`. +- Downloads: trigger the download, run `download --path ...`, then verify the file with shell tools if needed. + +## Failure protocol + +Run this whenever a flow fails or results are ambiguous: + +```bash +cmux browser surface: console list +cmux browser surface: errors list +cmux browser surface: screenshot --out /tmp/cmux-failure.png +cmux browser surface: snapshot --interactive --compact +cmux browser surface: get url +cmux browser surface: get title +``` + +Report all of these if available: + +- Failed action +- Expected behavior +- Actual behavior +- Current URL +- Current title +- Relevant visible text +- Console findings +- Browser error findings +- Artifact paths + +## Flaky triage + +If an interaction fails, use this exact order: + +1. Check existence: `cmux browser surface: get count ""` +2. Check visibility: `cmux browser surface: is visible ""` +3. Check enabled state: `cmux browser surface: is enabled ""` +4. Scroll into view: `cmux browser surface: scroll-into-view ""` +5. Retry the action once +6. If it still fails, stop retrying and run the failure protocol + +## Reporting format + +Use this concise result format: + +```md +## Browser Test Result + +Status: PASS | FAIL | BLOCKED +Tested URL: +Scenario: +Result: + +Evidence: + +- +- + +Artifacts: + +- Screenshot: +- Snapshot: +- Console/errors: + +Notes: + +``` + +## Common mistakes + +- Interacting before the page or element is ready +- Assuming navigation succeeded without verifying URL or text +- Using brittle selectors when stable ones exist +- Treating `eval` as a substitute for user behavior +- Ignoring `console` or `errors` output during failures +- Reusing saved state in tests that should start fresh +- Reporting success without at least one explicit assertion + +## Gotcha: implicit Enter form submission + +Some forms rely on the browser’s default behavior where pressing Enter in an input submits the form. In `cmux`, `fill`, `focus`, and `press Enter` may all appear to succeed without actually triggering submission. +Rules: + +- Never assume Enter submitted the form just because `press Enter` returned `OK`. +- Always verify submission through URL change, visible success state, or expected content. +- If Enter does not submit and there are no console/browser errors, suspect an automation limitation. +- Retry once with `focus`, `focus-webview`, and `type "...\\n"`. +- If still unsuccessful, validate the feature through the expected destination state and report that implicit Enter submission could not be reliably proven in `cmux`. diff --git a/dev-server/SKILL.md b/dev-server/SKILL.md new file mode 100644 index 0000000..e667df1 --- /dev/null +++ b/dev-server/SKILL.md @@ -0,0 +1,20 @@ +--- +name: dev-server +description: Instructions on how to run correctly development / preview servers or background processes in new panes in cmux. +--- + +## When to use + +This skill is active when the user asks directly or the agent wants to run a local dev/preview server or any other long running command that ideally needs to run in the background. +The idea is to provide a dedicated cmux pane / split where to execute this task, and skip entirely the background execution in the coding agent. + +## How to use + +1. Identify what is the task to run. Examples are `npm run dev`, `uv run manage.py runserver`, `python -m http.server` and so on. +2. Create a new split in cmux. By default we want to split right: `cmux new-split right`. +3. After the split is created, it will return some information like so: `OK surface:23 workspace:5`. +4. We want to rename the split to give it a meaningful name, for example "Django dev server" or "npm dev server": `cmux rename-tab --surface surface:23 "npm dev server"` +5. Now we need to send the actual command to the split: `cmux send --surface surface:23 "npm run dev"`. +6. Lastly, we need to send the key to trigger the command execution: `cmux send-key --surface surface:23 enter`. + +The external process now is up and running. It will be up to the user to terminate or close the split.