agent-browser

Name: agent-browser
Author: skillssh

Safe ⚙️ External commands 🌐 Network access 📁 Filesystem access

Automate web browsing with AI agents

Also available from: toolshell, inference-sh-8, inferencesh, inferen-sh, inference-sh-0, inference-sh-9, supercent-io, inference-shell, tul-sh, inf-sh, vercel-labs

AI agents need to interact with websites but lack browser capabilities. This skill provides headless browser automation via inference.sh, enabling Claude, Codex, and Claude Code to navigate pages, fill forms, take screenshots, and record sessions.

Supports: Claude, Codex, Claude Code (CC)

🥉 76 Bronze

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "agent-browser". Open https://example.com and identify the login form elements

Expected outcome:

Page loaded successfully. Found 3 interactive elements:
@e1 [input type='text'] placeholder='Username'
@e2 [input type='password'] placeholder='Password'
@e3 [button] 'Sign In'
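
Output in this shape is straightforward to post-process. The sketch below parses the ref lines from the expected outcome above into structured records. It assumes every ref line follows the `@eN [descriptor] label` pattern seen in this example; `ElementRef` and `parse_snapshot` are hypothetical names used for illustration and are not part of the skill.

```python
import re
from typing import NamedTuple


class ElementRef(NamedTuple):
    ref: str         # e.g. "@e1"
    descriptor: str  # text inside the square brackets
    label: str       # whatever follows the brackets (may be empty)


# Assumed line shape, taken from the expected outcome above:
#   @e1 [input type='text'] placeholder='Username'
_REF_LINE = re.compile(r"^(@e\d+)\s+\[([^\]]*)\]\s*(.*)$")


def parse_snapshot(text):
    """Return one ElementRef for each line that looks like a ref line."""
    refs = []
    for line in text.splitlines():
        match = _REF_LINE.match(line.strip())
        if match:
            refs.append(ElementRef(*match.groups()))
    return refs


sample = """Page loaded successfully. Found 3 interactive elements:
@e1 [input type='text'] placeholder='Username'
@e2 [input type='password'] placeholder='Password'
@e3 [button] 'Sign In'"""

elements = parse_snapshot(sample)
for element in elements:
    print(element.ref, "->", element.descriptor)
```

Lines that do not match the pattern, such as the summary line, are skipped rather than treated as errors.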

Using "agent-browser". Fill and submit the login form with test credentials

Expected outcome:

Form submitted. Page redirected to dashboard.
@e1 [h1] 'Welcome, Test User'
@e2 [nav] 'Dashboard | Settings | Logout'
Screenshot captured.

Using "agent-browser". Take a screenshot of the dashboard

Expected outcome:

Screenshot saved to dashboard-20240101.png
Page title: Dashboard | Size: 1280x720
Dashboard contains: navigation menu, user profile card, data tables, action buttons

Security Audit

Safe

v1 • 4/22/2026

All static findings are false positives. The skill uses the inference.sh CLI (infsh) to control a headless browser via documented command invocations. External command detections are hardcoded API calls to a legitimate service. Network detections are target URLs for browsing, not exfiltration. Filesystem detections are documentation navigation (../) and standard device paths. Password/crypto detections are documentation showing credential input handling, not cryptography.

Files scanned

2,313

Lines analyzed

findings

Total audits

Risk Factors

⚙️ External commands (4)

SKILL.md:21-22 references/authentication.md:24-26 references/commands.md:10-11 templates/authenticated-session.sh:40-43

🌐 Network access (4)

SKILL.md:9 SKILL.md:37 references/authentication.md:25 references/commands.md:25

📁 Filesystem access (2)

SKILL.md:195-200 references/authentication.md:5

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

100

Security

Spec Compliance

What You Can Build

Research and data extraction

AI agents browse websites to gather information, extract structured data from pages, and compile research reports without manual browsing.

Automated form submission

AI agents fill and submit web forms for tasks like booking appointments, registering accounts, or completing batch data entry.

Browser-based testing

QA engineers use AI agents to navigate websites, take screenshots, and record test sessions to verify UI functionality.

Try These Prompts

Basic page navigation

Use the agent-browser skill to open https://example.com and show me all the clickable elements on the page.

Form filling workflow

Open the contact form at https://example.com/contact. Fill in name with 'John Doe', email with 'john@example.com', and submit the form. Take a screenshot of the result.

Authenticated session with data extraction

Log in to https://app.example.com using the credentials from environment variables. Navigate to the dashboard, extract all table data, and save a screenshot of the final page.

Multi-page research session

Record a video while browsing example.com/products. Click through 5 products, fill out an inquiry form for the last product, and close the session to save the recording.

Best Practices

Always re-snapshot after navigation or DOM changes; element refs expire after page loads
Use environment variables for credentials; never hardcode passwords in scripts
Close sessions when finished; video recordings only become available once close is called
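
Two of these practices can be sketched in a few lines of Python. This is a minimal illustration, not the skill's own code: `load_credentials` and `run_session` are hypothetical helpers, the three callables passed to `run_session` stand in for however you open, drive, and close a session, and the variable names `APP_USERNAME` and `APP_PASSWORD` follow the example used elsewhere on this page.

```python
import os


def load_credentials(env=None):
    """Read credentials from the environment and fail loudly if one is missing."""
    env = os.environ if env is None else env
    missing = [key for key in ("APP_USERNAME", "APP_PASSWORD") if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["APP_USERNAME"], env["APP_PASSWORD"]


def run_session(open_session, workflow, close_session):
    """Run workflow(session) and always close the session afterwards.

    The three arguments are caller-supplied callables (assumptions for this
    sketch); close_session runs even when the workflow raises.
    """
    session = open_session()
    try:
        return workflow(session)
    finally:
        close_session(session)
```

Putting the close call in a `finally` block means a failed step still ends the session cleanly.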

Avoid

Do not cache element refs across different pages; always snapshot after navigation
Do not hardcode credentials; use environment variables like $APP_USERNAME and $APP_PASSWORD
Do not skip wait times after actions; allow pages to fully load before interacting
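
One way to avoid skipping waits without sprinkling fixed sleeps everywhere is to poll for a condition. The helper below is a generic sketch; `wait_until` is a hypothetical name, and the `condition` callable is whatever check you consider proof that the page has loaded (for example, that a fresh snapshot contains the element you expect).

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns a truthy value or timeout seconds pass.

    Returns the truthy value, or raises TimeoutError. clock and sleep are
    injectable so the helper can be exercised without real delays.
    """
    deadline = clock() + timeout
    while True:
        result = condition()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)
```

The condition is always checked at least once, so a page that is already ready incurs no delay.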

Frequently Asked Questions

What is inference.sh and do I need to install it?

inference.sh is the service whose CLI (infsh) runs the browser automation, and yes, it is required. Install instructions are at raw.githubusercontent.com/inference-sh/skills/main/cli-install.md

Why do element refs like @e1 stop working?

Element refs are invalidated after page navigation, DOM changes, or dynamic content loading. Always call the snapshot function after these events to get fresh refs.
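
A small guard on the calling side can catch stale refs before they reach the browser. The class below is an illustrative sketch only: `RefTracker` and its methods are hypothetical and do not correspond to anything in the skill; it simply remembers which refs came from the most recent snapshot.

```python
class RefTracker:
    """Remembers which element refs came from the most recent snapshot."""

    def __init__(self):
        self._current = set()

    def record_snapshot(self, refs):
        """Replace the known refs with those from a fresh snapshot."""
        self._current = set(refs)

    def invalidate(self):
        """Call after navigation or a DOM change: every known ref is now stale."""
        self._current.clear()

    def require(self, ref):
        """Return ref if it is still current, otherwise raise LookupError."""
        if ref not in self._current:
            raise LookupError(f"{ref} is stale or unknown; take a new snapshot")
        return ref


tracker = RefTracker()
tracker.record_snapshot(["@e1", "@e2", "@e3"])
tracker.require("@e1")   # fine: @e1 came from the latest snapshot
tracker.invalidate()     # e.g. after clicking a link that loads a new page
```

After `invalidate()`, any call to `require` raises until `record_snapshot` is called again with fresh refs.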

How do I handle login for protected sites?

Use the agent-browser skill to automate the login flow once, then reuse the session ID for subsequent authenticated requests. The authentication.md reference explains this pattern.

Can I record browser sessions as video?

Yes. Set record_video: true when calling the open function, then call close to retrieve the video file. Set show_cursor: true to show the cursor for clearer demos.
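
As a rough illustration, the input for the open function could be assembled like this. Only `record_video` and `show_cursor` come from this page; the `url` key and the `build_open_input` helper are assumptions made for the sketch, so check the skill's own reference for the exact input shape.

```python
import json


def build_open_input(url, record_video=False, show_cursor=False):
    """Assemble input for the open function.

    "url" is an assumed key name; record_video and show_cursor are the
    options named in the answer above.
    """
    payload = {"url": url}
    if record_video:
        payload["record_video"] = True
    if show_cursor:
        payload["show_cursor"] = True
    return payload


open_input = build_open_input("https://example.com",
                              record_video=True, show_cursor=True)
print(json.dumps(open_input))
```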

How do I upload files through the browser?

Use the upload action with a file_paths array. The ref must point to a file input element. Example: {"action": "upload", "ref": "@e5", "file_paths": ["/path/to/file.pdf"]}
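
Building the action in code makes it easy to catch a malformed ref or an empty file list early. The sketch below mirrors the fields in the example above; `build_upload_action` is a hypothetical helper, and the `@eN` ref check is an assumption based on the refs shown on this page.

```python
import re


def build_upload_action(ref, file_paths):
    """Return an upload action dict with the fields shown in the example above."""
    # Assumption: refs always look like "@e" followed by digits.
    if not re.fullmatch(r"@e\d+", ref):
        raise ValueError(f"unexpected ref format: {ref!r}")
    paths = [str(path) for path in file_paths]
    if not paths:
        raise ValueError("file_paths must contain at least one path")
    return {"action": "upload", "ref": ref, "file_paths": paths}


action = build_upload_action("@e5", ["/path/to/file.pdf"])
```

The helper cannot verify that the ref points to a file input element; that still has to be confirmed from a current snapshot.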

What happens if the browser session times out?

Sessions do not persist across server restarts. Always handle errors gracefully and restart the workflow if needed. Video recordings are lost if close is not called before timeout.
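
Restarting the workflow can be wrapped in a small retry loop. This is a generic sketch under stated assumptions: `run_with_restart` is a hypothetical helper, and the `attempt` callable is expected to open a fresh session, do the work, and close the session itself, since a lost session cannot be picked up again.

```python
def run_with_restart(attempt, max_attempts=3, retry_on=(Exception,)):
    """Call attempt(number) until it succeeds, starting from scratch each time.

    attempt receives the 1-based attempt number. If every attempt raises one
    of the retry_on exceptions, a RuntimeError chained to the last error is
    raised.
    """
    last_error = None
    for number in range(1, max_attempts + 1):
        try:
            return attempt(number)
        except retry_on as error:
            last_error = error
    raise RuntimeError(
        f"workflow failed after {max_attempts} attempts"
    ) from last_error
```

Narrowing `retry_on` to the specific errors you expect from a timeout keeps genuine bugs from being retried silently.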