agent-browser

Low Risk · ⚙️ External commands · 🌐 Network access · 📁 Filesystem access

Browser Automation for AI Agents

Also available from: inference-sh-9, inference-shell, inferen-sh, inf-sh, inference-sh-8, inferencesh, inference-sh-0, supercent-io, tul-sh, vercel-labs

Enable AI agents to automate web browsing tasks including form filling, data extraction, screenshot capture, and video recording through a simple command-line interface.

Supports: Claude, Codex, Code (CC)
⚠️ 68 Poor
1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "agent-browser". Open https://example.com and show elements

Expected outcome:

Session started with ID: abc123

Interactive elements found:
- @e1 [a] "Example Domain" href="/"
- @e2 [h1] "Example Domain"
- @e3 [p] "This domain is for use in illustrative examples..."
- @e4 [a] "More information..." href="https://www.iana.org/domains/example"
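
If the element list needs to be consumed by a script, it can be parsed into records. The following is a minimal Python sketch that assumes the output has exactly the line shape shown in the sample above (`- @eN [tag] "text" attr="value"`); the real output format may differ, and `Element` and `parse_snapshot` are hypothetical names used only for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumed line shape, copied from the sample output above:
#   - @e1 [a] "Example Domain" href="/"
LINE_RE = re.compile(r'^-\s+(@e\d+)\s+\[(\w+)\]\s+"(.*?)"(.*)$')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


@dataclass
class Element:
    ref: str                      # element reference, e.g. "@e1"
    tag: str                      # tag name, e.g. "a"
    text: str                     # visible text between the quotes
    attrs: dict = field(default_factory=dict)  # trailing key="value" pairs


def parse_snapshot(output: str) -> list[Element]:
    """Parse element lines; lines that do not match the assumed shape are skipped."""
    elements = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        ref, tag, text, rest = match.groups()
        elements.append(Element(ref, tag, text, dict(ATTR_RE.findall(rest))))
    return elements


sample = '''Interactive elements found:
- @e1 [a] "Example Domain" href="/"
- @e2 [h1] "Example Domain"
- @e3 [p] "This domain is for use in illustrative examples..."
- @e4 [a] "More information..." href="https://www.iana.org/domains/example"
'''

for element in parse_snapshot(sample):
    print(element.ref, element.tag, element.text, element.attrs)
```

Skipping unmatched lines keeps the parser tolerant of headers such as "Interactive elements found:" and of any extra status lines.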

Using "agent-browser". Take a screenshot of the current page

Expected outcome:

Screenshot saved to: /tmp/screenshot_20240115_143022.png

Page title: Example Domain
Viewport: 1280x720

Security Audit

Low Risk
v1 • 3/8/2026

This is a legitimate browser automation skill that uses inference.sh with Playwright. The static findings (external_commands, network, filesystem) are expected behavior for browser automation and represent documentation examples showing CLI usage, not actual security vulnerabilities. No malicious intent detected.

Files scanned: 10
Lines analyzed: 2,312
Findings: 7
Total audits: 1

High Risk Issues (1)

Heuristic Warning: Browser Automation Capabilities
The skill combines browser automation, network access, and credential handling. This is expected behavior for a browser automation tool and represents legitimate functionality.

Medium Risk Issues (1)

Shell Command Documentation
The skill documentation shows example shell commands using the infsh CLI. These are documentation examples, not actual code execution vulnerabilities.

Low Risk Issues (2)

Network Access for Browser Automation
The skill requires network access to navigate websites. This is expected behavior for browser automation.

Filesystem Access for Screenshots and Videos
The skill can save screenshots and recordings to the filesystem. This is expected functionality for a browser automation tool.

Risk Factors

⚙️ External commands (1)
🌐 Network access (1)
📁 Filesystem access (1)
Audited by: claude

Quality Score

Architecture: 45
Maintainability: 100
Content: 87
Community: 34
Security: 71
Spec Compliance: 91

What You Can Build

Automated Web Testing

AI agents can navigate to web applications, fill test forms, verify UI elements, and capture test results as screenshots or video.

Data Extraction and Research

Extract structured data from websites by navigating pages, identifying elements, and collecting information programmatically.

Form Automation Workflows

Automate repetitive form filling tasks like data entry, application submissions, and bulk operations across multiple pages.

Try These Prompts

Open Website and Get Elements

Use the browser automation skill to open https://example.com and show me all the interactive elements on the page with their references.

Fill Form and Submit

Navigate to the login page at [URL], fill in the email field with user@example.com, fill the password field with mypassword, then click the submit button. Take a screenshot after submission.

Extract Data from Table

Open the page at [URL], identify all table rows in the data table, and extract the text content from each row. Return the data as a structured list.

Record Workflow Video

Start a new browser session with video recording enabled. Navigate through these steps: [list steps], then close the session and provide the video file path.

Best Practices

  • Use element references (@e1, @e2) for reliable element targeting instead of CSS selectors
  • Take snapshots after each navigation or significant page change to get fresh element references
  • Enable video recording for debugging complex automation workflows
  • Use proxy settings when testing geo-restricted content or when you need anonymity

Avoid

  • Do not rely on element positions or coordinates - use @e refs instead for stable targeting
  • Avoid long fixed delays; use explicit waits for element visibility instead
  • Do not skip re-snapshotting after page navigation - element refs become stale
  • Avoid uploading sensitive files without verifying the target website accepts uploads
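
The rule about stale element refs can be enforced in the calling code. The sketch below is a hypothetical guard, not part of the skill: it assumes the caller reports every navigation and every snapshot to a small tracker, which then refuses refs that predate the latest navigation. `RefTracker` and `StaleRefError` are illustrative names.

```python
class StaleRefError(RuntimeError):
    """Raised when a ref from an outdated snapshot is used."""


class RefTracker:
    """Tracks whether the latest snapshot is newer than the latest navigation."""

    def __init__(self) -> None:
        self._generation = 0            # bumped on every navigation
        self._snapshot_generation = -1  # generation when the last snapshot was taken
        self._refs: set[str] = set()

    def navigated(self) -> None:
        """Call after any navigation or significant page change."""
        self._generation += 1

    def snapshot_taken(self, refs) -> None:
        """Call with the refs returned by a fresh snapshot, e.g. ["@e1", "@e2"]."""
        self._snapshot_generation = self._generation
        self._refs = set(refs)

    def check(self, ref: str) -> str:
        """Return the ref if it is safe to use, otherwise raise."""
        if self._snapshot_generation != self._generation:
            raise StaleRefError(f"{ref} predates the latest navigation; take a new snapshot")
        if ref not in self._refs:
            raise KeyError(f"{ref} was not in the latest snapshot")
        return ref


tracker = RefTracker()
tracker.snapshot_taken(["@e1", "@e2"])
print(tracker.check("@e1"))       # allowed: snapshot is current

tracker.navigated()               # page changed, refs are now stale
try:
    tracker.check("@e1")
except StaleRefError as error:
    print("refused:", error)
```

Failing loudly on a stale ref is usually easier to debug than an interaction that silently targets the wrong element.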

Frequently Asked Questions

What is inference.sh and do I need an account?
Inference.sh is the underlying service that provides browser automation capabilities. You need to install the infsh CLI and configure it with your account credentials to use this skill.
Can this skill bypass login forms or CAPTCHAs?
No, this skill cannot bypass authentication systems or CAPTCHAs. It can only interact with web pages programmatically after you provide credentials or when authentication is already handled.
How do element references (@e1, @e2) work?
Element references are assigned by the snapshot function. Each time you call snapshot, you get a fresh list of interactive elements with their @e prefixes. Use these references in subsequent interact commands.
Can I run multiple browser sessions simultaneously?
Yes, each session has a unique session ID. You can manage multiple sessions in parallel by using different session identifiers.
What browsers are supported?
The skill uses Playwright under the hood, supporting Chromium, Firefox, and WebKit. The default is Chromium for maximum compatibility.
How do I handle dynamic content that loads slowly?
Use the 'wait' action with a duration in milliseconds, or the 'wait_for' option in the interact function. You can also execute JavaScript that waits for a specific condition.
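
The difference between a fixed delay and an explicit wait can be illustrated with a generic polling helper. This Python sketch is not the skill's 'wait_for' implementation; `wait_until` is a hypothetical helper, and `condition` stands in for whatever check the caller performs (for example, taking a new snapshot and looking for an expected element).

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.25,
) -> bool:
    """Poll condition() until it returns True or the timeout expires.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated slow-loading content: the check succeeds on the third poll.
calls = {"n": 0}


def ready_after_three_polls() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(ready_after_three_polls, timeout_s=2.0, interval_s=0.01))
```

Unlike a fixed delay, the helper returns as soon as the condition holds, so fast pages are not slowed down and slow pages still get the full timeout.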