agent-browser

Name: agent-browser
Author: toolshell

Low Risk: ⚙️ External commands, 🌐 Network access, 📁 Filesystem access

Browser Automation for AI Agents

Also available from: inference-sh-9, inference-shell, inferen-sh, inf-sh, inference-sh-8, inferencesh, inference-sh-0, supercent-io, tul-sh, vercel-labs

Enable AI agents to automate web browsing tasks including form filling, data extraction, screenshot capture, and video recording through a simple command-line interface.

Supports: Claude, Codex, Code (CC)

⚠️ 68 Poor

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "agent-browser". Open https://example.com and show elements

Expected outcome:

Session started with ID: abc123

Interactive elements found:
- @e1 [a] "Example Domain" href="/"
- @e2 [h1] "Example Domain"
- @e3 [p] "This domain is for use in illustrative examples..."
- @e4 [a] "More information..." href="https://www.iana.org/domains/example"

Using "agent-browser". Take a screenshot of the current page

Expected outcome:

Screenshot saved to: /tmp/screenshot_20240115_143022.png

Page title: Example Domain
Viewport: 1280x720

Security Audit

Low Risk

v1 • 3/8/2026

This is a legitimate browser automation skill that uses inference.sh with Playwright. The static findings (external_commands, network, filesystem) are expected behavior for browser automation and represent documentation examples showing CLI usage, not actual security vulnerabilities. No malicious intent detected.

Files scanned

2,312

Lines analyzed

findings

Total audits

High Risk Issues (1)

SKILL.md:1-10, templates/authenticated-session.sh:1-50

Heuristic Warning: Browser Automation Capabilities

The skill combines browser automation, network access, and credential handling. This is expected behavior for a browser automation tool and represents legitimate functionality.

Medium Risk Issues (1)

SKILL.md:34-60, references/commands.md:1-50

Shell Command Documentation

The skill documentation shows example shell commands that use the infsh CLI. These are documentation examples, not code execution vulnerabilities.

Low Risk Issues (2)

SKILL.md:9-11

Network Access for Browser Automation

The skill requires network access to navigate websites. This is expected behavior for browser automation.

SKILL.md:67-69

Filesystem Access for Screenshots and Videos

The skill can save screenshots and recordings to the filesystem. This is expected functionality for a browser automation tool.

Risk Factors

⚙️ External commands (1)

SKILL.md:34-60

🌐 Network access (1)

SKILL.md:9-11

📁 Filesystem access (1)

SKILL.md:67-69

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

Security

Spec Compliance

What You Can Build

Automated Web Testing

AI agents can navigate to web applications, fill test forms, verify UI elements, and capture test results as screenshots or video.

Data Extraction and Research

Extract structured data from websites by navigating pages, identifying elements, and collecting information programmatically.

Form Automation Workflows

Automate repetitive form filling tasks like data entry, application submissions, and bulk operations across multiple pages.

Try These Prompts

Open Website and Get Elements

Use the browser automation skill to open https://example.com and show me all the interactive elements on the page with their references.

Fill Form and Submit

Navigate to the login page at [URL], fill in the email field with user@example.com, fill the password field with mypassword, then click the submit button. Take a screenshot after submission.

Extract Data from Table

Open the page at [URL], identify all table rows in the data table, and extract the text content from each row. Return the data as a structured list.

Record Workflow Video

Start a new browser session with video recording enabled. Navigate through these steps: [list steps], then close the session and provide the video file path.

Best Practices

Use element references (@e1, @e2) for reliable element targeting instead of CSS selectors
Take snapshots after each navigation or significant page change to get fresh element references
Enable video recording for debugging complex automation workflows
Use proxy settings when testing geo-restricted content or when you need anonymity

Avoid

Do not rely on element positions or coordinates; use @e refs instead for stable targeting
Avoid fixed delays; wait explicitly for element visibility instead
Do not skip re-snapshotting after page navigation, because element refs become stale
Avoid uploading sensitive files without verifying the target website accepts uploads
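
The re-snapshot rule above can be enforced on the caller's side. The following is a minimal sketch, not part of the skill: RefTracker and StaleRefError are hypothetical names, and the code does not call the infsh CLI. It only remembers which refs came from the latest snapshot and rejects any ref used after a navigation has invalidated them.

```python
class StaleRefError(Exception):
    """Raised when a ref is used that is not in the latest snapshot."""


class RefTracker:
    """Hypothetical helper: tracks which @e refs are currently valid."""

    def __init__(self):
        self._refs = set()

    def record_snapshot(self, refs):
        # Replace the known refs with those from the newest snapshot.
        self._refs = set(refs)

    def invalidate(self):
        # Call this after any navigation or significant page change.
        self._refs = set()

    def check(self, ref):
        # Return the ref unchanged if it is still valid, else fail loudly.
        if ref not in self._refs:
            raise StaleRefError(f"{ref} is stale; take a new snapshot first")
        return ref


tracker = RefTracker()
tracker.record_snapshot(["@e1", "@e2"])
tracker.check("@e1")      # valid: came from the latest snapshot
tracker.invalidate()      # the page navigated
try:
    tracker.check("@e1")  # the same ref is now rejected
except StaleRefError as exc:
    print(exc)
```

Failing early in the calling code makes a stale ref visible as a clear error rather than as a click on the wrong element.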

Frequently Asked Questions

What is inference.sh and do I need an account?

Inference.sh is the underlying service that provides browser automation capabilities. You need to install the infsh CLI and configure it with your account credentials to use this skill.

Can this skill bypass login forms or CAPTCHAs?

No, this skill cannot bypass authentication systems or CAPTCHAs. It can only interact with web pages programmatically after you provide credentials or when authentication is already handled.

How do element references (@e1, @e2) work?

Element references are assigned by the snapshot function. Each time you call snapshot, you get a fresh list of interactive elements, each labeled with an @e reference. Use these references in subsequent interact commands.
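
To work with a snapshot programmatically, the element list can be parsed into records keyed by reference. This is a minimal sketch that assumes the line shape shown in the sample output on this page (a ref, a tag in brackets, quoted text, then optional key="value" attributes). The real snapshot format may differ, and parse_snapshot and Element are hypothetical names.

```python
import re
from dataclasses import dataclass, field

# Assumed line shape, taken from the sample output on this page:
#   - @e1 [a] "Example Domain" href="/"
LINE_RE = re.compile(r'^\s*-?\s*(@e\d+)\s+\[(\w+)\]\s+"(.*?)"(?:\s+(.*))?$')
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


@dataclass
class Element:
    ref: str
    tag: str
    text: str
    attrs: dict = field(default_factory=dict)


def parse_snapshot(output: str) -> dict:
    """Map each @e ref to an Element; lines that do not match are skipped."""
    elements = {}
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        ref, tag, text, rest = match.groups()
        attrs = dict(ATTR_RE.findall(rest or ""))
        elements[ref] = Element(ref, tag, text, attrs)
    return elements


sample = '''Interactive elements found:
- @e1 [a] "Example Domain" href="/"
- @e2 [h1] "Example Domain"
- @e4 [a] "More information..." href="https://www.iana.org/domains/example"
'''
elements = parse_snapshot(sample)
print(elements["@e4"].attrs["href"])
```

Keying the result by ref means a later step can look up the tag or href for a ref before passing it to an interact command.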

Can I run multiple browser sessions simultaneously?

Yes, each session has a unique session ID. You can manage multiple sessions in parallel by using different session identifiers.

What browsers are supported?

The skill uses Playwright under the hood, supporting Chromium, Firefox, and WebKit. The default is Chromium for maximum compatibility.

How do I handle dynamic content that loads slowly?

Use the 'wait' action with a duration in milliseconds, or use the 'wait_for' option in the interact function. You can also execute JavaScript to wait for specific conditions.
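
The difference between a fixed delay and an explicit wait can be sketched in plain Python. The example below does not call the skill's 'wait' or 'wait_for' options. wait_until is a hypothetical helper, and element_visible is a stand-in for whatever check your workflow performs, such as taking a snapshot and looking for an element. It polls the condition until it holds or a timeout expires, so it returns as soon as the page is ready.

```python
import time


def wait_until(condition, timeout_s=5.0, interval_s=0.1):
    """Poll condition() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated check: the "element" becomes visible on the third poll.
checks = {"count": 0}


def element_visible():
    checks["count"] += 1
    return checks["count"] >= 3


ready = wait_until(element_visible, timeout_s=2.0, interval_s=0.01)
print(ready, checks["count"])
```

A fixed delay always costs its full duration and can still be too short. Polling with a timeout bounds the worst case and finishes early in the common case.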