agent-browser

Name: agent-browser
Author: inferen-sh

Low Risk · 🌐 Network access · 📁 Filesystem access · ⚙️ External commands

Automate web browsers for AI agents

Also available from: inference-sh-9, inference-shell, inf-sh, toolshell, inference-sh-8, inferencesh, skillssh, inference-sh-0, supercent-io, tul-sh, vercel-labs

Enable AI assistants to interact with websites, fill forms, extract data, and perform web automation tasks programmatically. Control headless browsers through simple CLI commands with session management and video recording.

Supports: Claude, Codex, Code (CC)

📊 69 Adequate

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "agent-browser". Open https://example.com and show available elements

Expected outcome:

Session started: sess_abc123
Page loaded: https://example.com
Interactive elements found:
@e1 [h1] "Example Domain"
@e2 [p] "This domain is for use in documentation"
@e3 [a] "Learn more" href="https://iana.org/domains/example"
Screenshot captured successfully
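
The element list in this sample output can be turned into structured data before deciding what to click or fill. The following is a minimal Python sketch that assumes the line format shown in the sample (`@eN [tag] "text"` followed by optional `name="value"` attributes); the real output of the skill may differ, and `ElementRef` and `parse_snapshot` are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass, field

# Line format inferred from the sample output only (an assumption), e.g.:
#   @e3 [a] "Learn more" href="https://iana.org/domains/example"
_ELEMENT_RE = re.compile(r'^(@e\d+) \[([\w-]+)\] "([^"]*)"(.*)$')
_ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


@dataclass
class ElementRef:
    """One interactive element from a snapshot (hypothetical helper type)."""
    ref: str
    tag: str
    text: str
    attrs: dict = field(default_factory=dict)


def parse_snapshot(output: str) -> list:
    """Return an ElementRef for every line that looks like an element entry."""
    elements = []
    for line in output.splitlines():
        match = _ELEMENT_RE.match(line.strip())
        if match is None:
            continue  # status lines such as "Page loaded: ..." are skipped
        ref, tag, text, rest = match.groups()
        elements.append(ElementRef(ref, tag, text, dict(_ATTR_RE.findall(rest))))
    return elements


SAMPLE = '''Session started: sess_abc123
Page loaded: https://example.com
Interactive elements found:
@e1 [h1] "Example Domain"
@e2 [p] "This domain is for use in documentation"
@e3 [a] "Learn more" href="https://iana.org/domains/example"
Screenshot captured successfully'''

for element in parse_snapshot(SAMPLE):
    print(element.ref, element.tag, element.attrs)
```

Lines that do not match the pattern are ignored rather than treated as errors, so status messages can be mixed in with element entries.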

Using "agent-browser". Fill form and submit

Expected outcome:

Filled @e1 with: John Doe
Filled @e2 with: john@example.com
Clicked @e3 (Submit button)
Navigation detected - re-snapshot recommended
Form submission confirmed - thank you page displayed

Security Audit

Low Risk

v1 • 3/20/2026

Static analyzer flagged 606 patterns in documentation files (markdown and shell templates). All findings are false positives: shell command patterns appear in documentation examples showing how to use the inference.sh CLI, not in executable skill code. Network URLs point to legitimate services (inference.sh, example domains). Path references are documentation placeholders. The skill legitimately uses bash commands to invoke the infsh CLI for browser automation via the allowed-tools mechanism.

Files scanned

2,313

Lines analyzed

findings

Total audits

Low Risk Issues (3)

SKILL.md:17-58 references/authentication.md:20-297 references/commands.md:9-271

External Command Execution in Documentation

Shell command patterns (backticks, command substitution) detected in markdown documentation files. These are instructional examples showing users how to invoke the infsh CLI, not executable code within the skill itself. Pattern is benign when contained in documentation.

SKILL.md:11-15 references/proxy-support.md:23-259

Hardcoded URLs in Documentation

Multiple URLs detected in documentation files including inference.sh service endpoints and example.com domains. These are legitimate service URLs and documentation placeholders, not malicious endpoints.

SKILL.md:164 templates/authenticated-session.sh:30

Path References in Examples

Path traversal-like sequences detected in documentation and templates. These are placeholder paths (/path/to/file) showing users where to substitute their own values, not actual path traversal vulnerabilities.

Risk Factors

🌐 Network access (5)

SKILL.md:11 SKILL.md:15 SKILL.md:21 references/authentication.md:25 references/authentication.md:71

📁 Filesystem access (3)

SKILL.md:164 templates/authenticated-session.sh:30 templates/capture-workflow.sh:28

⚙️ External commands (5)

SKILL.md:9 SKILL.md:15 SKILL.md:17-58 templates/authenticated-session.sh:40-43 templates/capture-workflow.sh:28-31

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

Security

Spec Compliance

What You Can Build

Automated Form Submission

Fill and submit web forms for data entry, registration, or contact workflows. Supports validation feedback and error handling.

Web Data Extraction

Extract structured data from websites for research, price monitoring, or content aggregation. Captures screenshots alongside extracted content.

End-to-End Testing

Test web application workflows by simulating user interactions. Record sessions for debugging and documentation purposes.

Try These Prompts

Basic Navigation and Screenshot

Open the URL https://example.com and take a full-page screenshot. Show me what elements are available for interaction.

Form Automation

Navigate to the contact form at https://example.com/contact. Fill in the name field with 'John Doe', email with 'john@example.com', message with 'Hello', then submit the form. Confirm submission succeeded.

Data Extraction with Session

Open https://news.example.com and extract all article headlines from the homepage. Take a screenshot of the page and save the extracted text to a file. Keep the session open for follow-up queries.

Authenticated Workflow with Video

Start a recorded browser session. Navigate to the login page, fill credentials from environment variables, submit and verify successful authentication. Navigate to the dashboard, extract the welcome message and user data, then close and return the video file.

Best Practices

Always re-capture element references after navigation, since refs become invalid when the page changes
Use environment variables for credentials and sensitive data; never hardcode them in templates
Enable video recording only for debugging to reduce resource overhead
Close sessions explicitly or use cleanup traps to prevent resource leaks
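
Two of these practices, reading credentials from the environment and guaranteeing session cleanup, can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's own API: `open_session` and `close_session` are hypothetical callables standing in for whatever actually starts and closes a browser session, and `require_env` is a hypothetical helper.

```python
import os
from contextlib import contextmanager


def require_env(name: str) -> str:
    """Read a credential from the environment and fail early if it is absent."""
    value = os.environ.get(name)
    if not value:
        # The message names the variable only; the value is never echoed.
        raise RuntimeError(f"environment variable {name} is not set")
    return value


@contextmanager
def browser_session(open_session, close_session):
    """Yield a session ID and always close it, even if the workflow raises.

    open_session and close_session are hypothetical callables supplied by
    the caller; they are placeholders, not part of any real CLI or library.
    """
    session_id = open_session()
    try:
        yield session_id
    finally:
        close_session(session_id)
```

A workflow would then run inside `with browser_session(start, stop) as session_id:`, which plays the same role as a cleanup trap in a shell template.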

Avoid

Hardcoding usernames, passwords, or API keys directly in script files
Assuming element refs persist across page loads without re-snapshot
Logging or echoing sensitive values like passwords or session tokens
Leaving browser sessions open after workflow completion

Frequently Asked Questions

What is inference.sh and do I need an account?

inference.sh is a browser automation service that provides the backend for this skill. Yes, you need to create an account and install the infsh CLI. Run 'infsh login' to authenticate before using browser commands.

How do element references (@e1, @e2) work?

The browser assigns @e1, @e2, etc. to interactive elements on each page snapshot. References change when the page reloads or the DOM updates, so always call snapshot after navigation to get fresh refs.
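
One way to avoid acting on outdated refs is a small client-side guard that forgets every ref once navigation happens. The Python sketch below is purely illustrative: `RefTracker` and `StaleRefError` are hypothetical names, and the skill itself is not known to ship anything like them.

```python
class StaleRefError(RuntimeError):
    """Raised when a ref is used that is not part of the latest snapshot."""


class RefTracker:
    """Remember which refs are valid for the current page (hypothetical helper)."""

    def __init__(self):
        self._valid = set()

    def record_snapshot(self, refs):
        # A new snapshot replaces all previously known refs.
        self._valid = set(refs)

    def mark_navigated(self):
        # After navigation or a DOM update, no old ref can be trusted.
        self._valid.clear()

    def check(self, ref):
        if ref not in self._valid:
            raise StaleRefError(f"{ref} is stale; take a new snapshot first")
        return ref


tracker = RefTracker()
tracker.record_snapshot(["@e1", "@e2", "@e3"])
tracker.check("@e3")      # accepted: part of the latest snapshot
tracker.mark_navigated()  # e.g. after clicking a submit button
```

Calling `tracker.check("@e3")` after `mark_navigated()` raises `StaleRefError` until a new snapshot is recorded.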

Can this skill bypass CAPTCHA or login security?

No. CAPTCHA requires human verification. For 2FA, you can either generate TOTP codes programmatically if you have the secret, or pause for manual intervention when SMS/hardware token codes are needed.
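
For the TOTP case, codes can be generated with the Python standard library alone. The sketch below follows RFC 6238 with the common defaults (HMAC-SHA1, 30-second period, 6 digits); whether a given site uses these defaults is an assumption to verify, and the function name `totp` is chosen here only for illustration. The secret should come from an environment variable rather than the script.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, for_time=None, digits=6, period=30):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) from a base32 secret."""
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)  # restore padding if it was stripped
    key = base64.b32decode(cleaned)
    now = time.time() if for_time is None else for_time
    counter = int(now // period)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 test secret (ASCII "12345678901234567890") encoded as base32.
RFC_SECRET = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"
print(totp(RFC_SECRET, for_time=59, digits=8))  # 94287082 per RFC 6238
```

Codes from SMS or hardware tokens cannot be derived this way, so those flows still need a pause for manual entry.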

Where are video recordings stored?

Videos are returned as file objects when you close a session with record_video enabled. The file is provided in the API response and can be saved or processed by subsequent skill commands.

Can I run multiple browser sessions in parallel?

Yes, each session has an independent session_id. Start multiple sessions with --session new and track their IDs separately. Remember to close each session to free resources.
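
A rough Python sketch of this pattern is shown below: each worker opens its own session, does its work, and closes the session even on failure. The callables `open_session`, `visit`, and `close_session` are hypothetical stand-ins for the real session commands, and `run_in_parallel` is a name introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(urls, open_session, visit, close_session, max_workers=4):
    """Process each URL in its own session and return results keyed by URL.

    open_session, visit, and close_session are caller-supplied placeholders
    (assumptions), not functions from a real library.
    """
    def worker(url):
        session_id = open_session()  # every worker tracks its own ID
        try:
            return visit(session_id, url)
        finally:
            close_session(session_id)  # free resources even if visit fails

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(worker, urls)))
```

Keeping the open and close calls inside the same worker function makes it hard to leak a session when one of several parallel tasks fails.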

Does this work with headless browsers?

Yes, the underlying Playwright engine runs headless by default. This means no visible browser window appears during automation, making it suitable for server environments and CI/CD pipelines.