agent-browser

Low Risk · 🌐 Network access · 📁 Filesystem access · ⚙️ External commands

Automate web browsers for AI agents

Also available from: inference-sh-9, inference-shell, inf-sh, toolshell, inference-sh-8, inferencesh, skillssh, inference-sh-0, supercent-io, tul-sh, vercel-labs

Enable AI assistants to interact with websites, fill forms, extract data, and perform web automation tasks programmatically. Control headless browsers through simple CLI commands with session management and video recording.

Supports: Claude, Codex, Claude Code (CC)
📊 69 Adequate

1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "agent-browser". Open https://example.com and show available elements

Expected outcome:

  • Session started: sess_abc123
  • Page loaded: https://example.com
  • Interactive elements found:
      • @e1 [h1] "Example Domain"
      • @e2 [p] "This domain is for use in documentation"
      • @e3 [a] "Learn more" href="https://iana.org/domains/example"
  • Screenshot captured successfully

Using "agent-browser". Fill form and submit

Expected outcome:

  • Filled @e1 with: John Doe
  • Filled @e2 with: john@example.com
  • Clicked @e3 (Submit button)
  • Navigation detected - re-snapshot recommended
  • Form submission confirmed - thank you page displayed

Security Audit

Low Risk
v1 • 3/20/2026

The static analyzer flagged 606 patterns in documentation files (Markdown and shell templates). All findings are false positives: the shell command patterns appear in documentation examples that show how to use the inference.sh CLI, not in executable skill code. Network URLs point to legitimate services (inference.sh, example domains). Path references are documentation placeholders. The skill legitimately uses bash commands to invoke the infsh CLI for browser automation via the allowed-tools mechanism.

  • Files scanned: 10
  • Lines analyzed: 2,313
  • Findings: 6
  • Total audits: 1

Low Risk Issues (3)

External Command Execution in Documentation

Shell command patterns (backticks, command substitution) were detected in Markdown documentation files. These are instructional examples showing users how to invoke the infsh CLI, not executable code within the skill itself. The pattern is benign when confined to documentation.

Hardcoded URLs in Documentation

Multiple URLs were detected in documentation files, including inference.sh service endpoints and example.com domains. These are legitimate service URLs and documentation placeholders, not malicious endpoints.

Path References in Examples

Path traversal-like sequences were detected in documentation and templates. These are placeholder paths (/path/to/file) showing users where to substitute their own values, not actual path traversal vulnerabilities.

Audited by: claude

Quality Score

  • Architecture: 45
  • Maintainability: 100
  • Content: 87
  • Community: 22
  • Security: 84
  • Spec Compliance: 91

What You Can Build

Automated Form Submission

Fill and submit web forms for data entry, registration, or contact workflows. Supports validation feedback and error handling.

Web Data Extraction

Extract structured data from websites for research, price monitoring, or content aggregation. Captures screenshots alongside extracted content.

End-to-End Testing

Test web application workflows by simulating user interactions. Record sessions for debugging and documentation purposes.

Try These Prompts

Basic Navigation and Screenshot

Open the URL https://example.com and take a full-page screenshot. Show me what elements are available for interaction.

Form Automation

Navigate to the contact form at https://example.com/contact. Fill in the name field with 'John Doe', email with 'john@example.com', message with 'Hello', then submit the form. Confirm submission succeeded.

Data Extraction with Session

Open https://news.example.com and extract all article headlines from the homepage. Take a screenshot of the page and save the extracted text to a file. Keep the session open for follow-up queries.

Authenticated Workflow with Video

Start a recorded browser session. Navigate to the login page, fill in credentials from environment variables, submit, and verify successful authentication. Navigate to the dashboard, extract the welcome message and user data, then close the session and return the video file.

Best Practices

  • Re-capture element references after every navigation, since refs are invalidated when the page changes
  • Use environment variables for credentials and sensitive data; never hardcode them in templates
  • Enable video recording only for debugging to reduce resource overhead
  • Close sessions explicitly or use cleanup traps to prevent resource leaks
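
The credential and cleanup practices above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual API: `start_session` and `close_session` are hypothetical callables standing in for whatever starts and closes a browser session in your setup, and `SITE_USERNAME` / `SITE_PASSWORD` are assumed environment variable names.

```python
import os
from contextlib import contextmanager


@contextmanager
def browser_session(start_session, close_session):
    """Yield a session ID and always close the session afterwards.

    start_session and close_session are hypothetical callables.
    """
    session_id = start_session()
    try:
        yield session_id
    finally:
        # Runs on success, on exceptions, and on early exits from the block.
        close_session(session_id)


def load_credentials(env=os.environ):
    """Read credentials from the environment instead of hardcoding them."""
    try:
        # Assumed variable names; use whatever your deployment defines.
        return env["SITE_USERNAME"], env["SITE_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


if __name__ == "__main__":
    closed = []
    # Stub callables so the sketch runs without a real browser backend.
    with browser_session(lambda: "sess_demo", closed.append) as sid:
        print("working in", sid)
    print("closed:", closed)
```

Wrapping the session in a context manager plays the same role as a shell cleanup trap: the close call runs even when a step in the middle of the workflow raises.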

Avoid

  • Hardcoding usernames, passwords, or API keys directly in script files
  • Assuming element refs persist across page loads without re-snapshot
  • Logging or echoing sensitive values like passwords or session tokens
  • Leaving browser sessions open after workflow completion
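
To avoid leaking sensitive values into logs, one option is to mask known secrets before a message is printed. The helper below is a hypothetical sketch (the name `redact` and the `***` mask are assumptions, not part of the skill).

```python
def redact(message, secrets, mask="***"):
    """Replace every occurrence of each secret in message with a mask."""
    # Skip empty or None entries; mask longer secrets first so a secret
    # that contains a shorter one is hidden as a whole.
    for secret in sorted(filter(None, secrets), key=len, reverse=True):
        message = message.replace(secret, mask)
    return message


if __name__ == "__main__":
    print(redact("Filled @e2 with: hunter2", ["hunter2"]))
```

Passing every outgoing log line through such a helper is simpler to audit than remembering to omit secrets at each call site.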

Frequently Asked Questions

What is inference.sh and do I need an account?
inference.sh is a browser automation service that provides the backend for this skill. Yes, you need to create an account and install the infsh CLI. Run 'infsh login' to authenticate before using browser commands.
How do element references (@e1, @e2) work?
The browser assigns @e1, @e2, etc. to interactive elements on each page snapshot. References change when the page reloads or the DOM updates, so always take a new snapshot after navigation to get fresh refs.
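
As a rough illustration of that lifecycle, the Python sketch below parses snapshot lines shaped like the sample output on this page (`@e1 [h1] "Example Domain"`) and discards the refs on navigation. The line format is assumed from that sample and real output may differ; `parse_snapshot` and `RefTable` are hypothetical names.

```python
import re

# Matches lines such as: @e3 [a] "Learn more" href="https://iana.org/domains/example"
REF_LINE = re.compile(r'(@e\d+)\s+\[(\w+)\]\s+"([^"]*)"')


def parse_snapshot(text):
    """Map each element ref to its (tag, label) pair."""
    return {ref: (tag, label) for ref, tag, label in REF_LINE.findall(text)}


class RefTable:
    """Holds refs from the latest snapshot and drops them on navigation."""

    def __init__(self):
        self._refs = {}

    def load(self, snapshot_text):
        self._refs = parse_snapshot(snapshot_text)

    def invalidate(self):
        # Call after any navigation or DOM change: old refs are no longer valid.
        self._refs = {}

    def lookup(self, ref):
        if ref not in self._refs:
            raise KeyError(f"{ref} is unknown or stale; take a new snapshot first")
        return self._refs[ref]
```

Failing loudly on a stale ref is a deliberate choice here: it surfaces a missing re-snapshot immediately instead of letting a click land on the wrong element.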
Can this skill bypass CAPTCHA or login security?
No. CAPTCHA requires human verification. For 2FA, you can either generate TOTP codes programmatically if you have the secret, or pause for manual intervention when SMS/hardware token codes are needed.
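
For the TOTP case, codes can be generated with the Python standard library alone. The sketch below follows RFC 6238 with HMAC-SHA1, 30-second steps, and 6 digits, which are common defaults; the service you log into may use different parameters, and the function name `totp` is illustrative.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, digits=6, period=30):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) from a Base32 secret."""
    secret = secret_b32.upper().replace(" ", "")
    secret += "=" * (-len(secret) % 8)  # restore padding if it was stripped
    key = base64.b32decode(secret)
    # Number of whole periods since the Unix epoch.
    counter = int((time.time() if at is None else at) // period)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # RFC 6238 test secret ("12345678901234567890" in Base32), time = 59 s.
    print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", at=59, digits=8))  # 94287082
```

Keep the Base32 secret in an environment variable or secret store, like any other credential.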
Where are video recordings stored?
Videos are returned as file objects when you close a session with record_video enabled. The file is provided in the API response and can be saved or processed by subsequent skill commands.
Can I run multiple browser sessions in parallel?
Yes, each session has an independent session_id. Start multiple sessions with --session new and track their IDs separately. Remember to close each session to free resources.
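
One way to keep parallel sessions tidy is to give each task its own session and close it in the same place it was opened. The sketch below uses a thread pool; `start_session`, `visit`, and `close_session` are hypothetical callables that you would back with your own session commands.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(urls, start_session, visit, close_session, max_workers=4):
    """Run one independent session per URL and return {url: result}.

    urls must be a list; the three callables are hypothetical stand-ins.
    """

    def worker(url):
        session_id = start_session()
        try:
            return visit(session_id, url)
        finally:
            close_session(session_id)  # free the session even if visit fails

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(worker, urls)))
```

Because each worker owns exactly one session ID, no ID is shared between threads and every opened session is closed exactly once.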
Does this work with headless browsers?
Yes, the underlying Playwright engine runs headless by default. This means no visible browser window appears during automation, making it suitable for server environments and CI/CD pipelines.