agent-browser
Browser Automation for AI Agents
Also available from: inference-sh-9, inference-shell, inferen-sh, inf-sh, inference-sh-8, inferencesh, inference-sh-0, supercent-io, tul-sh, vercel-labs
Enable AI agents to automate web browsing tasks including form filling, data extraction, screenshot capture, and video recording through a simple command-line interface.
1. Download the skill ZIP.
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill.
3. Toggle on and start using.
Test it
Using "agent-browser". Open https://example.com and show elements
Expected outcome:
Session started with ID: abc123
Interactive elements found:
- @e1 [a] "Example Domain" href="/"
- @e2 [h1] "Example Domain"
- @e3 [p] "This domain is for use in illustrative examples..."
- @e4 [a] "More information..." href="https://www.iana.org/domains/example"
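If an agent needs to act on a listing like the one above programmatically, the lines can be parsed into records. The following is a minimal sketch that assumes the snapshot output always has the shape shown in this example (`- @ref [tag] "text" attr="value"`); the names `Element` and `parse_snapshot` are hypothetical and not part of the skill.

```python
import re
from dataclasses import dataclass, field

# Assumed line shape, taken from the example output above:
#   - @e1 [a] "Example Domain" href="/"
LINE_RE = re.compile(r'^-\s+(@e\d+)\s+\[(\w+)\]\s+"(.*?)"(.*)$')
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


@dataclass
class Element:
    ref: str    # element reference such as "@e1"
    tag: str    # tag name such as "a" or "h1"
    text: str   # quoted text shown in the listing
    attrs: dict = field(default_factory=dict)  # trailing key="value" pairs


def parse_snapshot(output: str) -> list:
    """Return one Element per listing line; other lines are ignored."""
    elements = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. "Session started with ID: ..." header lines
        ref, tag, text, rest = match.groups()
        elements.append(Element(ref, tag, text, dict(ATTR_RE.findall(rest))))
    return elements
```

A caller could then filter the result, for example keeping only elements whose `tag` is `a`, before choosing a reference to click.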
Using "agent-browser". Take a screenshot of the current page
Expected outcome:
Screenshot saved to: /tmp/screenshot_20240115_143022.png
Page title: Example Domain
Viewport: 1280x720
Security Audit
Low Risk: This is a legitimate browser automation skill that uses inference.sh with Playwright. The static findings (external_commands, network, filesystem) are expected behavior for browser automation and represent documentation examples showing CLI usage, not actual security vulnerabilities. No malicious intent detected.
High Risk Issues (1)
Medium Risk Issues (1)
Low Risk Issues (2)
Risk Factors
⚙️ External commands (1)
🌐 Network access (1)
📁 Filesystem access (1)
What You Can Build
Automated Web Testing
AI agents can navigate to web applications, fill test forms, verify UI elements, and capture test results as screenshots or video.
Data Extraction and Research
Extract structured data from websites by navigating pages, identifying elements, and collecting information programmatically.
Form Automation Workflows
Automate repetitive form filling tasks like data entry, application submissions, and bulk operations across multiple pages.
Try These Prompts
Use the browser automation skill to open https://example.com and show me all the interactive elements on the page with their references.
Navigate to the login page at [URL], fill in the email field with user@example.com, fill the password field with mypassword, then click the submit button. Take a screenshot after submission.
Open the page at [URL], identify all table rows in the data table, and extract the text content from each row. Return the data as a structured list.
Start a new browser session with video recording enabled. Navigate through these steps: [list steps], then close the session and provide the video file path.
Best Practices
- Use element references (@e1, @e2) for reliable element targeting instead of CSS selectors
- Take snapshots after each navigation or significant page change to get fresh element references
- Enable video recording for debugging complex automation workflows
- Use proxy settings when testing geo-restricted content or when anonymity is needed
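The advice about fresh element references can be enforced in a wrapper script. This is a rough sketch under stated assumptions: `RefTracker` and `StaleRefError` are hypothetical names, and the tracker only does bookkeeping; it does not call the skill or a browser.

```python
class StaleRefError(RuntimeError):
    """Raised when a ref is used after navigation without a new snapshot."""


class RefTracker:
    """Bookkeeping helper (hypothetical): remembers which refs are valid."""

    def __init__(self):
        self._refs = set()
        self._fresh = False  # no snapshot taken yet

    def record_snapshot(self, refs):
        # Call after every snapshot with the refs it returned.
        self._refs = set(refs)
        self._fresh = True

    def mark_navigated(self):
        # Call after any navigation or significant page change.
        self._fresh = False

    def check(self, ref):
        # Return the ref if it is safe to use, otherwise raise.
        if not self._fresh:
            raise StaleRefError(f"{ref} may be stale; take a new snapshot first")
        if ref not in self._refs:
            raise KeyError(f"{ref} was not in the latest snapshot")
        return ref
```

A script would call `check("@e1")` before each interaction, so a forgotten snapshot fails loudly instead of targeting the wrong element.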
Avoid
- Do not rely on element positions or coordinates - use @e refs instead for stable targeting
- Avoid long fixed delays; wait explicitly for the element to become visible instead
- Do not skip re-snapshotting after page navigation - element refs become stale
- Avoid uploading sensitive files without verifying the target website accepts uploads
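The difference between an explicit wait and a fixed delay can be illustrated with a generic polling helper. This is a sketch only: `wait_until` is a hypothetical name, and the `predicate` stands in for whatever visibility check the automation exposes, which is assumed here rather than taken from the skill.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True as soon as the condition holds, False on timeout.
    Unlike a fixed delay, this returns early when the page is ready.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In use, the predicate would re-check the page on each call, for example by taking a snapshot and testing whether the expected element appears in it.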