agent-browser
Automate web browsing with AI agents
Also available from: toolshell,inference-sh-8,inferencesh,inferen-sh,inference-sh-0,inference-sh-9,supercent-io,inference-shell,tul-sh,inf-sh,vercel-labs
AI agents need to interact with websites but lack browser capabilities. This skill provides headless browser automation via inference.sh, enabling Claude, Codex, and Claude Code to navigate pages, fill forms, take screenshots, and record sessions.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "agent-browser". Open https://example.com and identify the login form elements
Expected outcome:
Page loaded successfully. Found 3 interactive elements:
@e1 [input type='text'] placeholder='Username'
@e2 [input type='password'] placeholder='Password'
@e3 [button] 'Sign In'
Using "agent-browser". Fill and submit the login form with test credentials
Expected outcome:
Form submitted. Page redirected to dashboard.
@e1 [h1] 'Welcome, Test User'
@e2 [nav] 'Dashboard | Settings | Logout'
Screenshot captured.
Using "agent-browser". Take a screenshot of the dashboard
Expected outcome:
Screenshot saved to dashboard-20240101.png
Page title: Dashboard | Size: 1280x720
Dashboard contains: navigation menu, user profile card, data tables, action buttons
Security Audit
SafeAll static findings are false positives. The skill uses the inference.sh CLI (infsh) to control a headless browser via documented command invocations. External command detections are hardcoded API calls to a legitimate service. Network detections are target URLs for browsing, not exfiltration. Filesystem detections are documentation navigation (../) and standard device paths. Password/crypto detections are documentation showing credential input handling, not cryptography.
Risk Factors
⚙️ External commands (4)
🌐 Network access (4)
📁 Filesystem access (2)
Quality Score
What You Can Build
Research and data extraction
AI agents browse websites to gather information, extract structured data from pages, and compile research reports without manual browsing.
Automated form submission
AI agents fill and submit web forms for tasks like booking appointments, registering accounts, or completing batch data entry.
Browser-based testing
QA engineers use AI agents to navigate websites, take screenshots, and record test sessions to verify UI functionality.
Try These Prompts
Use the agent-browser skill to open https://example.com and show me all the clickable elements on the page.
Open the contact form at https://example.com/contact. Fill in name with 'John Doe', email with 'john@example.com', and submit the form. Take a screenshot of the result.
Login to https://app.example.com using the credentials from environment variables. Navigate to the dashboard, extract all table data, and save a screenshot of the final page.
Record a video while browsing example.com/products. Click through 5 products, fill out an inquiry form for the last product, and close the session to save the recording.
Best Practices
- Always re-snapshot after navigation or DOM changes; element refs expire after page loads
- Use environment variables for credentials; never hardcode passwords in scripts
- Close sessions when finished; video recordings are only available until close is called
Avoid
- Do not cache element refs across different pages; always snapshot after navigation
- Do not hardcode credentials; use environment variables like $APP_USERNAME and $APP_PASSWORD
- Do not skip wait times after actions; allow pages to fully load before interacting