agent-browser
Automate web browsers for AI agents
Also available from: inference-sh-9, inference-shell, inf-sh, toolshell, inference-sh-8, inferencesh, skillssh, inference-sh-0, supercent-io, tul-sh, vercel-labs
Enable AI assistants to interact with websites, fill forms, extract data, and perform web automation tasks programmatically. Control headless browsers through simple CLI commands with session management and video recording.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle the skill on and start using it
Test it
Using "agent-browser". Open https://example.com and show available elements
Expected outcome:
- Session started: sess_abc123
- Page loaded: https://example.com
- Interactive elements found:
  - @e1 [h1] "Example Domain"
  - @e2 [p] "This domain is for use in documentation"
  - @e3 [a] "Learn more" href="https://iana.org/domains/example"
- Screenshot captured successfully
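The element lines in the sample output above follow a regular shape (a ref such as @e3, a tag in square brackets, quoted text, and an optional href), so they can be turned into structured records before an agent decides what to click. The following is a minimal sketch that assumes exactly that line shape; the Element class and parse_elements function are hypothetical names introduced for illustration and are not part of the skill or of the infsh CLI.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed line shape, taken from the sample output above:
#   - @e3 [a] "Learn more" href="https://iana.org/domains/example"
ELEMENT_LINE = re.compile(
    r'^\s*-?\s*(?P<ref>@e\d+)\s+\[(?P<tag>[^\]]+)\]\s+"(?P<text>[^"]*)"'
    r'(?:\s+href="(?P<href>[^"]*)")?\s*$'
)


@dataclass
class Element:
    """Hypothetical record for one interactive element in a snapshot."""
    ref: str
    tag: str
    text: str
    href: Optional[str] = None


def parse_elements(snapshot: str) -> list[Element]:
    """Return one Element per line that matches the assumed format."""
    elements = []
    for line in snapshot.splitlines():
        match = ELEMENT_LINE.match(line)
        if match:  # lines in any other shape are skipped silently
            elements.append(Element(**match.groupdict()))
    return elements


sample = """\
- @e1 [h1] "Example Domain"
- @e2 [p] "This domain is for use in documentation"
- @e3 [a] "Learn more" href="https://iana.org/domains/example"
"""
elements = parse_elements(sample)
links = [e for e in elements if e.href]
print(links[0].ref, links[0].href)
```

If the real snapshot output differs from the sample (extra attributes, different quoting), the regular expression would need to be adjusted accordingly.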
Using "agent-browser". Fill in the form and submit it
Expected outcome:
- Filled @e1 with: John Doe
- Filled @e2 with: john@example.com
- Clicked @e3 (Submit button)
- Navigation detected - re-snapshot recommended
- Form submission confirmed - thank you page displayed
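The "re-snapshot recommended" message above reflects the rule that element refs stop being valid once the page changes. One way a calling script might guard against using stale refs is sketched below. RefTracker and StaleRefError are hypothetical helpers written for this example; they do not call the browser and are not part of the skill.

```python
class StaleRefError(RuntimeError):
    """Raised when a ref is used after navigation without a fresh snapshot."""


class RefTracker:
    """Hypothetical bookkeeping helper: remembers which refs are current."""

    def __init__(self) -> None:
        self._refs: set[str] = set()
        self._stale = True  # nothing is valid until the first snapshot

    def snapshot_taken(self, refs) -> None:
        """Record the refs returned by the most recent snapshot."""
        self._refs = set(refs)
        self._stale = False

    def navigated(self) -> None:
        """Mark all known refs as invalid after a page change."""
        self._stale = True

    def check(self, ref: str) -> str:
        """Return the ref if it is safe to use, otherwise raise."""
        if self._stale:
            raise StaleRefError(f"{ref} may be stale; take a new snapshot first")
        if ref not in self._refs:
            raise KeyError(f"{ref} was not in the latest snapshot")
        return ref


tracker = RefTracker()
tracker.snapshot_taken(["@e1", "@e2", "@e3"])
tracker.check("@e3")   # fine: the ref comes from the current snapshot
tracker.navigated()    # for example, after clicking the Submit button
try:
    tracker.check("@e1")
    stale_detected = False
except StaleRefError:
    stale_detected = True
print(stale_detected)
```

Calling snapshot_taken again with the refs from a new snapshot makes check succeed for those refs.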
Security Audit
Low Risk
Static analyzer flagged 606 patterns in documentation files (markdown and shell templates). All findings are false positives: shell command patterns appear in documentation examples showing how to use the inference.sh CLI, not in executable skill code. Network URLs point to legitimate services (inference.sh, example domains). Path references are documentation placeholders. The skill legitimately uses bash commands to invoke the infsh CLI for browser automation via the allowed-tools mechanism.
Low Risk Issues (3)
Risk Factors
🌐 Network access (5)
📁 Filesystem access (3)
⚙️ External commands (5)
What You Can Build
Automated Form Submission
Fill and submit web forms for data entry, registration, or contact workflows. Supports validation feedback and error handling.
Web Data Extraction
Extract structured data from websites for research, price monitoring, or content aggregation. Captures screenshots alongside extracted content.
End-to-End Testing
Test web application workflows by simulating user interactions. Record sessions for debugging and documentation purposes.
Try These Prompts
Open the URL https://example.com and take a full-page screenshot. Show me what elements are available for interaction.
Navigate to the contact form at https://example.com/contact. Fill in the name field with 'John Doe', email with 'john@example.com', message with 'Hello', then submit the form. Confirm submission succeeded.
Open https://news.example.com and extract all article headlines from the homepage. Take a screenshot of the page and save the extracted text to a file. Keep the session open for follow-up queries.
Start a recorded browser session. Navigate to the login page, fill credentials from environment variables, submit and verify successful authentication. Navigate to the dashboard, extract the welcome message and user data, then close and return the video file.
Best Practices
- Always re-capture element references after navigation, since refs are invalidated when the page changes
- Use environment variables for credentials and sensitive data; never hardcode them in templates
- Enable video recording only for debugging to reduce resource overhead
- Close sessions explicitly or use cleanup traps to prevent resource leaks
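Two of the practices above, reading credentials from the environment and always closing the session, can be combined in a small wrapper. This is a sketch under stated assumptions: require_env and browser_session are hypothetical helpers, the open and close callables stand in for whatever actually starts and stops a session, and DEMO_USERNAME is a placeholder variable name.

```python
import os
from contextlib import contextmanager


def require_env(name: str) -> str:
    """Read a required value from the environment instead of the script."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


@contextmanager
def browser_session(open_session, close_session):
    """Open a session and guarantee it is closed, even if the body fails.

    open_session and close_session are caller-supplied callables; this
    sketch does not assume any particular CLI command behind them.
    """
    session_id = open_session()
    try:
        yield session_id
    finally:
        close_session(session_id)


# Stand-ins so the sketch runs without a real browser.
closed = []

def fake_open():
    return "sess_abc123"

def fake_close(session_id):
    closed.append(session_id)

# Placeholder so the demo runs; in real use the variable is set outside the script.
os.environ.setdefault("DEMO_USERNAME", "demo-user")

try:
    with browser_session(fake_open, fake_close) as session_id:
        username = require_env("DEMO_USERNAME")
        raise ValueError("simulated failure mid-workflow")
except ValueError:
    pass

print(closed)  # the session was closed despite the failure
```

The try/finally inside the context manager plays the same role as a cleanup trap in a shell script: the close step runs whether the workflow succeeds or raises.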
Avoid
- Hardcoding usernames, passwords, or API keys directly in script files
- Assuming element refs persist across page loads without taking a new snapshot
- Logging or echoing sensitive values like passwords or session tokens
- Leaving browser sessions open after workflow completion
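To illustrate the point about not logging sensitive values, the sketch below masks known secrets before a log line is written, using Python's standard logging module. RedactSecrets is a hypothetical filter written for this example, and the string "hunter2" is a placeholder secret used only so the demo has something to mask.

```python
import io
import logging


class RedactSecrets(logging.Filter):
    """Replace known secret values with *** in every log message."""

    def __init__(self, secrets):
        super().__init__()
        self._secrets = [s for s in secrets if s]  # ignore empty values

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # message with arguments applied
        for secret in self._secrets:
            message = message.replace(secret, "***")
        record.msg = message
        record.args = ()  # arguments are already merged into msg
        return True       # keep the record, now redacted


# Capture output in memory so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets(["hunter2"]))  # placeholder secret

logger = logging.getLogger("agent_browser_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("Filled @e2 with: %s", "hunter2")
output = stream.getvalue()
print(output, end="")
```

A filter like this only masks values it has been told about, so it complements, rather than replaces, the habit of not passing secrets to log calls in the first place.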