Web Scraper
A purpose-built fetch tool, separate from the generic http_request. It exists because the agent doesn't want raw HTML; it wants the article.
What it does
- Fetches a URL
- Strips boilerplate (nav, ads, footer, scripts)
- Returns clean text the agent can reason over
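The stripping step can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not the scraper's actual implementation: the names SKIP_TAGS, TextExtractor, and extract_text are hypothetical, and the tag list is an assumption. Ads have no dedicated HTML tag, so this sketch does not attempt to detect them.

```python
from html.parser import HTMLParser

# Assumed set of boilerplate tags; the real tool's rules are not documented here.
SKIP_TAGS = {"nav", "footer", "script", "style"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a boilerplate element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside boilerplate elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def extract_text(html: str) -> str:
    """Return the page's text with boilerplate elements removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = (
        "<nav>Home</nav>"
        "<article><h1>Title</h1><p>Body text.</p></article>"
        "<script>var x = 1;</script>"
        "<footer>Copyright</footer>"
    )
    print(extract_text(page))
```

A production extractor would likely use readability-style heuristics rather than a fixed tag list, but the shape is the same: HTML in, plain text out.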
Guardrails
- Caps the response at 1 MB - anything beyond that is truncated
- 20-second timeout - slow servers don't stall the conversation
- Subject to proxy and URL-guard rules
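The size cap and timeout above might look something like this in practice. This is a sketch under stated assumptions, not the tool's real code: read_capped and fetch_capped are hypothetical names, "1 MB" is taken to mean 1,000,000 bytes, and the timeout is passed to urllib, where it bounds each blocking socket operation rather than the total request time. Proxy and URL-guard checks are omitted.

```python
import io
import urllib.request

MAX_BYTES = 1_000_000   # assumed interpretation of "1 MB"
TIMEOUT_SECONDS = 20    # from the guardrail above
CHUNK_SIZE = 65536


def read_capped(stream, max_bytes=MAX_BYTES):
    """Read at most max_bytes from a binary stream.

    Returns (data, truncated), where truncated is True if the stream
    held more than max_bytes.
    """
    chunks = []
    remaining = max_bytes + 1  # one extra byte tells us whether more data existed
    while remaining > 0:
        chunk = stream.read(min(CHUNK_SIZE, remaining))
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b"".join(chunks)
    return data[:max_bytes], len(data) > max_bytes


def fetch_capped(url, max_bytes=MAX_BYTES, timeout=TIMEOUT_SECONDS):
    """Fetch a URL with a socket timeout and a cap on the body size."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return read_capped(resp, max_bytes)


if __name__ == "__main__":
    # Offline demonstration with an in-memory stream instead of a network call.
    body, truncated = read_capped(io.BytesIO(b"a" * 10), max_bytes=4)
    print(body, truncated)
```

Reading one byte past the cap is a cheap way to distinguish a page that is exactly at the limit from one that was cut short, so the caller can flag truncated content.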
What it's good for
- Reading articles, blog posts, docs pages, GitHub READMEs without the noise
- Following up on a Web Search result
- Summarising a single page on demand
See also
- Web Search - Find URLs to feed into the scraper
- Token Compression - What trims long pages before they hit the model