wget MCP Setup Guide
Instructions for setting up mcp-server-webcrawl with wget. This allows your LLM (e.g. Claude Desktop) to search content and metadata from websites you’ve crawled.
Follow along with the video, or the step-action guide.
Requirements
Before you begin, ensure you have:
Claude Desktop installed
Python 3.10 or later installed
Basic familiarity with command line interfaces
wget installed (macOS users can install via Homebrew, Windows users need WSL/Ubuntu)
Installation Steps
1. Install MCP Server Web Crawl
Open your terminal or command line and install the package:
pip install mcp-server-webcrawl
Verify installation was successful by checking the version:
mcp-server-webcrawl --version
2. Configure Claude Desktop
Open Claude Desktop
Go to File → Settings → Developer → Edit Config
Add the following configuration (modify paths as needed):
{
"mcpServers": {
"webcrawl": {
"command": "/path/to/mcp-server-webcrawl",
"args": ["--crawler", "wget", "--datasrc",
"/path/to/wget/archives/"]
}
}
}
Note
On Windows, use
"mcp-server-webcrawl"
as the commandOn macOS, use the absolute path (output of
which mcp-server-webcrawl
)Change
/path/to/wget/archives/
to your actual directory path
Save the file and completely exit Claude Desktop (not just close the window)
Restart Claude Desktop
3. Crawl Websites with wget
Open Terminal (macOS) or Ubuntu/WSL (Windows)
Navigate to your target directory for storing crawls
Run wget with the mirror option:
wget --mirror https://example.com
4. Verify and Use
In Claude Desktop, you should now see an MCP tool option under Search and Tools
Ask Claude to list your crawled sites:
Can you list the crawled sites available?
Try searching content from your crawls:
Can you find information about [topic] on [crawled site]?
Troubleshooting
If Claude doesn’t show MCP tools after restart, verify your configuration file is correctly formatted
Ensure Python and mcp-server-webcrawl are properly installed, and on PATH or using absolute paths
Check that your crawl directory path in the configuration is correct
Remember that the first time you use a function, Claude will ask for permission
Indexing for file-based archives (wget included) requires build time on first search, time is dependent on archive size
For more details, including API documentation and other crawler options, visit the mcp-server-webcrawl documentation.