wget MCP Setup Guide

Instructions for setting up mcp-server-webcrawl with wget. This allows an LLM client such as Claude Desktop to search the content and metadata of websites you’ve crawled.

Follow along with the video, or use the step-action guide below.

Requirements

Before you begin, ensure you have:

  • Claude Desktop installed

  • Python 3.10 or later installed

  • Basic familiarity with command line interfaces

  • wget installed (macOS users can install via Homebrew; Windows users need WSL/Ubuntu; install commands are sketched after this list)
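
If you still need to install wget, the usual package-manager commands are shown below (this assumes Homebrew on macOS and apt inside Ubuntu/WSL on Windows):

brew install wget          # macOS (Homebrew)
sudo apt install wget      # Windows (inside Ubuntu/WSL)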

Installation Steps

1. Install MCP Server Web Crawl

Open your terminal or command line and install the package:

pip install mcp-server-webcrawl

Verify installation was successful by checking the version:

mcp-server-webcrawl --version
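
If the version prints, the installation succeeded. On macOS, you can also capture the absolute path to the executable now, since the Claude Desktop configuration in the next step needs it:

which mcp-server-webcrawl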

2. Configure Claude Desktop

  1. Open Claude Desktop

  2. Go to File → Settings → Developer → Edit Config

  3. Add the following configuration (modify paths as needed):

{
  "mcpServers": {
    "webcrawl": {
      "command": "/path/to/mcp-server-webcrawl",
      "args": ["--crawler", "wget", "--datasrc",
        "/path/to/wget/archives/"]
    }
  }
}

Note

  • On Windows, use "mcp-server-webcrawl" as the command (an example Windows configuration follows this list)

  • On macOS, use the absolute path (output of which mcp-server-webcrawl)

  • Change /path/to/wget/archives/ to your actual directory path
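
For reference, a Windows configuration typically looks like the sketch below; the --datasrc path is illustrative and should point at wherever your wget crawls are stored (note that backslashes must be escaped in JSON):

{
  "mcpServers": {
    "webcrawl": {
      "command": "mcp-server-webcrawl",
      "args": ["--crawler", "wget", "--datasrc",
        "C:\\path\\to\\wget\\archives\\"]
    }
  }
}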

  4. Save the file and completely exit Claude Desktop (not just close the window)

  5. Restart Claude Desktop

3. Crawl Websites with wget

  1. Open Terminal (macOS) or Ubuntu/WSL (Windows)

  2. Navigate to the directory you configured as --datasrc, where your crawls will be stored

  3. Run wget with the mirror option:

wget --mirror https://example.com
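
The --mirror flag alone produces an archive that mcp-server-webcrawl can index. If you want a politer or more self-contained crawl, standard wget options can be added; the flags below are common choices, not requirements:

wget --mirror --no-parent --adjust-extension --wait=1 https://example.com

wget writes the crawl into a directory named after the host (here, example.com/), so run it from inside the directory you configured as --datasrc.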

4. Verify and Use

  1. In Claude Desktop, you should now see an MCP tool option under Search and Tools

  2. Ask Claude to list your crawled sites:

Can you list the crawled sites available?

  3. Try searching content from your crawls:

Can you find information about [topic] on [crawled site]?

Troubleshooting

  • If Claude doesn’t show MCP tools after restart, verify that your configuration file is correctly formatted (a quick check is sketched after this list)

  • Ensure Python and mcp-server-webcrawl are properly installed and available on your PATH, or use absolute paths in the configuration

  • Check that your crawl directory path in the configuration is correct

  • Remember that the first time you use a function, Claude will ask for permission

  • Indexing for file-based archives (including wget) is built on the first search; the time required depends on the archive size
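
If the tools still don’t appear, a quick way to rule out a formatting problem is to run the configuration file through Python’s built-in JSON parser; substitute the actual path to your Claude Desktop config file (typically claude_desktop_config.json):

python -m json.tool /path/to/claude_desktop_config.json

Any syntax error, such as a missing comma or an unescaped backslash, is reported with its line number.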

For more details, including API documentation and other crawler options, visit the mcp-server-webcrawl documentation.