wget MCP Setup Guide
====================

Instructions for setting up mcp-server-webcrawl with wget.
This allows your LLM (e.g. Claude Desktop) to search content and metadata from websites you've crawled.

Follow along with the video, or the step-action guide.

Requirements
------------

Before you begin, ensure you have:

- Claude Desktop installed
- Python 3.10 or later installed
- Basic familiarity with command line interfaces
- wget installed (macOS users can install via Homebrew, Windows users need WSL/Ubuntu)

Installation Steps
------------------

1. Install MCP Server Web Crawl
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Open your terminal or command line and install the package:

.. code-block:: bash

   pip install mcp-server-webcrawl

Verify that the installation succeeded by checking the version:

.. code-block:: bash

   mcp-server-webcrawl --version

2. Configure Claude Desktop
~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Open Claude Desktop
2. Go to **File → Settings → Developer → Edit Config**
3. Add the following configuration (modify paths as needed):

   .. code-block:: json

      {
        "mcpServers": {
          "webcrawl": {
            "command": "/path/to/mcp-server-webcrawl",
            "args": ["--crawler", "wget", "--datasrc",
              "/path/to/wget/archives/"]
          }
        }
      }

   .. note::

      - On Windows, use ``"mcp-server-webcrawl"`` as the command
      - On macOS, use the absolute path (output of ``which mcp-server-webcrawl``)
      - Change ``/path/to/wget/archives/`` to your actual directory path

4. Save the file and **completely exit** Claude Desktop (not just close the window)
5. Restart Claude Desktop

3. Crawl Websites with wget
~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Open Terminal (macOS) or Ubuntu/WSL (Windows)
2. Navigate to the directory where you want to store crawls
3. Run wget with the mirror option:

   .. code-block:: bash

      wget --mirror https://example.com

4. Verify and Use
~~~~~~~~~~~~~~~~~

1. In Claude Desktop, you should now see an MCP tool option under Search and Tools
2. Ask Claude to list your crawled sites:

   .. code-block:: text

      Can you list the crawled sites available?

3. Try searching content from your crawls:

   .. code-block:: text

      Can you find information about [topic] on [crawled site]?

Troubleshooting
---------------

- If Claude doesn't show MCP tools after restart, verify that your configuration file is correctly formatted
- Ensure Python and mcp-server-webcrawl are properly installed, and either on PATH or referenced by absolute path
- Check that your crawl directory path in the configuration is correct
- Remember that the first time you use a function, Claude will ask for permission
- File-based archives (wget included) are indexed on the first search; how long this takes depends on archive size

For more details, including API documentation and other crawler options, visit the mcp-server-webcrawl documentation.
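
The first three troubleshooting checks (well-formed configuration, a resolvable command, and a correct crawl directory) can be automated with a short script. The following is a minimal sketch, not part of mcp-server-webcrawl: the function name ``check_webcrawl_config`` is hypothetical, and it assumes the server entry is named ``webcrawl`` and passes ``--datasrc`` in ``args``, as in the configuration shown in this guide.

```python
import json
import os
import shutil


def check_webcrawl_config(config_text, server_name="webcrawl"):
    """Return a list of problems found in a Claude Desktop config string.

    Hypothetical helper for illustration; an empty list means the three
    checks below all passed.
    """
    # Check 1: the file must be well-formed JSON with the expected layout.
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})"]

    if not isinstance(config, dict):
        return ["top-level JSON value is not an object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict):
        return ["missing 'mcpServers' object"]
    server = servers.get(server_name)
    if not isinstance(server, dict):
        return [f"no '{server_name}' entry under 'mcpServers'"]

    problems = []

    # Check 2: the command must resolve, either via PATH (bare name)
    # or as an absolute path to an executable file.
    command = server.get("command")
    if not isinstance(command, str) or shutil.which(command) is None:
        problems.append(f"command not found or not executable: {command!r}")

    # Check 3: the value following --datasrc must be an existing directory.
    args = server.get("args")
    if not isinstance(args, list):
        args = []
    if "--datasrc" in args and args.index("--datasrc") + 1 < len(args):
        datasrc = args[args.index("--datasrc") + 1]
        if not (isinstance(datasrc, str) and os.path.isdir(datasrc)):
            problems.append(f"--datasrc is not an existing directory: {datasrc!r}")
    else:
        problems.append("no value given for --datasrc in 'args'")

    return problems
```

To use it, read the file that **Edit Config** opens and pass its contents to the function, then print each returned problem. Run the script in the same environment that launches the server (for example, not inside WSL if Claude Desktop runs on Windows), since PATH lookups and directory paths differ between environments.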