SiteOne MCP Setup Guide
Instructions for setting up mcp-server-webcrawl with the SiteOne crawler. Once configured, your LLM client (e.g. Claude Desktop) can search the content and metadata of websites you’ve crawled with SiteOne.
Follow along with the video, or use the step-by-step guide below.
Requirements
Before you begin, ensure you have:
Claude Desktop installed
Python 3.10 or later installed
SiteOne Crawler installed
Basic familiarity with command line interfaces
What is SiteOne?
SiteOne is a graphical web crawler that offers:
User-friendly desktop interface for setting up and managing crawls
Offline website generation capabilities
Comprehensive crawl reporting
Intuitive controls for non-technical users
Installation Steps
1. Install MCP Server Web Crawl
Open your terminal or command line and install the package:
pip install mcp-server-webcrawl
Verify installation was successful:
mcp-server-webcrawl --version
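If the version command is not recognized, the executable may not be on your PATH. The following is a minimal sketch, using only the Python standard library, that reports where the command resolves. The helper name find_command is hypothetical and not part of mcp-server-webcrawl.

```python
import shutil
from typing import Optional


def find_command(name: str = "mcp-server-webcrawl") -> Optional[str]:
    """Return the path that `name` resolves to on PATH, or None if not found."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_command()
    if path is None:
        print("mcp-server-webcrawl was not found on PATH; recheck the pip install step.")
    else:
        # This path is also useful later as the "command" value on macOS.
        print(f"mcp-server-webcrawl found at: {path}")
```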
2. Create Crawls with SiteOne
Open SiteOne Crawler application
Enter a URL to crawl (e.g., example.com)
Important: Check the “Generate offline website” option (this is required for MCP integration)
Click the start button to begin crawling
Repeat for additional sites as needed (e.g., pragmar.com)
Note the directory where SiteOne is storing the generated offline content (this is shown in the application)
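Before moving on, it can help to confirm that the directory you noted actually contains crawl output. The sketch below assumes that SiteOne writes each offline website into its own subdirectory of the archives directory; that layout is an assumption here, so adjust it if your installation differs. The function name list_crawl_dirs is hypothetical, and the path is the placeholder used in this guide.

```python
from pathlib import Path
from typing import List


def list_crawl_dirs(datasrc: str) -> List[str]:
    """Return the sorted names of subdirectories under `datasrc`.

    Returns an empty list when `datasrc` does not exist or is not a directory.
    Assumption: one subdirectory per generated offline website.
    """
    root = Path(datasrc).expanduser()
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    archives = "/path/to/siteone/archives/"  # placeholder; replace with your path
    names = list_crawl_dirs(archives)
    if names:
        print("Found crawl directories:", ", ".join(names))
    else:
        print("No crawl directories found; check the path and the offline option.")
```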
3. Configure Claude Desktop
Open Claude Desktop
Go to File → Settings → Developer → Edit Config
Add the following configuration (modify paths as needed):
{
  "mcpServers": {
    "webcrawl": {
      "command": "/path/to/mcp-server-webcrawl",
      "args": ["--crawler", "siteone", "--datasrc",
        "/path/to/siteone/archives/"]
    }
  }
}
Note
On Windows, use "mcp-server-webcrawl" as the command.
On macOS, use the absolute path (the output of which mcp-server-webcrawl).
Change /path/to/siteone/archives/ to the actual path where SiteOne stores offline website content.
Save the file and completely exit Claude Desktop (not just close the window)
Restart Claude Desktop
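Hand-editing JSON is error-prone, especially with Windows paths, where every backslash must be escaped. As an alternative, here is a small sketch that builds the same structure shown in this step and prints it as valid JSON to paste into the config. The function name build_config is hypothetical, and the archives path is the placeholder from this guide.

```python
import json
import shutil


def build_config(command: str, datasrc: str) -> dict:
    """Build the mcpServers entry shown in this guide for the SiteOne crawler."""
    return {
        "mcpServers": {
            "webcrawl": {
                "command": command,
                "args": ["--crawler", "siteone", "--datasrc", datasrc],
            }
        }
    }


if __name__ == "__main__":
    # Use the resolved path when available (macOS); otherwise fall back
    # to the bare command name (the Windows form from the note above).
    command = shutil.which("mcp-server-webcrawl") or "mcp-server-webcrawl"
    datasrc = "/path/to/siteone/archives/"  # placeholder; replace with your path
    # json.dumps escapes backslashes and quotes for you.
    print(json.dumps(build_config(command, datasrc), indent=2))
```

If your config already contains other entries under mcpServers, merge the webcrawl entry into the existing object rather than replacing the whole file.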
4. Verify and Use
In Claude Desktop, you should now see MCP tools available under Search and Tools
Ask Claude to list your crawled sites:
Can you list the crawled sites available?
Try searching content from your crawls:
Can you find information about [topic] on [crawled site]?
Explore specific topics on your crawled sites:
I'm interested in AppStat on pragmar.com. Can you tell me about it?
Troubleshooting
If Claude Desktop doesn’t show MCP tools after a restart, verify that your configuration file is valid JSON and matches the structure shown in step 3
Ensure Python and mcp-server-webcrawl are properly installed
Check that your SiteOne archives path in the configuration is correct
Make sure the “Generate offline website” option was checked when creating crawls
Verify that each crawl completed successfully and files were saved to the expected location
Remember that the first time Claude uses a tool, it will ask for your permission
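The first few checks above can be partly automated. The sketch below parses the text of your config and reports common problems; it only checks the structure used in this guide (a server named "webcrawl" with --crawler and --datasrc arguments) and is not an official validator. The function name check_config is hypothetical.

```python
import json
from typing import List


def check_config(text: str) -> List[str]:
    """Return a list of problems found in the config text (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"Invalid JSON: {exc}"]
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    entry = servers.get("webcrawl") if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return ['Missing "mcpServers" -> "webcrawl" entry']
    problems = []
    if not entry.get("command"):
        problems.append('Missing "command"')
    args = entry.get("args")
    if not isinstance(args, list):
        problems.append('"args" must be a list')
    else:
        for flag in ("--crawler", "--datasrc"):
            if flag not in args:
                problems.append(f"Missing {flag} in args")
    return problems


if __name__ == "__main__":
    # A trailing comma is a common hand-editing mistake and is invalid JSON.
    sample = '{"mcpServers": {"webcrawl": {"command": "mcp-server-webcrawl",}}}'
    for problem in check_config(sample):
        print(problem)
```

To check your own setup, paste the contents of the file opened through Edit Config in place of the sample string.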
For more details, including API documentation and other crawler options, visit the mcp-server-webcrawl documentation.