mcp_server_webcrawl.crawlers.interrobot package
Submodules
mcp_server_webcrawl.crawlers.interrobot.adapter module
- iso_to_datetime(dt_string)[source]
Convert ISO string to datetime.
Python <= 3.10 struggles with the Zulu ("Z") suffix and fractional seconds and will raise; smooth out the ISO string before parsing. Second precision is sufficient here.
- get_sites(datasrc, ids=None, fields=None)[source]
Get sites based on the provided parameters.
- Parameters:
datasrc (Path) – Path to the database
ids – Optional list of site IDs
fields – List of fields to include in response
- Returns:
List of SiteResult objects
- Return type:
list[SiteResult]
- get_resources(datasrc, ids=None, sites=None, query='', types=None, fields=None, statuses=None, sort=None, limit=20, offset=0)[source]
Get resources based on the provided parameters.
- Args (all query/WHERE args ANDed):
datasrc: Path to the database
ids: Optional list of resource IDs
sites: Optional list of project IDs to filter by site
query: Search query string for FTS5 search
types: Optional filter for specific resource types
fields: List of fields to include in response
statuses: List of HTTP statuses to include in response
sort: Sort order for results
limit: Maximum number of results to return
offset: Number of results to skip for pagination
- Returns:
List of ResourceResult objects
Total count of matching resources
- Return type:
tuple[list[ResourceResult], int]
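Since all query/WHERE arguments are ANDed together, the filter composition might look like the following sketch against an in-memory SQLite table (the table and column names are illustrative, not the actual InterroBot schema):

```python
import sqlite3

def build_where(sites=None, types=None, statuses=None):
    """Compose an ANDed WHERE clause from the optional filters.
    Column names here are illustrative, not the real schema."""
    clauses, params = [], []
    if sites:
        clauses.append(f"project_id IN ({','.join('?' * len(sites))})")
        params.extend(sites)
    if types:
        clauses.append(f"type IN ({','.join('?' * len(types))})")
        params.extend(types)
    if statuses:
        clauses.append(f"status IN ({','.join('?' * len(statuses))})")
        params.extend(statuses)
    where = " AND ".join(clauses) if clauses else "1=1"
    return where, params

# in-memory demo data
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE resources (id INTEGER, project_id INTEGER, type TEXT, status INTEGER)")
con.executemany("INSERT INTO resources VALUES (?,?,?,?)",
                [(1, 1, "html", 200), (2, 1, "img", 404), (3, 2, "html", 200)])
where, params = build_where(sites=[1], statuses=[200])
rows = con.execute(f"SELECT id FROM resources WHERE {where} LIMIT 20 OFFSET 0", params).fetchall()
```

Parameterized placeholders keep the ANDed clauses safe from injection while still allowing any subset of filters to be omitted.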
mcp_server_webcrawl.crawlers.interrobot.crawler module
- class InterroBotCrawler[source]
Bases:
BaseCrawler
A crawler implementation for InterroBot data sources. Provides functionality for accessing and searching web content from InterroBot.
Initialize the InterroBotCrawler with a data source path.
- Parameters:
datasrc – Path to the data source
- __init__(datasrc)[source]
Initialize the InterroBotCrawler with a data source path.
- Parameters:
datasrc – Path to the data source
- async mcp_list_tools()[source]
List available tools for this crawler.
- Returns:
List of Tool objects
- Return type:
list[Tool]
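Because `mcp_list_tools` is a coroutine, callers await it or drive it with `asyncio.run`; a sketch using a stand-in class (the stub and its tool names are placeholders, not the crawler's real tool list):

```python
import asyncio

class StubCrawler:
    """Stand-in for InterroBotCrawler; the real class requires a datasrc path."""
    async def mcp_list_tools(self):
        # placeholder tool names, for illustration only
        return ["sites_tool", "resources_tool"]

tools = asyncio.run(StubCrawler().mcp_list_tools())
```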
- get_sites_api(ids=None, fields=None)[source]
Retrieve site information from the InterroBot data source.
- Parameters:
ids (list[int] | None) – Optional list of site IDs
fields (list[str] | None) – List of fields to include in response
- Returns:
API response object containing site information
- Return type:
- get_resources_api(ids=None, sites=None, query='', types=None, fields=None, statuses=None, sort=None, limit=20, offset=0)[source]
Get resources in JSON format based on the provided parameters.
- Parameters:
ids (list[int] | None) – Optional list of resource IDs to retrieve specific resources directly
sites (list[int] | None) – Optional list of project IDs to filter search results to a specific site
query (str) – Search query string
types (list[str] | None) – Optional filter for specific resource types
fields (list[str] | None) – List of additional fields to include in the response
statuses (list[int] | None) – Optional list of HTTP status codes to filter results
sort (str | None) – Sort order for results
limit (int) – Maximum number of results to return
offset (int) – Number of results to skip for pagination
- Returns:
JSON string containing the results
- Return type:
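With `limit`/`offset` pagination, a caller typically loops until a short page comes back; a sketch using a stand-in for the API call (the `fetch_page` helper and its data are hypothetical):

```python
def fetch_page(offset, limit):
    """Stand-in for get_resources_api(); returns a slice of fake IDs."""
    data = list(range(45))
    return data[offset:offset + limit]

def fetch_all(limit=20):
    """Accumulate pages until one returns fewer than `limit` rows."""
    results, offset = [], 0
    while True:
        page = fetch_page(offset, limit)
        results.extend(page)
        if len(page) < limit:
            break
        offset += limit
    return results
```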
mcp_server_webcrawl.crawlers.interrobot.tests module
- class InterroBotTests[source]
Bases:
BaseCrawlerTests
Test suite for the InterroBot crawler implementation.
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.