
onion-keyword-finder

The onion-keyword-finder crawls onion websites and matches their content against specified keywords, for use in automated security workflows. It helps security teams discover and analyze hidden services on the Tor network during security assessments and intelligence gathering.

Ideal Use Cases & Fit

This tool is suited to targeted reconnaissance on onion sites where specific keywords must be located, such as investigations of illegal marketplaces or other content hosted on hidden services. It performs best with a curated list of onion URLs as input, allowing security teams to uncover potential threats or vulnerabilities tied to particular keywords. Because it searches only for the configured keywords, starting from the supplied URLs, it is a poor fit for broad, untargeted web crawling.

Value in Workflows

Integrating the onion-keyword-finder into a security workflow supports the data discovery phase by automating early reconnaissance on onion sites. Used as a post-processing step, it can surface risky entities in crawled data and aid threat analysis. Automating this information gathering makes the process more efficient and reduces the chance that critical data is overlooked.

Input Data

The tool requires a file containing newline-separated onion URLs to be scraped. These URLs direct the tool to the exact targets for keyword matching.

Example:

https://example.com
https://example2.onion
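
If the URL list is assembled from mixed sources, it may help to clean it before handing it to the tool. The following is a minimal Python sketch of such a pre-processing step, not part of the tool itself: the function name load_onion_urls is hypothetical, and whether the tool performs its own validation of non-onion hosts is not documented here. It drops blank lines and duplicates and keeps only URLs whose host ends in .onion.

```python
from urllib.parse import urlsplit


def load_onion_urls(text):
    """Return unique .onion URLs from newline-separated text, in input order.

    Hypothetical helper for preparing the input file; it is not the tool's
    own parser. Lines without a scheme (e.g. "example2.onion") have no
    parseable host and are skipped.
    """
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue  # skip blank lines and duplicates
        host = urlsplit(url).hostname or ""
        if host.endswith(".onion"):
            seen.add(url)
            urls.append(url)
    return urls


sample = "https://example.com\nhttps://example2.onion\n\nhttps://example2.onion\n"
print("\n".join(load_onion_urls(sample)))
```

The printed lines can be written to a file and used as the input described above.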

Configuration

  • keywords: A space-separated list of keywords to match against the content of the crawled pages.
  • max-depth: An integer defining the maximum depth to which the crawler should go when exploring links from the input URLs.
  • timeout: An integer representing the maximum time (in seconds) the tool should wait for a response from a target URL before timing out.
  • save-page-content: A boolean that determines whether to save the full content of pages that match the specified keywords in the output.
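
To illustrate how these options interact, here is a minimal Python sketch of a depth-limited keyword crawl over an in-memory set of pages. It is an illustration under stated assumptions, not the tool's implementation: the Config class, the crawl function, and the shape of the result records are hypothetical; input URLs are assumed to sit at depth 0; matching is assumed to be a case-insensitive substring test; and nothing is fetched over the network, so timeout is carried but unused.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Config:
    keywords: str                    # space-separated list, as in the option above
    max_depth: int = 1               # assumed: input URLs are depth 0
    timeout: int = 30                # seconds; unused here, nothing is fetched
    save_page_content: bool = False


def crawl(pages, links, start_urls, cfg):
    """Breadth-first walk over in-memory pages, recording keyword matches.

    pages maps URL -> page text and links maps URL -> list of linked URLs;
    both stand in for real HTTP responses fetched through Tor.
    """
    wanted = [k.lower() for k in cfg.keywords.split()]
    seen = set()
    queue = deque()
    for url in start_urls:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))

    results = []
    while queue:
        url, depth = queue.popleft()
        text = pages.get(url, "")
        hits = [k for k in wanted if k in text.lower()]
        if hits:
            record = {"url": url, "depth": depth, "keywords": hits}
            if cfg.save_page_content:
                record["content"] = text  # keep full text only when requested
            results.append(record)
        if depth < cfg.max_depth:  # do not follow links past max_depth
            for nxt in links.get(url, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return results


pages = {
    "http://a.onion": "Selling leaked credentials",
    "http://b.onion": "nothing here",
    "http://c.onion": "Credentials dump",
}
links = {"http://a.onion": ["http://b.onion"], "http://b.onion": ["http://c.onion"]}

shallow = crawl(pages, links, ["http://a.onion"], Config("credentials market", max_depth=1))
deep = crawl(pages, links, ["http://a.onion"], Config("credentials market", max_depth=2))
print([r["url"] for r in shallow])
print([r["url"] for r in deep])
```

In this sketch, raising max_depth from 1 to 2 is what brings the third page into scope, and save-page-content only changes whether the matched page's text is kept alongside the match.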