gau
The gau (getallurls) tool is integral to automated security workflows in Canva, fetching known URLs for a target from sources such as AlienVault's OTX, the Wayback Machine, Common Crawl, and URLScan. By aggregating historical and current URL data, it strengthens the discovery phase of reconnaissance and vulnerability assessment.
Ideal Use Cases & Fit
gau excels in scenarios requiring extensive URL discovery during the reconnaissance phase of security assessments. Typical input involves a list of domains from which to extract URLs. This tool is ideal for users needing to uncover historical and current URLs for vulnerability testing or web content analysis. It may not be suitable for environments with strict data handling policies that prevent the querying of external services.
Value in Workflows
gau adds significant value at the initial stages of security assessments by enriching the dataset of target URLs. It facilitates early reconnaissance and feeds into further analysis or automated testing phases. By integrating gau within workflows, security teams can streamline data collection, save time on manual lookups, and ensure comprehensive coverage of potential attack surfaces.
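A minimal pipeline sketch of this integration, assuming gau is installed and chained into a downstream HTTP prober (httpx is an assumed companion tool here, not part of gau):

```shell
# Feed newline-separated domains into gau, deduplicate the discovered URLs,
# then probe which are still live. httpx is a hypothetical downstream step.
cat domains.txt | gau --subs | sort -u | httpx -silent > live-urls.txt
```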
Input Data
The gau tool expects input data in the following format:
- Type: file
- Format: newline-separated domains
- Function: target
- Required: true
Example:
example.com
vulnweb.com
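Assuming the upstream gau CLI, targets can be passed as arguments or piped in via stdin; a domains file in the format above would be consumed like this:

```shell
# Read targets from a newline-separated domains file via stdin.
cat domains.txt | gau

# Equivalently, pass a single target as an argument.
gau example.com
```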
Configuration
- blacklist: Specifies extensions to skip during URL fetching.
- filter-code: Defines status codes to filter out from results.
- from: Establishes a date from which URLs should be fetched (format: YYYYMM).
- filter-type: Identifies mime-types to exclude in the output.
- filter-params: Controls whether to remove different parameters of the same endpoint (default is false).
- json: Outputs results as JSON lines instead of plain URLs (default is false).
- match-code: Lists status codes that must be matched in the results.
- match-type: Specifies mime-types that must be matched.
- providers: Identifies which data sources to use for URL fetching (wayback, commoncrawl, otx, urlscan).
- proxy: Sets the HTTP proxy to route requests through.
- retries: Configures the number of retries for HTTP client requests (default is 10).
- timeout: Specifies the timeout duration for HTTP requests in seconds (default is 60).
- subdomains: Indicates whether to include subdomains of the target domain (default is false).
- to: Establishes a date up to which URLs should be fetched (format: YYYYMM).

Updated: 2026-02-10
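A representative invocation combining several of the options above, assuming the upstream gau CLI; the flag spellings below are the upstream short forms and may differ from the configuration keys listed here:

```shell
# --subs maps to subdomains, --blacklist to blacklist,
# --from/--to to the YYYYMM date window, --json to JSON output.
gau --subs \
    --providers wayback,otx \
    --blacklist ttf,woff,png \
    --from 202301 --to 202312 \
    --retries 10 --timeout 60 \
    --json \
    example.com > urls.json
```

Restricting providers and blacklisting static-asset extensions keeps output focused on endpoints worth testing rather than fonts and images.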