
urlscan.io Integration Guide

Overview

The urlscan.io integration connects your NINA workflows to urlscan.io for URL scanning, security analysis, and threat intelligence. From a workflow you can submit URLs for scanning, retrieve detailed analysis results, download screenshots and DOM snapshots, and search existing scan data.

Status

We currently support comprehensive URL scanning and analysis operations:

✅ Currently Supported:

  • Scan Submission: Submit URLs for scanning with configurable options (visibility, country, tags, custom headers)
  • Result Retrieval: Get detailed scan results including security analysis and metadata
  • Screenshot Download: Retrieve PNG screenshots of scanned pages
  • DOM Snapshot: Download HTML DOM snapshots of scanned pages
  • Polling: Poll for scan completion with configurable timeout and intervals
  • Search: Query existing scan results using Elasticsearch syntax with pagination
  • Direct Result Access: Fast retrieval of scan result data by UUID

✅ Advanced Features:

  • Custom User-Agent: Set custom User-Agent strings for scans
  • HTTP Referer: Configure custom Referer headers
  • Geographic Scanning: Choose scanning location by country code
  • Privacy Controls: Configure scan visibility (public/private)
  • Safety Override: Override safety checks for URLs with potential PII
  • Tag Management: Associate custom tags with scans for organization
  • Validation: Built-in URL and UUID format validation

urlscan.io provides comprehensive URL analysis including:

  • Security Analysis: Malware detection, phishing identification, suspicious behavior analysis
  • Network Analysis: HTTP requests, redirects, DNS resolution, SSL certificate details
  • Content Analysis: Page structure, embedded resources, JavaScript behavior
  • Performance Metrics: Load times, resource sizes, network timing
  • Visual Analysis: Screenshots, page rendering analysis

Credential Configuration

Before using the urlscan.io integration in your workflows, you need to configure credentials for authentication. The NINA platform supports API key authentication for urlscan.io:

Authentication Method

API Key

For access to your urlscan.io account:

| Field | Description | Example |
| --- | --- | --- |
| API Key | Your urlscan.io API key from your account settings | abc123def456ghi789... |
| Base URL | Base URL for urlscan.io API (optional) | https://urlscan.io |

How to get your API Key:

  1. Sign up or log in to your urlscan.io account at https://urlscan.io
  2. Navigate to your Account Settings or API section
  3. Generate or copy your existing API key
  Note: Some operations (such as public submissions) may work without an API key, but an API key is required for private scans and enhanced features.

Creating a urlscan.io Credential

  1. Navigate to the Credentials section in NINA

  2. Click Add New Credential

  3. Fill in the credential details:

    • Name: A descriptive name (e.g., "urlscan.io Production")
    • Description: Optional details about the credential's purpose
    • Integration Service: Select "urlscan.io"
    • Auth Type: Choose "API Key"
    • API Key: Enter your urlscan.io API key
    • Base URL: Leave default (https://urlscan.io) unless using a custom instance
  4. Click Test Connection to verify credentials

  5. Click Save to store the credential

Supported Resources and Operations

The urlscan.io integration supports the following resources and operations:

Scan

| Operation | Description |
| --- | --- |
| Submit | Submit a URL for scanning with configurable options |
| Get Result | Retrieve detailed scan results by UUID |
| Get Screenshot | Download PNG screenshot of the scanned page |
| Get DOM | Download DOM snapshot of the scanned page |
| Poll | Poll for scan completion with configurable timeout |

Search

| Operation | Description |
| --- | --- |
| Query | Search existing scans using Elasticsearch syntax |

Result

| Operation | Description |
| --- | --- |
| Get | Direct access to scan result data by UUID |

Parameter Merging and Templating

The urlscan.io integration takes full advantage of NINA's parameter merging and templating capabilities:

Parameter Sources (in order of precedence)

  1. Node Parameters: Parameters configured directly in the urlscan.io Integration Node
  2. Extracted Parameters: Parameters automatically extracted from the input data
  3. Input Data: The complete input data from upstream nodes

When a urlscan.io Integration Node executes:

  • It combines parameters from all sources
  • Node parameters take precedence over extracted parameters
  • Template variables within parameters are processed (using {{variable_name}} syntax)
  • The combined parameters are used to execute the urlscan.io operation
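
The merge-and-render steps above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not NINA's actual implementation: the names `lookup`, `render`, and `merge_parameters` are hypothetical, and the handling of a missing variable (here, a `KeyError`) is an assumption.

```python
import re
from typing import Any

# Matches {{variable_name}} and dotted paths such as {{scan_config.country}}.
_TEMPLATE = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(data: dict, path: str) -> Any:
    """Resolve a dotted path such as 'scan_config.country' in nested dicts."""
    value: Any = data
    for key in path.split("."):
        value = value[key]  # raises KeyError if the variable is missing
    return value


def render(value: Any, data: dict) -> Any:
    """Recursively substitute {{path}} placeholders in strings, lists and dicts."""
    if isinstance(value, str):
        return _TEMPLATE.sub(lambda m: str(lookup(data, m.group(1))), value)
    if isinstance(value, list):
        return [render(item, data) for item in value]
    if isinstance(value, dict):
        return {key: render(item, data) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged


def merge_parameters(node: dict, extracted: dict, input_data: dict) -> dict:
    """Combine sources so that node parameters override extracted ones."""
    merged = {**extracted, **node}  # the later dict wins on key clashes
    return render(merged, input_data)
```

Because node parameters are applied last, a `visibility` set on the node overrides one extracted from the input data, while extracted values that the node does not set are kept.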

Example: Submitting URLs for Scanning

Basic URL Submission

Below is an example of submitting a URL for scanning using the Integration Node:

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "submit",
  "parameters": {
    "url": "https://example.com",
    "visibility": "public",
    "tags": ["security-check", "monitoring"]
  }
}
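
For orientation, the sketch below builds (but does not send) the kind of HTTP request such a node configuration might translate to. It assumes urlscan.io's public submission endpoint (`POST /api/v1/scan/` with an `API-Key` header); how NINA actually maps node parameters onto the request is not specified in this guide, and `build_submit_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Iterable, Optional


def build_submit_request(
    api_key: str,
    url: str,
    visibility: str = "public",
    tags: Optional[Iterable[str]] = None,
    base_url: str = "https://urlscan.io",
) -> urllib.request.Request:
    """Prepare a scan submission request without sending it."""
    payload = {"url": url, "visibility": visibility}
    if tags:
        payload["tags"] = list(tags)
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/scan/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_submit_request(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    url="https://example.com",
    tags=["security-check", "monitoring"],
)
# urllib.request.urlopen(request) would perform the submission.
```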

Advanced URL Submission with Template Variables

You can use template variables to dynamically insert values from input data:

Input Data from Previous Node:

{
  "suspicious_url": "https://suspicious-domain.com/login",
  "investigation": {
    "case_id": "SEC-2024-001",
    "priority": "high",
    "analyst": "john.doe"
  },
  "scan_config": {
    "country": "US",
    "visibility": "private"
  }
}

Node Configuration with Template Variables:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "submit",
  "parameters": {
    "url": "{{suspicious_url}}",
    "visibility": "{{scan_config.visibility}}",
    "country": "{{scan_config.country}}",
    "tags": ["case-{{investigation.case_id}}", "{{investigation.priority}}-priority", "analyst-{{investigation.analyst}}"],
    "customAgent": "Security-Scanner/1.0"
  }
}

Result:

This will submit a scan with:

  • URL: "https://suspicious-domain.com/login"
  • Visibility: "private"
  • Country: "US"
  • Tags: ["case-SEC-2024-001", "high-priority", "analyst-john.doe"]
  • Custom User-Agent: "Security-Scanner/1.0"

Example: Retrieving Scan Results

Getting Scan Results with Polling

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "poll",
  "parameters": {
    "uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "timeout": 180,
    "interval": 10
  }
}
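
Conceptually, polling repeats a result lookup every `interval` seconds until a result is available or `timeout` seconds have passed. A minimal sketch, assuming a caller-supplied `fetch` function that returns `None` while the scan is still running; the function name, the injectable `sleep` and `clock` arguments, and the use of `TimeoutError` are illustrative choices, not NINA's implementation.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    fetch: Callable[[], Optional[dict]],
    timeout: float = 180,
    interval: float = 10,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() every `interval` seconds until it returns a result."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        # Give up if the next attempt would start after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"scan not finished after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as arguments keeps the loop testable without real waiting; in normal use the defaults apply.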

Getting Results with Template Variables

Input Data:

{
  "scan_uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "scan_settings": {
    "max_wait_time": 120,
    "check_interval": 5
  }
}

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "poll",
  "parameters": {
    "uuid": "{{scan_uuid}}",
    "timeout": "{{scan_settings.max_wait_time}}",
    "interval": "{{scan_settings.check_interval}}"
  }
}

Example: Downloading Screenshots and DOM

Getting Screenshots

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "getScreenshot",
  "parameters": {
    "uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  }
}

Getting DOM Snapshots

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "getDom",
  "parameters": {
    "uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  }
}
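
Both operations fetch an artifact that is addressed by the scan UUID. The sketch below shows where those artifacts are expected to live, assuming urlscan.io's public URL layout (`/screenshots/<uuid>.png` for screenshots and `/dom/<uuid>/` for DOM snapshots). The helper `artifact_urls` is hypothetical, and NINA may resolve these locations differently.

```python
def artifact_urls(scan_uuid: str, base_url: str = "https://urlscan.io") -> dict:
    """Return the assumed download locations for a scan's artifacts."""
    root = base_url.rstrip("/")
    return {
        "screenshot": f"{root}/screenshots/{scan_uuid}.png",  # PNG image
        "dom": f"{root}/dom/{scan_uuid}/",  # HTML DOM snapshot
    }


urls = artifact_urls("a1b2c3d4-e5f6-7890-abcd-ef1234567890")
```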

Batch Processing Multiple Scans

Input Data:

{
  "completed_scans": [
    {
      "uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "url": "https://example1.com"
    },
    {
      "uuid": "b2c3d4e5-f6a7-8901-bcde-f23456789012",
      "url": "https://example2.com"
    }
  ]
}

Node Configuration (for each scan):

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "getResult",
  "parameters": {
    "uuid": "{{uuid}}"
  }
}
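
To make "for each scan" concrete, the sketch below expands the `completed_scans` list into one `getResult` configuration per entry. How NINA iterates over list items is not described in this guide, so the expansion and the helper name `configs_for_batch` are assumptions made for illustration.

```python
def configs_for_batch(input_data: dict) -> list:
    """Build one getResult node configuration per completed scan."""
    return [
        {
            "integration_service": "urlscan",
            "resource": "scan",
            "operation": "getResult",
            # Equivalent to rendering "{{uuid}}" against each list item.
            "parameters": {"uuid": scan["uuid"]},
        }
        for scan in input_data["completed_scans"]
    ]
```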

Example: Searching Existing Scans

Basic Search Query

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "search",
  "operation": "query",
  "parameters": {
    "query": "domain:example.com",
    "size": 50
  }
}

Advanced Search with Complex Queries

Input Data:

{
  "threat_hunt": {
    "domain": "suspicious-domain.com",
    "date_range": "2024-01-01",
    "country": "RU"
  }
}

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "search",
  "operation": "query",
  "parameters": {
    "query": "domain:{{threat_hunt.domain}} AND date:>{{threat_hunt.date_range}} AND country:{{threat_hunt.country}}",
    "size": 100
  }
}

Search with Pagination

Node Configuration:

{
  "integration_service": "urlscan",
  "resource": "search",
  "operation": "query",
  "parameters": {
    "query": "malicious:true",
    "size": 1000,
    "searchAfter": "1234567890"
  }
}
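
Pagination works by feeding a cursor taken from the last result of one page into the next request. The sketch below assumes the response shape of urlscan.io's public search API (a `results` list whose entries carry a `sort` array, plus a `has_more` flag) and that the cursor is the comma-joined `sort` value of the last result. `fetch_page` is a hypothetical callable standing in for a search node run with the given `searchAfter` value.

```python
from typing import Callable, Iterator, Optional


def iter_search_results(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 10,
) -> Iterator[dict]:
    """Yield results page by page, following the search-after cursor."""
    cursor: Optional[str] = None  # None requests the first page
    for _ in range(max_pages):
        page = fetch_page(cursor)
        results = page.get("results", [])
        yield from results
        if not results or not page.get("has_more"):
            return
        # The next cursor is derived from the last result on this page.
        cursor = ",".join(str(part) for part in results[-1]["sort"])
```

The `max_pages` cap is a safety net so that a broad query cannot page indefinitely.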

Complete Workflow Example

Automated Threat Intelligence Pipeline

Here's a complete workflow that demonstrates multiple urlscan.io operations:

  1. Submit suspicious URLs for scanning
  2. Poll for completion
  3. Retrieve detailed results
  4. Download screenshots for evidence
  5. Search for related threats

Workflow Configuration:

# Step 1: Submit URL for scanning
- integration_service: urlscan
  resource: scan
  operation: submit
  parameters:
    url: "{{suspicious_url}}"
    visibility: "private"
    tags: ["threat-intel", "{{case_id}}"]
    country: "US"

# Step 2: Poll for results
- integration_service: urlscan
  resource: scan
  operation: poll
  parameters:
    uuid: "{{scan_uuid}}"
    timeout: 180
    interval: 15

# Step 3: Get detailed results
- integration_service: urlscan
  resource: scan
  operation: getResult
  parameters:
    uuid: "{{scan_uuid}}"

# Step 4: Download screenshot
- integration_service: urlscan
  resource: scan
  operation: getScreenshot
  parameters:
    uuid: "{{scan_uuid}}"

# Step 5: Search for related threats
- integration_service: urlscan
  resource: search
  operation: query
  parameters:
    query: "domain:{{extracted_domain}} AND malicious:true"
    size: 100

[Figure: Complete workflow showing urlscan.io integration nodes connected with other node types]

Troubleshooting

| Issue | Resolution |
| --- | --- |
| Authentication failures | Verify your API key is correct and active. Some operations work without API keys, but private scans require authentication. |
| "Invalid URL format" errors | Ensure URLs include the protocol (http:// or https://). urlscan.io requires properly formatted URLs. |
| "UUID not found" errors | Verify the scan UUID is correct and the scan exists. UUIDs must be in standard format (8-4-4-4-12 characters). |
| Scan timeout issues | Increase the polling timeout parameter. Some complex pages take longer to scan completely. |
| Rate limiting | urlscan.io has rate limits. Implement delays between submissions or use polling instead of rapid requests. |
| Private scan access denied | Ensure you have an API key and proper permissions for private scans. |
| Search query syntax errors | Check Elasticsearch query syntax. Common fields include domain, url, country, date, malicious. |
| Screenshot/DOM not available | Some scans may not generate screenshots or DOM snapshots. Check if the scan completed successfully first. |
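
The two format errors in the table can often be caught before a node runs. The following is a minimal pre-check sketch, assuming "properly formatted" means an http or https URL with a host, and a UUID in 8-4-4-4-12 hexadecimal form. The integration's built-in validation may apply different or stricter rules; `is_valid_scan_url` and `is_valid_uuid` are hypothetical helpers.

```python
import re
from urllib.parse import urlsplit

# 8-4-4-4-12 hexadecimal characters, e.g. a1b2c3d4-e5f6-7890-abcd-ef1234567890
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_scan_url(url: str) -> bool:
    """True if the URL has an http(s) scheme and a host name."""
    try:
        parts = urlsplit(url)
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 literals
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


def is_valid_uuid(value: str) -> bool:
    """True if the value matches the 8-4-4-4-12 hexadecimal layout."""
    return UUID_PATTERN.fullmatch(value) is not None
```

A value such as `example.com` fails the URL check because it lacks a protocol, which corresponds to the "Invalid URL format" row above.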

Best Practices

  1. Use Meaningful Tags: Add descriptive tags to organize and categorize your scans for easier searching and management.

  2. Leverage Template Variables: Use {{variable_name}} syntax to dynamically insert values from input data.

  3. Configure Appropriate Timeouts: Set realistic timeout values for polling based on your use case - complex pages may take longer to scan.

  4. Use Private Scans for Sensitive URLs: For internal or sensitive URLs, use private visibility to prevent public access to scan results.

  5. Implement Proper Error Handling: Handle cases where scans fail, timeout, or don't produce expected results.

  6. Search Before Scanning: Check if a URL has been recently scanned using the search operation to avoid duplicate scans.

  7. Use Geographic Scanning Strategically: Choose scan locations based on your analysis needs - different countries may provide different perspectives.

  8. Batch Process When Possible: For multiple URLs, group operations into batches rather than running individual scans one after another.

  9. Monitor Rate Limits: Be aware of urlscan.io rate limits and implement appropriate delays between operations.

  10. Secure Your API Keys: Keep your urlscan.io API keys secure using the system's built-in credential management.

  11. Validate Input URLs: Use the built-in URL validation or implement additional checks to ensure URL format correctness.

  12. Store Results Appropriately: Consider how long you need to retain scan results and screenshots for your use case.
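
For the rate-limit and error-handling practices above, one common pattern is to retry with increasing delays. The sketch below is an assumption-laden illustration: it presumes the calling code raises a `RateLimited` exception (a hypothetical class defined here) when the service signals that a limit was hit, for example with an HTTP 429 response, and optionally passes along a server-suggested wait time.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the caller's operation when a rate limit is hit."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation(), waiting 1s, 2s, 4s, ... between rate-limited attempts."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the workflow
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Preferring the server-suggested wait over the computed delay avoids retrying before the limit window has reset.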

Advanced Use Cases

Automated Security Monitoring

Create workflows that continuously monitor URLs for security threats:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "submit",
  "parameters": {
    "url": "{{monitored_url}}",
    "visibility": "private",
    "tags": ["monitoring", "{{service_name}}", "automated"],
    "customAgent": "Security-Monitor/1.0"
  }
}

Threat Intelligence Enrichment

Enhance threat intelligence with detailed URL analysis:

{
  "integration_service": "urlscan",
  "resource": "search",
  "operation": "query",
  "parameters": {
    "query": "domain:{{suspicious_domain}} AND (malicious:true OR suspicious:true)",
    "size": 50
  }
}

Incident Response Evidence Collection

Automatically collect screenshots and DOM snapshots for incident response:

{
  "integration_service": "urlscan",
  "resource": "scan",
  "operation": "submit",
  "parameters": {
    "url": "{{incident_url}}",
    "visibility": "private",
    "tags": ["incident-{{incident_id}}", "evidence", "{{analyst_name}}"],
    "overrideSafety": true
  }
}

Phishing Campaign Analysis

Analyze phishing campaigns by scanning and correlating multiple URLs:

{
  "integration_service": "urlscan",
  "resource": "search",
  "operation": "query",
  "parameters": {
    "query": "page.domain:{{campaign_domain}} AND date:>{{campaign_start_date}}",
    "size": 500
  }
}

Brand Protection Monitoring

Monitor for fraudulent or suspicious use of your brand:

{
  "integration_service": "urlscan",
  "resource": "search",
  "operation": "query",
  "parameters": {
    "query": "page.url:*{{brand_name}}* AND NOT domain:{{legitimate_domains}}",
    "size": 100
  }
}

Elasticsearch Query Examples

urlscan.io search queries use Elasticsearch query string syntax. Here are some useful query patterns:

Domain-based Searches

domain:example.com
domain:*.example.com

URL Pattern Searches

page.url:*login*
page.url:https://example.com/path*

Security Status Searches

malicious:true
suspicious:true
stats.malicious:>0

Date Range Searches

date:>2024-01-01
date:[2024-01-01 TO 2024-01-31]

Country-based Searches

country:US
country:(US OR CA OR GB)

Combined Complex Searches

domain:*.suspicious.com AND malicious:true AND date:>2024-01-01
page.url:*login* AND country:RU AND stats.malicious:>0
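
When queries like these are assembled from workflow data, small helpers can keep the `AND`/`OR` structure consistent. This is a sketch only: `any_of` and `all_of` are hypothetical names, and the helpers do not escape special characters in values, so inputs are assumed to be trusted.

```python
from typing import Iterable


def any_of(field: str, values: Iterable[str]) -> str:
    """Build 'field:value' or 'field:(a OR b OR c)' for several values."""
    items = list(values)
    if not items:
        raise ValueError("at least one value is required")
    if len(items) == 1:
        return f"{field}:{items[0]}"
    return f"{field}:({' OR '.join(items)})"


def all_of(*clauses: str) -> str:
    """Join non-empty clauses with AND."""
    return " AND ".join(clause for clause in clauses if clause)


query = all_of(
    "page.url:*login*",
    any_of("country", ["US", "CA", "GB"]),
    "stats.malicious:>0",
)
```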

For more advanced query syntax and capabilities, refer to the urlscan.io API documentation and the Elasticsearch query string documentation.