
Data Transformation Agent Node Guide

Overview

The Data Transformation Agent Node is a specialized AI-powered component that automatically converts raw outputs into integration-ready formats. Unlike regular Script Nodes or Scripting Agent Nodes, Data Transformation Agent Nodes are specifically designed to bridge the gap between your workflow data and the formats expected by external integrations such as Jira, Slack, and other services, by intelligently mapping input data fields to target integration schemas.

Use Cases

  • Integration Data Preparation: Transform vulnerability scan results into Jira ticket format
  • Smart Field Mapping: Automatically map security findings to Slack message parameters
  • Schema Compliance: Ensure data meets exact integration requirements
  • Multi-Tool Aggregation: Combine outputs from multiple security tools for integration consumption
  • Automated Reporting: Convert technical data into business-friendly integration formats
  • API Data Formatting: Prepare data for integration consumption with proper field mapping

How Data Transformation Agent Nodes Work

The Data Transformation Agent Node uses an advanced multi-stage agentic AI workflow:

  1. JSON Pattern Extraction: Programmatically analyzes input data structure and formats
  2. Input Analysis: AI semantic understanding of what the data represents
  3. Target Schema Resolution: Retrieves integration schema parameters from the target node
  4. Data Correlation: AI-powered intelligent field mapping between source and target
  5. Script Generation: Creates Python transformation scripts with exact field mappings
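
To make the output of these stages concrete, the snippet below sketches the kind of field mapping a correlation stage could produce. The structure and field names are illustrative assumptions, not the node's actual internal format.

# Hypothetical sketch of a correlation-stage result: each source field from the
# upstream node is paired with a target integration parameter and an optional
# value transformation. Field names are examples only.
field_mapping = {
    "vulnerability.name":     {"target": "summary",      "transform": None},
    "vulnerability.severity": {"target": "priority",     "transform": "severity_to_priority"},
    "scan.timestamp":         {"target": "created_date", "transform": "iso8601"},
}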

Key Differences from Scripting Agent Nodes

Feature | Scripting Agent | Data Transformation Agent
Purpose | General data processing | Integration-specific transformation
Target Awareness | No target context | Analyzes downstream integration node
Field Mapping | Manual prompt-based | Automatic schema-driven mapping
Schema Validation | None | Validates against target integration node
Output Format | Flexible | Integration-compliant result JSON

Creating a Data Transformation Agent Node

Basic Setup

CRITICAL REQUIREMENTS:

  • Input Connection: Must be connected to at least one upstream node (Input, Operation, Script, Integration, or any other node)
  • Output Connection: Must be connected to a downstream Integration Node with resource and operation PRE-selected
  1. Drag a Data Transformation Agent Node from the node palette onto your workflow canvas
  2. Connect it to upstream nodes (data input) - REQUIRED
  3. Connect it to a downstream Integration Node - REQUIRED
  4. Configure the Integration Node:
    • Select Integration Service (Jira, Slack, etc.)
    • Choose Resource Type (issue, channel, etc.)
    • Select Operation (create issue, post message, etc.)
  5. Write a transformation prompt describing your requirements
  6. The AI will automatically analyze the input data and match it to the target integration's expected input

[SCREENSHOT: Data Transformation Agent Node connected between data source and integration]

Configuration Options

Node Properties

Property | Description
Name | A descriptive name for the node
Prompt | Natural language description of the transformation requirements
Always Regenerate Script | If true, regenerates the transformation script on every execution

Advanced Properties

Property | Description
Generated Script | The Python transformation script created by the AI (can be edited and saved)
AI Summary | Explanation of the transformation logic (cannot be modified)
Unmapped Input Parameters | Source fields that couldn't be mapped to the target
Unmapped Target Required Parameters | Required target fields that could not be mapped and must be provided for the integration to function

Target Integration Requirements

Critical: Data Transformation Agent Nodes must be connected to a downstream Integration Node that is at least partially configured (service, resource, and operation selected).

Required Integration Node Configuration

The downstream Integration Node must have ALL of these configured:

  1. Integration Service selected (e.g., Jira, Slack, OpenCTI, Outlook)
  2. Resource Type specified (e.g., issue, channel, message)
  3. Operation chosen (e.g., create_issue, post_message, create_object)
  4. Credentials configured for the service

Without this partial configuration, the Data Transformation Agent cannot:

  • Retrieve the target schema
  • Perform field mapping validation
  • Generate appropriate transformation scripts

How Schema Resolution Works

Once properly connected, the system will:

  1. Automatically detect the target integration service, resource, and operation
  2. Retrieve the accurate schema for the specified integration parameters
  3. Validate field mappings against actual integration API requirements
  4. Generate compliant output that the integration can consume directly
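
For illustration, a resolved schema for a Jira create-issue operation might look roughly like the following. The structure, parameter names, and required flags are assumptions for the sketch; the real schema is retrieved from the integration itself.

# Hypothetical example of a resolved target schema (Jira "create issue").
# Parameter names, types, and "required" flags are illustrative assumptions.
target_schema = {
    "service": "jira",
    "resource": "issue",
    "operation": "create_issue",
    "parameters": {
        "summary":     {"type": "string", "required": True},
        "description": {"type": "string", "required": False},
        "priority":    {"type": "string", "required": False},
    },
}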

Supported Target Integrations

Transformation is currently supported for the following downstream Integration Node services:

  • Integration Services: Jira, Slack, OpenCTI, Outlook, CrowdStrike, etc.

Writing Effective Transformation Prompts

Prompt Best Practices

  1. Describe the Business Logic: Explain how security data should be interpreted
  2. Specify Priorities: Mention which fields are most important
  3. Handle Edge Cases: Describe how to deal with missing or malformed data

Example Prompts

Vulnerability to Jira Ticket

Transform vulnerability scan results into Jira tickets. Use the vulnerability name as the summary, 
combine description and remediation as the ticket description. Map severity levels:
Critical/High → High priority, Medium → Medium priority, Low → Low priority.

Security Findings to Slack Alert

Create Slack security alerts from the scan findings. Include the affected host, 
vulnerability count by severity, and a summary of the most critical issues.
Format as an urgent security notification.

How Field Mapping Works

The AI workflow performs intelligent field mapping through several steps:

1. Schema Analysis

  • Retrieves the supported schema from the target integration
  • Identifies required vs optional fields
  • Understands field types and constraints

2. Semantic Mapping

The AI maps fields based on meaning, not just names:

  • vulnerability.severity → priority (with value transformation)
  • finding.description → description (direct mapping)
  • scan.timestamp → created_date (format transformation)

3. Intelligent Transformations

  • Array to String: ["port1", "port2"] → "port1, port2"
  • Severity Mapping: "critical" → "1" (for priority levels)
  • Data Aggregation: Multiple findings → Summary statistics
  • Format Conversion: Timestamps, URLs, etc.
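
A minimal sketch of what such transformations can look like in generated Python is shown below; the helper names and the exact severity-to-priority values are assumptions, not the node's fixed behavior.

# Illustrative transformation helpers. Names and the severity-to-priority
# values are assumptions, not the node's fixed behavior.
def array_to_string(values):
    # ["port1", "port2"] -> "port1, port2"
    return ", ".join(str(v) for v in values)

SEVERITY_TO_PRIORITY = {"critical": "1", "high": "2", "medium": "3", "low": "4"}

def map_severity_to_priority(severity, default="3"):
    # "critical" -> "1", with a fallback for missing or unknown values
    return SEVERITY_TO_PRIORITY.get(str(severity or "").lower(), default)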

4. Validation and Feedback

  • Unmapped Fields: Input data that couldn't be mapped
  • Missing Required: Target fields with no source data
  • Confidence Scores: AI confidence in each mapping

Generated Script Structure

All transformation scripts follow this pattern:

#!/usr/bin/env python3
import argparse
import json

def main():
    parser = argparse.ArgumentParser(description='Data transformation script')
    parser.add_argument('-i', help='Comma-separated input file paths')
    parser.add_argument('-o', required=True, help='Output file path')
    args = parser.parse_args()

    # Load and parse input data
    input_data = load_input_files(args.i)

    # Apply field mappings with transformations
    transformed_data = {
        "summary": transform_summary(input_data),
        "priority": map_severity_to_priority(input_data.get("severity")),
        "description": combine_description_fields(input_data),
        # ... other mapped fields
    }

    # Write single JSON object output
    with open(args.o, 'w') as f:
        json.dump(transformed_data, f, indent=2)

if __name__ == "__main__":
    main()
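
Based on the argument parser above, a generated script could be invoked by hand roughly as follows; the script and file names are placeholders.

python transform.py -i scan_results.json,asset_inventory.json -o jira_payload.json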

Script Regeneration Options

Always Regenerate Script (Default: False)

  • True: Regenerates transformation script on every execution with fresh schema analysis
  • False: Reuses existing transformation script for consistent behavior

When to Use Each Option

Always Regenerate = True:

  • Target integration schema might change
  • Input data structure varies between executions
  • You're refining transformation logic
  • Development and testing phase

Always Regenerate = False:

  • Stable integration schemas and data formats
  • Production workflows requiring consistent behavior
  • Performance-critical workflows
  • Validated transformation logic

Integration with Other Nodes

Upstream Node Compatibility

Data Transformation Agent Nodes can process output from:

  • Operation Nodes: Vulnerability scanners, security tools
  • Input Nodes: Uploaded security reports
  • Integration Nodes: Data from other services
  • Script/Agent Nodes: Processed security data

Downstream Node Requirements

Currently, the following nodes are supported as downstream targets:

  • Integration Nodes: The target integration that will consume the transformed data

Performance Considerations

AI Processing Time

  • Initial transformation generation: 60-120 seconds
  • Subsequent executions (with reuse): No additional AI processing time

Best Practices

Workflow Design

  • Always connect to Integration Node: Required for schema analysis
  • Test with real data: Validate transformations with actual security tool outputs
  • Monitor unmapped fields: Review what data isn't being used
  • Handle missing required fields: Plan for incomplete input data

Prompt Writing

  • Focus on business logic: Describe the transformation intent, not implementation
  • Specify aggregation: How to handle multiple findings or data points
  • Consider data quality: How to handle missing or malformed input
  • Integration context: Reference the target integration's purpose

Troubleshooting

Issue | Resolution
No target integration detected | Ensure a downstream Integration Node is connected
Schema resolution failed | Verify the Integration Node configuration is complete
Field mapping errors | Review unmapped fields and adjust the prompt
Transformation script errors | Check input data format and integration requirements
Missing required fields | Provide defaults or modify input data collection

Common Error Patterns

"No target node found for data transformation":

  • Connect a downstream Integration Node
  • Ensure Integration Node is properly configured

"Target integration node must have a resource specified":

  • Partial Integration Node Configuration Required:
    • Integration Service must be selected
    • Resource Type must be specified
    • Operation must be chosen
  • Verify ALL configuration elements are properly set

"Field mapping validation failed":

  • Review the unmapped fields in the AI Summary
  • Adjust prompt to handle missing data scenarios
  • Check if input data structure matches expectations

Example Configurations

Example 1: Nmap to Jira Security Issue

Input: Nmap port scan results
Target: Jira issue creation
Prompt:

Create Jira security tickets for open ports found in the network scan. 
Use the host IP as the summary, list all open ports in the description,
and set priority based on the number of open ports: >10 ports = High,
5-10 = Medium, <5 = Low priority.

Generated Mapping:

  • host → summary
  • open_ports[] → description (array_to_string)
  • port_count → priority (with custom logic)
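
A minimal sketch of the custom priority logic the prompt describes (the thresholds follow the prompt; the function name and structure are assumptions):

def map_port_count_to_priority(port_count):
    # >10 open ports -> High, 5-10 -> Medium, <5 -> Low (as stated in the prompt)
    if port_count > 10:
        return "High"
    if port_count >= 5:
        return "Medium"
    return "Low"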

Example 2: Vulnerability Scan to Slack Alert

Input: Vulnerability assessment report
Target: Slack channel message
Prompt:

Send security alerts to Slack for critical and high vulnerabilities. 
Include affected systems count, top 3 vulnerability types, and
overall risk assessment. Format as an urgent security notification.

Generated Features:

  • Aggregates multiple vulnerabilities into summary statistics
  • Filters by severity level
  • Creates business-friendly messaging
  • Includes actionable information
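
As a rough sketch, the aggregation described above could be expressed along these lines; field names such as "severity", "type", and "host" are assumptions about the scanner output.

from collections import Counter

def summarize_findings(findings):
    # Keep only critical and high findings, then build summary statistics.
    relevant = [f for f in findings
                if str(f.get("severity", "")).lower() in ("critical", "high")]
    by_type = Counter(f.get("type", "unknown") for f in relevant)
    return {
        "affected_systems": len({f.get("host") for f in relevant}),
        "top_vulnerability_types": [t for t, _ in by_type.most_common(3)],
        "critical_high_count": len(relevant),
    }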

Example 3: Multi-Source Security Dashboard

Input: Combined subdomain enumeration + vulnerability scan
Target: Custom integration API
Prompt:

Create a security posture report combining discovery and vulnerability data. 
Show total assets discovered, vulnerability distribution by severity,
and risk score calculation. Include remediation priorities.

AI Capabilities Demonstrated:

  • Cross-correlates data from multiple sources
  • Performs statistical analysis and aggregation
  • Creates executive-level summary information
  • Maintains data lineage and confidence scores