Data Transformation Agent Node Guide
Overview
The Data Transformation Agent Node is a specialized AI-powered component that automatically converts raw outputs into integration-ready formats. Unlike regular Script Nodes or Scripting Agent Nodes, Data Transformation Agent Nodes are designed specifically to bridge the gap between your workflow data and the formats expected by external integrations such as Jira, Slack, and other services, intelligently mapping input data fields to target integration schemas.
Use Cases
- Integration Data Preparation: Transform vulnerability scan results into Jira ticket format
- Smart Field Mapping: Automatically map security findings to Slack message parameters
- Schema Compliance: Ensure data meets exact integration requirements
- Multi-Tool Aggregation: Combine outputs from multiple security tools for integration consumption
- Automated Reporting: Convert technical data into business-friendly integration formats
- API Data Formatting: Prepare data for integration consumption with proper field mapping
How Data Transformation Agent Nodes Work
The Data Transformation Agent Node uses an advanced multi-stage agentic AI workflow:
- JSON Pattern Extraction: Programmatically analyzes input data structure and formats
- Input Analysis: AI semantic understanding of what the data represents
- Target Schema Resolution: Retrieves integration schema parameters from the target node
- Data Correlation: AI-powered intelligent field mapping between source and target
- Script Generation: Creates Python transformation scripts with exact field mappings
Key Differences from Scripting Agent Nodes
| Feature | Scripting Agent | Data Transformation Agent |
|---|---|---|
| Purpose | General data processing | Integration-specific transformation |
| Target Awareness | No target context | Analyzes downstream integration node |
| Field Mapping | Manual prompt-based | Automatic schema-driven mapping |
| Schema Validation | None | Validates against target integration node |
| Output Format | Flexible | Integration-compliant result JSON |
Creating a Data Transformation Agent Node
Basic Setup
CRITICAL REQUIREMENTS:
- Input Connection: Must be connected to at least one upstream node (Input, Operation, Script, Integration or any other node)
- Output Connection: Must be connected to a downstream Integration Node with resource and operation PRE-selected
- Drag a Data Transformation Agent Node from the node palette onto your workflow canvas
- Connect it to upstream nodes (data input) - REQUIRED
- Connect it to a downstream Integration Node - REQUIRED
- Configure the Integration Node:
- Select Integration Service (Jira, Slack, etc.)
- Choose Resource Type (issue, channel, etc.)
- Select Operation (create issue, post message, etc.)
- Write a transformation prompt describing your requirements
- The AI will automatically analyze the input data and match it to the target integration's expected input
[SCREENSHOT: Data Transformation Agent Node connected between data source and integration]
Configuration Options
Node Properties
| Property | Description |
|---|---|
| Name | A descriptive name for the node |
| Prompt | Natural language description of the transformation requirements |
| Always Regenerate Script | If true, regenerates transformation on every execution |
Advanced Properties
| Property | Description |
|---|---|
| Generated Script | The Python transformation script created by the AI (can be edited and saved) |
| AI Summary | Explanation of the transformation logic (cannot be modified) |
| Unmapped Input Parameters | Source fields that couldn't be mapped to the target |
| Unmapped Target Required Parameters | Required target fields that could not be mapped from the source data; the integration cannot function without them |
Target Integration Requirements
Critical: Data Transformation Agent Nodes must be connected to a downstream Integration Node that is partially configured.
Required Integration Node Configuration
The downstream Integration Node must have ALL of these configured:
- Integration Service selected (e.g., Jira, Slack, OpenCTI, Outlook)
- Resource Type specified (e.g., issue, channel, message)
- Operation chosen (e.g., create_issue, post_message, create_object)
- Credentials configured for the service
Without this partial configuration, the Data Transformation Agent cannot:
- Retrieve the target schema
- Perform field mapping validation
- Generate appropriate transformation scripts
How Schema Resolution Works
Once properly connected, the system will:
- Automatically detect the target integration service, resource, and operation
- Retrieve the accurate schema for the specified integration parameters
- Validate field mappings against actual integration API requirements
- Generate compliant output that the integration can consume directly
Supported Target Integrations
Transformation is currently supported for downstream Integration Nodes using the following services:
- Integration Services: Jira, Slack, OpenCTI, Outlook, CrowdStrike, etc.
Writing Effective Transformation Prompts
Prompt Best Practices
- Describe the Business Logic: Explain how security data should be interpreted
- Specify Priorities: Mention which fields are most important
- Handle Edge Cases: Describe how to deal with missing or malformed data
Example Prompts
Vulnerability to Jira Ticket
Transform vulnerability scan results into Jira tickets. Use the vulnerability name as the summary,
combine description and remediation as the ticket description. Map severity levels:
Critical/High → High priority, Medium → Medium priority, Low → Low priority.
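For illustration, the severity mapping described in this prompt might end up in the generated script as something like the helper below (the function name and the default value are assumptions, not the agent's actual output):

```python
def map_severity_to_priority(severity: str) -> str:
    """Map a vulnerability severity to a Jira priority name, per the prompt."""
    mapping = {
        "critical": "High",
        "high": "High",
        "medium": "Medium",
        "low": "Low",
    }
    # Assumption: unknown or missing severities default to "Low"
    return mapping.get((severity or "").strip().lower(), "Low")
```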
Security Findings to Slack Alert
Create Slack security alerts from the scan findings. Include the affected host,
vulnerability count by severity, and a summary of the most critical issues.
Format as an urgent security notification.
How Field Mapping Works
The AI workflow performs intelligent field mapping through several steps:
1. Schema Analysis
- Retrieves the supported schema from the target integration
- Identifies required vs optional fields
- Understands field types and constraints
2. Semantic Mapping
The AI maps fields based on meaning, not just names:
- `vulnerability.severity` → `priority` (with value transformation)
- `finding.description` → `description` (direct mapping)
- `scan.timestamp` → `created_date` (format transformation)
3. Intelligent Transformations
- Array to String: `["port1", "port2"]` → `"port1, port2"`
- Severity Mapping: `"critical"` → `"1"` (for priority levels)
- Data Aggregation: Multiple findings → Summary statistics
- Format Conversion: Timestamps, URLs, etc.
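For illustration, helpers along these lines could implement the transformations listed above (the names and exact behavior are assumptions; the generated script will vary per workflow):

```python
from datetime import datetime, timezone

def array_to_string(values):
    """Join a list such as ["port1", "port2"] into "port1, port2"."""
    return ", ".join(str(v) for v in values or [])

def epoch_to_iso(timestamp):
    """Convert an epoch timestamp to an ISO-8601 string (format conversion)."""
    return datetime.fromtimestamp(float(timestamp), tz=timezone.utc).isoformat()

def aggregate_findings(findings):
    """Reduce multiple findings to summary counts by severity (data aggregation)."""
    counts = {}
    for finding in findings or []:
        sev = (finding.get("severity") or "unknown").lower()
        counts[sev] = counts.get(sev, 0) + 1
    return counts
```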
4. Validation and Feedback
- Unmapped Fields: Input data that couldn't be mapped
- Missing Required: Target fields with no source data
- Confidence Scores: AI confidence in each mapping
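As a simplified illustration of this feedback, the check below reports unmapped source fields and required target fields left without a mapping. The data structures are assumptions used for explanation only, not the node's actual output format:

```python
def validate_mapping(source_fields, field_mapping, target_schema):
    """Report source fields with no mapping and required target fields left unmapped."""
    mapped_sources = set(field_mapping.keys())
    mapped_targets = set(field_mapping.values())
    required = {name for name, spec in target_schema.items() if spec.get("required")}
    return {
        "unmapped_input_parameters": sorted(set(source_fields) - mapped_sources),
        "unmapped_target_required_parameters": sorted(required - mapped_targets),
    }

# Example: 'priority' is required by the target schema but nothing maps to it
feedback = validate_mapping(
    source_fields=["host", "severity", "banner"],
    field_mapping={"host": "summary", "severity": "description"},
    target_schema={"summary": {"required": True}, "priority": {"required": True}},
)
# feedback == {"unmapped_input_parameters": ["banner"],
#              "unmapped_target_required_parameters": ["priority"]}
```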
Generated Script Structure
All transformation scripts follow this pattern:
```python
#!/usr/bin/env python3
import argparse
import json


def main():
    parser = argparse.ArgumentParser(description='Data transformation script')
    parser.add_argument('-i', help='Comma-separated input file paths')
    parser.add_argument('-o', required=True, help='Output file path')
    args = parser.parse_args()

    # Load and parse input data
    input_data = load_input_files(args.i)

    # Apply field mappings with transformations
    # (helper functions such as load_input_files and map_severity_to_priority
    # are generated per transformation to implement the mapped fields)
    transformed_data = {
        "summary": transform_summary(input_data),
        "priority": map_severity_to_priority(input_data.get("severity")),
        "description": combine_description_fields(input_data),
        # ... other mapped fields
    }

    # Write single JSON object output
    with open(args.o, 'w') as f:
        json.dump(transformed_data, f, indent=2)


if __name__ == "__main__":
    main()
```
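The platform executes the generated script itself; conceptually, though, an invocation looks like this (the script name and file names are placeholders):

```
python3 transform.py -i nmap_results.json,vuln_scan.json -o transformed.json
```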
Script Regeneration Options
Always Regenerate Script (Default: False)
- True: Regenerates transformation script on every execution with fresh schema analysis
- False: Reuses existing transformation script for consistent behavior
When to Use Each Option
Always Regenerate = True:
- Target integration schema might change
- Input data structure varies between executions
- You're refining transformation logic
- Development and testing phase
Always Regenerate = False:
- Stable integration schemas and data formats
- Production workflows requiring consistent behavior
- Performance-critical workflows
- Validated transformation logic
Integration with Other Nodes
Upstream Node Compatibility
Data Transformation Agent Nodes can process output from:
- Operation Nodes: Vulnerability scanners, security tools
- Input Nodes: Uploaded security reports
- Integration Nodes: Data from other services
- Script/Agent Nodes: Processed security data
Downstream Node Requirements
Currently the following nodes are supported:
- Integration Nodes: The target integration that will consume the transformed data
Performance Considerations
AI Processing Time
- Initial transformation generation: 60-120 seconds
- Subsequent executions (with reuse): No additional AI processing time
Best Practices
Workflow Design
- Always connect to Integration Node: Required for schema analysis
- Test with real data: Validate transformations with actual security tool outputs
- Monitor unmapped fields: Review what data isn't being used
- Handle missing required fields: Plan for incomplete input data
Prompt Writing
- Focus on business logic: Describe the transformation intent, not implementation
- Specify aggregation: How to handle multiple findings or data points
- Consider data quality: How to handle missing or malformed input
- Integration context: Reference the target integration's purpose
Troubleshooting
| Issue | Resolution |
|---|---|
| No target integration detected | Ensure downstream Integration Node is connected |
| Schema resolution failed | Verify Integration Node configuration is complete |
| Field mapping errors | Review unmapped fields and adjust prompt |
| Transformation script errors | Check input data format and integration requirements |
| Missing required fields | Provide defaults or modify input data collection |
Common Error Patterns
"No target node found for data transformation":
- Connect a downstream Integration Node
- Ensure Integration Node is properly configured
"Target integration node must have a resource specified":
- Partial Integration Node Configuration Required:
- Integration Service must be selected
- Resource Type must be specified
- Operation must be chosen
- Verify ALL configuration elements are properly set
"Field mapping validation failed":
- Review the unmapped fields in the AI Summary
- Adjust prompt to handle missing data scenarios
- Check if input data structure matches expectations
Example Configurations
Example 1: Nmap to Jira Security Issue
Input: Nmap port scan results
Target: Jira issue creation
Prompt:
Create Jira security tickets for open ports found in the network scan.
Use the host IP as the summary, list all open ports in the description,
and set priority based on the number of open ports: >10 ports = High,
5-10 = Medium, <5 = Low priority.
Generated Mapping:
- `host` → `summary`
- `open_ports[]` → `description` (array_to_string)
- `port_count` → `priority` (with custom logic)
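For illustration, the custom priority logic described in the prompt might appear in the generated script as something like this (the function name is an assumption):

```python
def port_count_to_priority(port_count: int) -> str:
    """Derive ticket priority from the number of open ports, per the prompt."""
    if port_count > 10:
        return "High"
    if port_count >= 5:
        return "Medium"
    return "Low"
```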
Example 2: Vulnerability Scan to Slack Alert
Input: Vulnerability assessment report
Target: Slack channel message
Prompt:
Send security alerts to Slack for critical and high vulnerabilities.
Include affected systems count, top 3 vulnerability types, and
overall risk assessment. Format as an urgent security notification.
Generated Features:
- Aggregates multiple vulnerabilities into summary statistics
- Filters by severity level
- Creates business-friendly messaging
- Includes actionable information
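A rough sketch of the filtering and aggregation described above (input field names such as severity, host, and name are assumptions about the scan output, not a real schema):

```python
def summarize_for_slack(vulnerabilities):
    """Filter to critical/high findings and build a Slack-ready alert payload."""
    urgent = [v for v in vulnerabilities
              if (v.get("severity") or "").lower() in ("critical", "high")]
    hosts = {v.get("host") for v in urgent if v.get("host")}
    type_counts = {}
    for v in urgent:
        name = v.get("name", "unknown")
        type_counts[name] = type_counts.get(name, 0) + 1
    top3 = sorted(type_counts, key=type_counts.get, reverse=True)[:3]
    return {
        "text": (f":rotating_light: {len(urgent)} critical/high findings "
                 f"across {len(hosts)} systems. Top issues: {', '.join(top3)}")
    }
```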
Example 3: Multi-Source Security Dashboard
Input: Combined subdomain enumeration + vulnerability scan
Target: Custom integration API
Prompt:
Create a security posture report combining discovery and vulnerability data.
Show total assets discovered, vulnerability distribution by severity,
and risk score calculation. Include remediation priorities.
AI Capabilities Demonstrated:
- Cross-correlates data from multiple sources
- Performs statistical analysis and aggregation
- Creates executive-level summary information
- Maintains data lineage and confidence scores
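To make the aggregation in this example concrete, here is a minimal sketch of combining discovery and vulnerability data into a posture summary; the severity weights and field names are illustrative assumptions, not the agent's actual scoring method:

```python
def build_posture_report(subdomains, vulnerabilities):
    """Combine discovery and vulnerability data into a simple posture summary."""
    severity_weights = {"critical": 10, "high": 5, "medium": 2, "low": 1}  # assumed weights
    distribution = {}
    risk_score = 0
    for v in vulnerabilities:
        sev = (v.get("severity") or "unknown").lower()
        distribution[sev] = distribution.get(sev, 0) + 1
        risk_score += severity_weights.get(sev, 0)
    return {
        "total_assets": len(subdomains),
        "vulnerability_distribution": distribution,
        "risk_score": risk_score,
        "remediation_priorities": sorted(
            distribution, key=lambda s: severity_weights.get(s, 0), reverse=True
        ),
    }
```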