Scripting Agent Node Guide
Overview
The Scripting Agent Node is an advanced AI-powered component that automatically generates Python scripts from your natural language prompts. Unlike regular Script Nodes, where you write code manually, Scripting Agent Nodes leverage Michelangelo's multi-agent AI system to analyze your input data, understand your requirements, and generate optimized Python scripts tailored to your specific workflow needs.
Use Cases
- Intelligent Data Processing: Automatically generate scripts that adapt to your input data structure
- Dynamic Analysis: Create custom analysis scripts based on upstream node outputs
- Smart Transformations: Generate data transformation logic without manual coding
- Adaptive Workflows: Scripts that adjust based on the actual data they receive
- Rapid Prototyping: Quickly generate complex data processing logic from simple descriptions
- No-Code Automation: Enable non-programmers to create sophisticated data processing workflows
How Scripting Agent Nodes Work
The Scripting Agent Node uses a sophisticated AI workflow that operates in multiple stages:
- Input Analysis: The AI analyzes data from upstream nodes, detecting formats, structures, and patterns
- JSON Pattern Extraction: Automatically identifies JSON structures and recommends parsing strategies
- Context Understanding: Builds comprehensive understanding of your data and requirements
- Script Generation: Creates optimized Python scripts using advanced prompt engineering
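The multi-agent pipeline itself is internal to the platform, but the first stage is conceptually similar to this minimal format-detection sketch (`detect_format` is an illustrative name for this guide, not a platform API):

```python
import csv
import json

def detect_format(raw: str) -> str:
    """Illustrative format detection in the spirit of the input-analysis stage."""
    stripped = raw.strip()
    # JSON: the whole payload parses as a single JSON document
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    # CSV: a consistent delimiter detected across the first few lines
    try:
        csv.Sniffer().sniff("\n".join(stripped.splitlines()[:5]))
        return "csv"
    except csv.Error:
        pass
    return "text"

print(detect_format('{"hosts": ["a.example.com"]}'))  # -> json
```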
Creating a Scripting Agent Node
Basic Setup
- Drag a Scripting Agent Node from the node palette onto your workflow canvas
- Connect it to one or more input nodes (the AI will analyze their outputs)
- Write a natural language prompt describing what you want the script to do
[SCREENSHOT: Scripting Agent Node being added to a workflow]
Configuration Options
Node Properties
| Property | Description |
|---|---|
| Name | A descriptive name for the node |
| Prompt | Natural language description of what you want the script to do |
| Timeout | Maximum execution time in seconds, covering both AI processing and script execution (default: 600) |
| Always Regenerate Script | If true, regenerates script on every execution; if false, reuses existing script |
Advanced Properties
| Property | Description |
|---|---|
| Generated Script | The Python script created by the AI (auto-populated, can be edited) |
| AI Summary | Explanation of what the generated script achieves (read-only, auto-populated) |
Writing Effective Prompts
The quality of your prompt directly impacts the generated script. Follow these guidelines:
Prompt Best Practices
- Be Specific: Clearly describe what you want the script to accomplish
- Mention Data Types: Reference the expected input format if known; for example, if the input is JSON, name the keys of interest
- Include Processing Logic: Explain any filtering, transformation, or analysis requirements
- Consider Edge Cases: Mention how to handle missing data or errors
Example Prompts
Basic Data Processing
Extract all unique domains from the input data, remove any domains containing 'test' or 'dev',
and output them as a simple list, one domain per line.
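For a prompt like this, the generated script typically looks something along these lines (an illustrative sketch using the `-i`/`-o` convention described under "How Script Generation Works", not the agent's literal output):

```python
#!/usr/bin/env python3
import argparse

def main():
    parser = argparse.ArgumentParser(description="Extract and filter unique domains")
    parser.add_argument("-i", help="Comma-separated input file paths")
    parser.add_argument("-o", required=True, help="Output file path")
    args = parser.parse_args()

    domains = set()
    for path in (args.i.split(",") if args.i else []):
        with open(path) as f:
            for line in f:
                domain = line.strip().lower()
                # Skip blank lines and any domain containing 'test' or 'dev'
                if domain and "test" not in domain and "dev" not in domain:
                    domains.add(domain)

    # One domain per line, sorted for stable output
    with open(args.o, "w") as f:
        f.write("\n".join(sorted(domains)))

if __name__ == "__main__":
    main()
```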
Advanced Analysis
Analyze the vulnerability scan results to identify critical and high severity findings.
Group them by severity level, count occurrences of each vulnerability type,
and generate a summary report in JSON format with statistics and top 10 most common issues.
Data Transformation
Parse the JSON output from the security tool, extract IP addresses and their associated
ports, and create a JSON file with the keys: ip_address, port, service, status.
Include only entries where status is 'open'.
Multi-Input Processing
Combine the subdomain enumeration results with the port scan data. For each subdomain,
add the corresponding open ports if available, and output as JSON with the structure:
{"domain": "example.com", "ports": [80, 443, 8080]}.
How Script Generation Works
When a Scripting Agent Node executes:
- Data Collection: The AI analyzes output from all connected upstream nodes
- Pattern Recognition: Automatically detects data formats (JSON, CSV, plain text, etc.)
- Context Building: Creates a comprehensive understanding of data structure and relationships
- Script Creation: Generates Python code that includes:
  - Proper argument parsing adhering to NINA logic (`-i` for input files, `-o` for output file)
  - Error handling and validation
  - Optimized data processing logic
  - Appropriate library usage
Generated Script Structure
All generated scripts follow this pattern:
```python
#!/usr/bin/env python3
import argparse
import json
# Other imports as needed

def main():
    parser = argparse.ArgumentParser(description='Generated script description')
    parser.add_argument('-i', help='Comma-separated input file paths')
    parser.add_argument('-o', required=True, help='Output file path')
    args = parser.parse_args()

    # Handle input files
    if args.i:
        input_files = {path.split('/')[-1]: path for path in args.i.split(',')}
    else:
        input_files = {}

    # Processing logic generated by AI
    results = process_data(input_files)

    # Write results to output file
    with open(args.o, 'w') as f:
        # Output logic generated by AI
        write_results(f, results)

if __name__ == "__main__":
    main()
```
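Because the generated script is a plain Python program, you can extract it for local testing and invoke it directly, e.g. `python script.py -i input1.json,input2.txt -o output.json` (file names illustrative); during workflow execution the node supplies the `-i` and `-o` arguments itself.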
Script Regeneration Options
Always Regenerate Script (Default: False)
- True: Script is regenerated on every workflow execution using fresh analysis of input data
- False: Script is generated once and reused unless manually cleared
When to Use Each Option
Always Regenerate = True:
- Input data structure changes frequently
- You want the script to adapt to different data formats
- You're still refining your prompt and want to see changes
- Data processing requirements vary between executions
Always Regenerate = False:
- Input data has consistent structure
- You want faster execution (no AI processing time)
- The generated script works well and you want to preserve it
- You want predictable behavior
Available Libraries
Python Libraries Available
- Data processing: `pandas`, `numpy`, `json`, `csv`, `python-pptx`
- Network tools: `requests`, `httpx`, `urllib3`, `dnspython`, `netaddr`
- Security tools: `cryptography`, `paramiko`, `scapy`
- Text processing: `bs4` (BeautifulSoup), `lxml`
- File operations: standard library modules
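As a quick orientation, a generated script might combine these libraries like so (a hedged sketch; whether a given script actually uses `pandas`, `bs4`, or anything else depends entirely on your prompt and input data):

```python
import json

import pandas as pd
from bs4 import BeautifulSoup

# pandas for tabular work on scan results
df = pd.DataFrame({"host": ["a.example.com", "b.example.com"], "port": [443, 80]})
https_hosts = df[df["port"] == 443]["host"].tolist()

# bs4 for HTML fragments embedded in tool output
soup = BeautifulSoup("<html><title>Scan report</title></html>", "html.parser")

print(json.dumps({"https_hosts": https_hosts, "title": soup.title.text}))
```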
Integration with Other Nodes
Upstream Node Compatibility
Scripting Agent Nodes can process output from:
- Input Nodes: Files uploaded by users
- Operation Nodes: Security tool outputs (subfinder, nmap, etc.)
- Integration Nodes: Data from external services
- Other Script/Agent Nodes: Processed data from previous steps
Downstream Node Usage
Generated outputs can be consumed by:
- Integration Nodes: Send processed data to external services
- Report Nodes: Generate reports from processed data
- Other Script/Agent Nodes: Further processing and analysis
- Other Nodes: Save final results
Performance Considerations
AI Processing Time
- Initial script generation: 40-90 seconds depending on complexity
- Subsequent executions (with reuse): No additional AI processing time
- Complex data analysis may require additional processing time
Best Practices
Prompt Writing
- Start Simple: Begin with basic requirements and refine as needed
- Test Iteratively: Use "Always Regenerate" during development until satisfied
- Be Explicit: Specify exact output formats and needed data structures
- Consider Scale: Mention if you expect large datasets
Workflow Design
- Data Flow: Ensure upstream nodes provide the expected data format
- Error Handling: Consider how the script should behave if input data is malformed
- Performance: Use script reuse for production workflows
- Testing: Validate generated scripts with sample data
Troubleshooting
| Issue | Resolution |
|---|---|
| Script generation fails | Check prompt clarity and input data availability |
| Generated script errors | Review input data format and prompt requirements |
| Timeout during AI processing | Simplify prompt and input data |
| Unexpected output format | Be more specific about desired output structure |
| Overloaded errors | AI providers occasionally experience downtime, which in turn affects the agents; retry after a short delay |
| Performance issues | Consider using script reuse or optimizing data processing |
| Thread spawning errors with Excel files | Add thread-limiting environment variables before imports (see below) |
Common Error Patterns
"No input files specified":
- Ensure upstream nodes are properly connected
- Check that upstream nodes have completed successfully
"Failed to parse input data":
- Verify input data format matches expectations
- Check for malformed JSON or unexpected data structures
"AI service not available":
- Sometimes AI providers suffer downtime
- Try again after a short delay
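Generated scripts usually guard against malformed input; if you edit a script by hand, the same pattern is worth keeping (a minimal sketch, with the helper name chosen for this guide):

```python
import json
import sys

def load_json(path):
    """Fail with a clear message instead of a bare traceback on bad input."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        sys.exit(f"No input file at {path} - check upstream node connections")
    except json.JSONDecodeError as exc:
        sys.exit(f"Failed to parse input data in {path}: {exc}")
```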
Excel File Processing - Thread Spawning Fix
When processing Excel (.xlsx) files, the node may fail due to thread spawning restrictions. Excel libraries attempt to use multiple threads for decompression, which isn't allowed in the execution environment.
Solution: Prefix your prompt with thread-limiting instructions when working with Excel files.
Example prompt with Excel handling:
Before importing any libraries except "os", include this code:
os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'
os.environ['NUMEXPR_NUM_THREADS'] = '1'
os.environ['VECLIB_MAXIMUM_THREADS'] = '1'
Then proceed with the rest of the script.
Read the Excel file, extract rows where 'Status' is 'Active',
and output as a new Excel file with columns: ID, Name, Status, Date.
Note: This workaround is only needed for Excel files (.xlsx, .xls). CSV, JSON, and text files work without these environment variables.
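With a prompt like the one above, the resulting script should begin with a preamble along these lines; the essential point is that the environment variables are set before pandas or any Excel library is imported (file names below are illustrative):

```python
import os

# Must run before importing pandas/openpyxl, or the thread limits won't take effect
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["VECLIB_MAXIMUM_THREADS"] = "1"

import pandas as pd  # deliberately imported after the env vars

df = pd.read_excel("input.xlsx")
active = df[df["Status"] == "Active"][["ID", "Name", "Status", "Date"]]
active.to_excel("output.xlsx", index=False)
```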
Example Configurations
Example 1: Domain Analysis from Subfinder
Prompt:
Extract all unique domains from the subfinder results, filter out any containing 'test', 'dev',
or 'staging', and create a ranked list by subdomain depth. Output as JSON with domain and depth.
Generated Output:
[
{"domain": "api.example.com", "depth": 2},
{"domain": "admin.api.example.com", "depth": 3},
{"domain": "secure.admin.api.example.com", "depth": 4}
]
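In this sample output, "depth" appears to be the number of dots in the domain (so `example.com` would have depth 1); a generated script could compute and rank it roughly like this (the depth definition is inferred from the sample, not a documented rule):

```python
def ranked_by_depth(domains):
    # Depth inferred as the dot count, e.g. api.example.com -> 2
    records = [{"domain": d, "depth": d.count(".")} for d in domains]
    return sorted(records, key=lambda r: r["depth"])

print(ranked_by_depth(["secure.admin.api.example.com", "api.example.com"]))
```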
Example 2: Vulnerability Report Generation
Prompt:
Parse the vulnerability scan results and create a security report. Group findings by severity,
include CVSS scores where available, and generate an executive summary with total counts
and risk assessment. Output as structured JSON.
Generated Features:
- Automatic CVSS score parsing
- Risk level categorization
- Statistical analysis
- Executive summary generation
Example 3: Multi-Tool Data Correlation
Prompt:
Combine subdomain enumeration with port scan results. For each discovered subdomain,
add the associated open ports and services. Include only subdomains with at least one open port.
Format as CSV with columns: subdomain, ports, services, risk_level.
AI Capabilities Demonstrated:
- Cross-referencing data from multiple tools
- Intelligent data correlation
- Risk assessment logic
- Flexible output formatting