Scripting Agent Node Guide
Overview
The Scripting Agent Node is an advanced AI-powered component that automatically generates Python scripts from your natural language prompts. Unlike regular Script Nodes, where you write code manually, Scripting Agent Nodes leverage Michelangelo's multi-agent AI system to analyze your input data, understand your requirements, and generate optimized Python scripts tailored to your specific workflow needs.
Use Cases
- Intelligent Data Processing: Automatically generate scripts that adapt to your input data structure
- Dynamic Analysis: Create custom analysis scripts based on upstream node outputs
- Smart Transformations: Generate data transformation logic without manual coding
- Adaptive Workflows: Scripts that adjust based on the actual data they receive
- Rapid Prototyping: Quickly generate complex data processing logic from simple descriptions
- No-Code Automation: Enable non-programmers to create sophisticated data processing workflows
How Scripting Agent Nodes Work
The Scripting Agent Node uses a sophisticated AI workflow that operates in multiple stages:
- Input Analysis: The AI analyzes data from upstream nodes, detecting formats, structures, and patterns
- JSON Pattern Extraction: Automatically identifies JSON structures and recommends parsing strategies
- Context Understanding: Builds comprehensive understanding of your data and requirements
- Script Generation: Creates optimized Python scripts using advanced prompt engineering
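The multi-agent pipeline itself is internal to the platform, but the first stage is conceptually similar to this minimal format-detection sketch (`detect_format` is an illustrative name for this guide, not a platform API):

```python
import csv
import json

def detect_format(raw: str) -> str:
    """Illustrative format detection in the spirit of the input-analysis stage."""
    stripped = raw.strip()
    # JSON: the whole payload parses as a single JSON document
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    # CSV: a consistent delimiter detected across the first few lines
    try:
        csv.Sniffer().sniff("\n".join(stripped.splitlines()[:5]))
        return "csv"
    except csv.Error:
        pass
    return "text"

print(detect_format('{"hosts": ["a.example.com"]}'))  # -> json
```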
Creating a Scripting Agent Node
Basic Setup
- Drag a Scripting Agent Node from the node palette onto your workflow canvas
- Connect it to one or more input nodes (the AI will analyze their outputs)
- Write a natural language prompt describing what you want the script to do
[SCREENSHOT: Scripting Agent Node being added to a workflow]
Configuration Options
Node Properties
| Property | Description |
|---|---|
| Name | A descriptive name for the node |
| Prompt | Natural language description of what you want the script to do |
| Timeout | Maximum execution time in seconds, covering both AI processing and script execution (default: 600) |
| Always Regenerate Script | If true, regenerates script on every execution; if false, reuses existing script |
Advanced Properties
| Property | Description |
|---|---|
| Generated Script | The Python script created by the AI (auto-populated, can be edited) |
| AI Summary | Explanation of what the generated script achieves (read-only, auto-populated) |
Writing Effective Prompts
The quality of your prompt directly impacts the generated script. Follow these guidelines:
Prompt Best Practices
- Be Specific: Clearly describe what you want the script to accomplish
- Mention Data Types: Reference the expected input format if known; for example, if the input is JSON, name the keys of interest
- Include Processing Logic: Explain any filtering, transformation, or analysis requirements
- Consider Edge Cases: Mention how to handle missing data or errors
Example Prompts
Basic Data Processing
Extract all unique domains from the input data, remove any domains containing 'test' or 'dev',
and output them as a simple list, one domain per line.
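For a prompt like this, the generated script typically looks something along these lines (an illustrative sketch using the `-i`/`-o` convention described under "How Script Generation Works", not the agent's literal output):

```python
#!/usr/bin/env python3
import argparse

def main():
    parser = argparse.ArgumentParser(description="Extract and filter unique domains")
    parser.add_argument("-i", help="Comma-separated input file paths")
    parser.add_argument("-o", required=True, help="Output file path")
    args = parser.parse_args()

    domains = set()
    for path in (args.i.split(",") if args.i else []):
        with open(path) as f:
            for line in f:
                domain = line.strip().lower()
                # Skip blank lines and any domain containing 'test' or 'dev'
                if domain and "test" not in domain and "dev" not in domain:
                    domains.add(domain)

    # One domain per line, sorted for stable output
    with open(args.o, "w") as f:
        f.write("\n".join(sorted(domains)))

if __name__ == "__main__":
    main()
```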
Advanced Analysis
Analyze the vulnerability scan results to identify critical and high severity findings.
Group them by severity level, count occurrences of each vulnerability type,
and generate a summary report in JSON format with statistics and top 10 most common issues.
Data Transformation
Parse the JSON output from the security tool, extract IP addresses and their associated
ports, and create a JSON file with the keys: ip_address, port, service, status.
Include only entries where status is 'open'.
Multi-Input Processing
Combine the subdomain enumeration results with the port scan data. For each subdomain,
add the corresponding open ports if available, and output as JSON with the structure:
{"domain": "example.com", "ports": [80, 443, 8080]}.
How Script Generation Works
When a Scripting Agent Node executes:
- Data Collection: The AI analyzes output from all connected upstream nodes
- Pattern Recognition: Automatically detects data formats (JSON, CSV, plain text, etc.)
- Context Building: Creates a comprehensive understanding of data structure and relationships
- Script Creation: Generates Python code that includes:
  - Proper argument parsing adhering to NINA logic (`-i` for input files, `-o` for output file)
  - Error handling and validation
  - Optimized data processing logic
  - Appropriate library usage
Generated Script Structure
All generated scripts follow this pattern:
```python
#!/usr/bin/env python3
import argparse
import json
# Other imports as needed

def main():
    parser = argparse.ArgumentParser(description='Generated script description')
    parser.add_argument('-i', help='Comma-separated input file paths')
    parser.add_argument('-o', required=True, help='Output file path')
    args = parser.parse_args()

    # Handle input files
    if args.i:
        input_files = {path.split('/')[-1]: path for path in args.i.split(',')}
    else:
        input_files = {}

    # Processing logic generated by AI
    results = process_data(input_files)

    # Write results to output file
    with open(args.o, 'w') as f:
        # Output logic generated by AI
        write_results(f, results)

if __name__ == "__main__":
    main()
```
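Because the generated script is a plain Python program, you can extract it for local testing and invoke it directly, e.g. `python script.py -i input1.json,input2.txt -o output.json` (file names illustrative); during workflow execution the node supplies the `-i` and `-o` arguments itself.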
Script Regeneration Options
Always Regenerate Script (Default: False)
- True: Script is regenerated on every workflow execution using fresh analysis of input data
- False: Script is generated once and reused unless manually cleared
When to Use Each Option
Always Regenerate = True:
- Input data structure changes frequently
- You want the script to adapt to different data formats
- You're still refining your prompt and want to see changes
- Data processing requirements vary between executions
Always Regenerate = False:
- Input data has consistent structure
- You want faster execution (no AI processing time)
- The generated script works well and you want to preserve it
- You want predictable behavior
Available Libraries
Python Libraries Available
- Data processing: `pandas`, `numpy`, `json`, `csv`, `python-pptx`
- Network tools: `requests`, `httpx`, `urllib3`, `dnspython`, `netaddr`
- Security tools: `cryptography`, `paramiko`, `scapy`
- Text processing: `bs4` (BeautifulSoup), `lxml`
- File operations: standard library modules
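As a quick orientation, a generated script might combine these libraries like so (a hedged sketch; whether a given script actually uses `pandas`, `bs4`, or anything else depends entirely on your prompt and input data):

```python
import json

import pandas as pd
from bs4 import BeautifulSoup

# pandas for tabular work on scan results
df = pd.DataFrame({"host": ["a.example.com", "b.example.com"], "port": [443, 80]})
https_hosts = df[df["port"] == 443]["host"].tolist()

# bs4 for HTML fragments embedded in tool output
soup = BeautifulSoup("<html><title>Scan report</title></html>", "html.parser")

print(json.dumps({"https_hosts": https_hosts, "title": soup.title.text}))
```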
Integration with Other Nodes
Upstream Node Compatibility
Scripting Agent Nodes can process output from:
- Input Nodes: Files uploaded by users
- Operation Nodes: Security tool outputs (subfinder, nmap, etc.)
- Integration Nodes: Data from external services
- Other Script/Agent Nodes: Processed data from previous steps
Downstream Node Usage
Generated outputs can be consumed by:
- Integration Nodes: Send processed data to external services
- Report Nodes: Generate reports from processed data
- Other Script/Agent Nodes: Further processing and analysis
- Other Nodes: Save final results
Performance Considerations
AI Processing Time
- Initial script generation: 40-90 seconds depending on complexity
- Subsequent executions (with reuse): No additional AI processing time
- Complex data analysis may require additional processing time
Best Practices
Prompt Writing
- Start Simple: Begin with basic requirements and refine as needed
- Test Iteratively: Use "Always Regenerate" during development until satisfied
- Be Explicit: Specify exact output formats and needed data structures
- Consider Scale: Mention if you expect large datasets
Workflow Design
- Data Flow: Ensure upstream nodes provide the expected data format
- Error Handling: Consider how the script should behave if input data is malformed
- Performance: Use script reuse for production workflows
- Testing: Validate generated scripts with sample data
Troubleshooting
| Issue | Resolution |
|---|---|
| Script generation fails | Check prompt clarity and input data availability |
| Generated script errors | Review input data format and prompt requirements |
| Timeout during AI processing | Simplify prompt and input data |
| Unexpected output format | Be more specific about desired output structure |
| Overloaded errors | AI providers occasionally experience downtime, which in turn affects the agents; retry after a short delay |
| Performance issues | Consider using script reuse or optimizing data processing |
| Thread spawning errors with Excel files | Add thread-limiting environment variables before imports (see below) |
Common Error Patterns
"No input files specified":
- Ensure upstream nodes are properly connected
- Check that upstream nodes have completed successfully
"Failed to parse input data":
- Verify input data format matches expectations
- Check for malformed JSON or unexpected data structures
"AI service not available":
- Sometimes AI providers suffer downtime
- Try again after a short delay
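Generated scripts usually guard against malformed input; if you edit a script by hand, the same pattern is worth keeping (a minimal sketch, with the helper name chosen for this guide):

```python
import json
import sys

def load_json(path):
    """Fail with a clear message instead of a bare traceback on bad input."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        sys.exit(f"No input file at {path} - check upstream node connections")
    except json.JSONDecodeError as exc:
        sys.exit(f"Failed to parse input data in {path}: {exc}")
```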
Excel File Processing - Thread Spawning Fix
When processing Excel (.xlsx) files, the node may fail due to thread spawning restrictions. Excel libraries attempt to use multiple threads for decompression, which isn't allowed in the execution environment.
Solution: Prefix your prompt with thread-limiting instructions when working with Excel files.
Example prompt with Excel handling:
Before importing any libraries except "os", include this code:
os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'
os.environ['NUMEXPR_NUM_THREADS'] = '1'
os.environ['VECLIB_MAXIMUM_THREADS'] = '1'
Then proceed with the rest of the script.
Read the Excel file, extract rows where 'Status' is 'Active',
and output as a new Excel file with columns: ID, Name, Status, Date.
Note: This workaround is only needed for Excel files (.xlsx, .xls). CSV, JSON, and text files work without these environment variables.
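With a prompt like the one above, the resulting script should begin with a preamble along these lines; the essential point is that the environment variables are set before pandas or any Excel library is imported (file names below are illustrative):

```python
import os

# Must run before importing pandas/openpyxl, or the thread limits won't take effect
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["VECLIB_MAXIMUM_THREADS"] = "1"

import pandas as pd  # deliberately imported after the env vars

df = pd.read_excel("input.xlsx")
active = df[df["Status"] == "Active"][["ID", "Name", "Status", "Date"]]
active.to_excel("output.xlsx", index=False)
```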
Example Configurations
Example 1: Domain Analysis from Subfinder
Prompt:
Extract all unique domains from the subfinder results, filter out any containing 'test', 'dev',
or 'staging', and create a ranked list by subdomain depth. Output as JSON with domain and depth.
Generated Output:
[
{"domain": "api.example.com", "depth": 2},
{"domain": "admin.api.example.com", "depth": 3},
{"domain": "secure.admin.api.example.com", "depth": 4}
]
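In this sample output, "depth" appears to be the number of dots in the domain (so `example.com` would have depth 1); a generated script could compute and rank it roughly like this (the depth definition is inferred from the sample, not a documented rule):

```python
def ranked_by_depth(domains):
    # Depth inferred as the dot count, e.g. api.example.com -> 2
    records = [{"domain": d, "depth": d.count(".")} for d in domains]
    return sorted(records, key=lambda r: r["depth"])

print(ranked_by_depth(["secure.admin.api.example.com", "api.example.com"]))
```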
Example 2: Vulnerability Report Generation
Prompt:
Parse the vulnerability scan results and create a security report. Group findings by severity,
include CVSS scores where available, and generate an executive summary with total counts
and risk assessment. Output as structured JSON.
Generated Features:
- Automatic CVSS score parsing
- Risk level categorization
- Statistical analysis
- Executive summary generation
Example 3: Multi-Tool Data Correlation
Prompt:
Combine subdomain enumeration with port scan results. For each discovered subdomain,
add the associated open ports and services. Include only subdomains with at least one open port.
Format as CSV with columns: subdomain, ports, services, risk_level.
AI Capabilities Demonstrated:
- Cross-referencing data from multiple tools
- Intelligent data correlation
- Risk assessment logic
- Flexible output formatting