AWS S3 Integration Guide

Overview

The AWS S3 integration allows your NINA workflows to connect with Amazon Simple Storage Service (S3) for comprehensive cloud storage operations. This integration enables you to manage buckets, upload and download files, perform bulk operations with advanced filtering, copy files between buckets, and organize your cloud storage directly from your workflows.

Status

The integration currently supports comprehensive S3 storage operations:

  • Bucket Management: List and search bucket contents with prefix filtering
  • File Upload: Upload text and binary files to S3 buckets
  • File Download: Download individual files or perform bulk downloads with advanced filtering
  • File Management: Copy files between buckets and delete files
  • Advanced Filtering: Support for wildcard patterns, date ranges, size filters, and S3-specific options
  • Flexible Authentication: Support for AWS credentials, custom endpoints, and path-style access

Some advanced features include:

  • Bulk Download Operations: Download multiple files with sophisticated filtering criteria or specific file keys
  • Wildcard Pattern Matching: Use glob patterns for file selection (e.g., *.jpg, test*.txt)
  • Date Range Filtering: Filter files by modification date or relative time periods
  • Size-based Filtering: Filter files by minimum and maximum size thresholds
  • Custom Endpoints: Support for S3-compatible services and custom endpoints
  • Binary Data Handling: Upload and download both text and binary content
  • Direct File Selection: Specify exact file keys for bulk download operations

Credential Configuration

Before using the AWS S3 integration in your workflows, you need to configure AWS credentials for authentication. The integration supports standard AWS authentication methods with regional configuration.

Authentication Method

The AWS S3 integration uses AWS credentials with regional configuration:

  • Access Key ID: AWS access key identifier (e.g., AKIAIOSFODNN7EXAMPLE)
  • Secret Access Key: AWS secret access key (e.g., wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY)
  • Region: AWS region for S3 operations (e.g., us-east-1)
  • Session Token: AWS session token for temporary credentials (e.g., AQoEXAMPLEH4aoAH0gNCAPyJxz4BlCFFxWNE1OPTgk...)
  • Custom Endpoint: Custom S3 endpoint for S3-compatible services (e.g., https://s3.custom-provider.com)
  • Force Path Style: Use path-style addressing instead of virtual-hosted style (e.g., false)

How to obtain AWS Credentials:

  1. Log into the AWS Console
  2. Navigate to IAM (Identity and Access Management)
  3. Create a new user or use an existing user
  4. Attach the appropriate S3 permissions policy (e.g., AmazonS3FullAccess or custom policy)
  5. Generate access keys for the user
  6. Note the Access Key ID and Secret Access Key
  7. Determine the appropriate AWS region for your S3 buckets

Required S3 Permissions:

  • s3:ListBucket - For listing bucket contents
  • s3:GetObject - For downloading files
  • s3:PutObject - For uploading files
  • s3:DeleteObject - For deleting files
  • s3:GetObject on the source and s3:PutObject on the destination - For copying files between buckets (there is no separate s3:CopyObject permission)
  • s3:ListAllMyBuckets - For listing all buckets
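
The exact policy depends on your buckets and security requirements. As a minimal sketch only (bucket name my-example-bucket is a placeholder), an IAM policy granting these actions on a single bucket could look like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListAllBuckets",
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    },
    {
      "Sid": "BucketAccess",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-example-bucket"
    },
    {
      "Sid": "ObjectAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-example-bucket/*"
    }
  ]
}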

Creating an AWS S3 Credential

  1. Navigate to the Credentials section in NINA

  2. Click Add New Credential

  3. Fill in the credential details:

    • Name: A descriptive name (e.g., "AWS S3 Production")
    • Description: Optional details about the credential's purpose
    • Integration Service: Select "AWS S3"
    • Access Key ID: Enter your AWS access key ID
    • Secret Access Key: Enter your AWS secret access key
    • Region: Enter your AWS region (e.g., "us-east-1")
    • Session Token: (Optional) For temporary credentials
    • Custom Endpoint: (Optional) For S3-compatible services
    • Force Path Style: (Optional) Enable path-style addressing
  4. Click Test Connection to verify credentials

  5. Click Save to store the credential

Supported Resources and Operations

The AWS S3 integration supports the following resources and operations:

Bucket

  • Get Many: List all S3 buckets in your account
  • Search: Search and list objects within a specific bucket with filtering

File

  • Upload: Upload a file (text or binary) to an S3 bucket
  • Bulk Upload: Upload multiple files to S3 with base64-encoded or plain text content
  • Download: Download a specific file from an S3 bucket
  • Bulk Download: Download multiple files with advanced filtering capabilities or specific file keys
  • Copy: Copy a file from one S3 location to another
  • Delete: Delete a specific file from an S3 bucket
  • Get Many: List files in a specific bucket with optional filtering

Parameter Merging

The AWS S3 integration takes full advantage of NINA's parameter merging capabilities:

Parameter Sources (in order of precedence)

  1. Node Parameters: Parameters configured directly in the AWS S3 Integration Node
  2. Extracted Parameters: Parameters automatically extracted from the input data
  3. Input Data: The complete input data from upstream nodes

When an AWS S3 Integration Node executes:

  • It combines parameters from all sources
  • Node parameters take precedence over extracted parameters
  • The combined parameters are used to execute the S3 operation
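
As an illustration only (NINA performs the actual merge internally; the variable names below are hypothetical), the precedence rule behaves like a shallow merge in which node parameters win over extracted parameters:

# Simplified sketch of the precedence described above.
extracted_parameters = {"bucketName": "bucket-from-input", "fileKey": "reports/q1.pdf"}
node_parameters = {"bucketName": "my-archive-bucket"}  # configured directly on the node

# Node parameters take precedence over extracted parameters.
effective_parameters = {**extracted_parameters, **node_parameters}

print(effective_parameters)
# {'bucketName': 'my-archive-bucket', 'fileKey': 'reports/q1.pdf'}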

Examples

Listing Buckets

Below is an example of listing all S3 buckets in your account:

Node Configuration:

{
  "resource": "bucket",
  "operation": "getMany",
  "parameters": {
    "returnAll": true
  }
}

This will return all S3 buckets accessible with your credentials.

Searching Bucket Contents

Example of searching for objects within a specific bucket:

Node Configuration:

{
  "resource": "bucket",
  "operation": "search",
  "parameters": {
    "bucketName": "my-production-bucket",
    "limit": 100,
    "options": {
      "prefix": "logs/2024/",
      "delimiter": "/"
    }
  }
}

This will search for objects in the "my-production-bucket" bucket with the prefix "logs/2024/".

Uploading Files

Example of uploading a text file to S3:

Node Configuration:

{
  "resource": "file",
  "operation": "upload",
  "parameters": {
    "bucketName": "my-data-bucket",
    "fileKey": "reports/quarterly-report.txt",
    "binaryData": false,
    "fileContent": "Q1 2024 Sales Report\n\nTotal Revenue: $1,234,567"
  }
}

This will upload a text file to the specified bucket and key path.

Uploading Binary Files

Example of uploading a binary file:

Node Configuration:

{
  "resource": "file",
  "operation": "upload",
  "parameters": {
    "bucketName": "my-images-bucket",
    "fileKey": "photos/image.jpg",
    "binaryData": true,
    "fileContent": "base64-encoded-image-data-here"
  }
}

This will upload a binary file (image) using base64-encoded content.
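
If you prepare the content in an upstream script step, the base64 string can be produced with standard tooling. A minimal Python sketch (the local file path is just an example):

import base64

# Read a local file and produce the base64 string expected in "fileContent"
# when "binaryData" is true.
with open("photo.jpg", "rb") as f:
    file_content = base64.b64encode(f.read()).decode("ascii")

parameters = {
    "bucketName": "my-images-bucket",
    "fileKey": "photos/image.jpg",
    "binaryData": True,
    "fileContent": file_content,
}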

Bulk Upload Files

Example of uploading multiple files in a single operation:

Node Configuration:

{
  "resource": "file",
  "operation": "bulkUpload",
  "parameters": {
    "bucketName": "my-documents-bucket",
    "files": [
      {
        "fileKey": "reports/q1-sales.pdf",
        "data": "base64-encoded-pdf-content",
        "binaryData": true
      },
      {
        "fileKey": "configs/settings.json",
        "data": "{\"environment\": \"production\", \"debug\": false}",
        "binaryData": false
      },
      {
        "fileKey": "images/logo.png",
        "data": "base64-encoded-image-content",
        "binaryData": true
      }
    ]
  }
}

This will upload three files in a single operation: a PDF report, a JSON configuration file, and a PNG image.

Bulk Upload with Additional Options

Example of bulk upload with storage class and encryption settings:

Node Configuration:

{
  "resource": "file",
  "operation": "bulkUpload",
  "parameters": {
    "bucketName": "my-archive-bucket",
    "files": [
      {
        "fileKey": "backup/database-export.sql",
        "data": "base64-encoded-sql-content",
        "binaryData": true
      },
      {
        "fileKey": "backup/application-logs.txt",
        "data": "Application started at 2024-01-15...",
        "binaryData": false
      }
    ],
    "additionalFields": {
      "acl": "bucket-owner-full-control",
      "storageClass": "GLACIER",
      "serverSideEncryption": "AES256",
      "tagging": "Environment=Production&Department=IT",
      "contentType": "application/octet-stream"
    }
  }
}

This will upload multiple backup files with Glacier storage class and AES256 encryption applied to all files.
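
The tagging value is a URL-encoded query string of key=value pairs. If you build it dynamically, standard URL encoding produces the expected format; a small Python sketch:

from urllib.parse import urlencode

# Build the S3 object tagging string, e.g. "Environment=Production&Department=IT".
tags = {"Environment": "Production", "Department": "IT"}
tagging = urlencode(tags)
print(tagging)  # Environment=Production&Department=IT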

Bulk Upload with Per-File Overrides

Files can override the global additionalFields settings:

Node Configuration:

{
  "resource": "file",
  "operation": "bulkUpload",
  "parameters": {
    "bucketName": "my-mixed-bucket",
    "files": [
      {
        "fileKey": "public/banner.jpg",
        "data": "base64-encoded-image-content",
        "binaryData": true,
        "acl": "public-read",
        "contentType": "image/jpeg"
      },
      {
        "fileKey": "private/sensitive-data.csv",
        "data": "Name,Email,Phone\nJohn Doe,[email protected],555-1234",
        "binaryData": false,
        "acl": "private",
        "contentType": "text/csv"
      }
    ],
    "additionalFields": {
      "storageClass": "STANDARD",
      "serverSideEncryption": "AES256"
    }
  }
}

This example shows how individual files can override ACL and content type while inheriting the global storage class and encryption settings.

Downloading Files

Example of downloading a specific file:

Node Configuration:

{
  "resource": "file",
  "operation": "download",
  "parameters": {
    "bucketName": "my-documents-bucket",
    "fileKey": "contracts/agreement-2024.pdf"
  }
}

This will download the specified PDF file from the bucket.

Bulk Download with Filtering

Example of downloading multiple files with advanced filtering:

Node Configuration:

{
  "resource": "file",
  "operation": "bulkDownload",
  "parameters": {
    "bucketName": "my-logs-bucket",
    "limit": 50,
    "filters": {
      "namePattern": "*.log",
      "lastDays": 7,
      "minSize": 1024,
      "maxSize": 10485760
    }
  }
}

This will download up to 50 log files from the last 7 days, between 1KB and 10MB in size.

Bulk Download with Specific File Keys

Example of downloading specific files by providing exact file keys:

Node Configuration:

{
  "resource": "file",
  "operation": "bulkDownload",
  "parameters": {
    "bucketName": "my-documents-bucket",
    "fileKeys": [
      "reports/2024/q1-sales.pdf",
      "reports/2024/q1-marketing.pdf",
      "invoices/2024/january/invoice-001.pdf",
      "contracts/client-agreement-2024.pdf"
    ]
  }
}

This will download the four specific files listed in the fileKeys array, skipping any listing or filtering operations for better performance.
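
If the key list comes from an upstream node (for example, a database query), a script step can map records into the fileKeys parameter. A hedged sketch, assuming each record carries a hypothetical file_key field:

# Hypothetical upstream records; only the resulting "fileKeys" shape matters here.
records = [
    {"id": 1, "file_key": "reports/2024/q1-sales.pdf"},
    {"id": 2, "file_key": "invoices/2024/january/invoice-001.pdf"},
]

parameters = {
    "bucketName": "my-documents-bucket",
    "fileKeys": [record["file_key"] for record in records],
}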

Copying Files

Example of copying a file between buckets:

Node Configuration:

{
  "resource": "file",
  "operation": "copy",
  "parameters": {
    "sourceBucketName": "source-bucket",
    "sourceFileKey": "data/original-file.csv",
    "destinationBucketName": "backup-bucket",
    "destinationFileKey": "backups/2024/original-file.csv"
  }
}

This will copy the file from the source bucket to the destination bucket with a new path.

Deleting Files

Example of deleting a specific file:

Node Configuration:

{
  "resource": "file",
  "operation": "delete",
  "parameters": {
    "bucketName": "temporary-bucket",
    "fileKey": "temp/processing-file.tmp"
  }
}

This will delete the specified temporary file from the bucket.

Listing Files

Example of listing files in a bucket with filtering:

Node Configuration:

{
  "resource": "file",
  "operation": "getMany",
  "parameters": {
    "bucketName": "my-archive-bucket",
    "limit": 100,
    "options": {
      "prefix": "archive/2024/",
      "startAfter": "archive/2024/january/"
    }
  }
}

This will list up to 100 files in the archive folder for 2024, starting after the January folder.

Advanced Filtering Options

Bulk Download Filtering

The bulkDownload operation supports two main approaches: filtering-based download and direct file key specification.

Method 1: Filtering-Based Download

When using filters, the integration lists bucket contents and applies the specified criteria:

Name-based Filters

{
  "filters": {
    "namePattern": "*.jpg",    // Wildcard patterns
    "namePrefix": "report-",   // Files starting with prefix
    "nameSuffix": ".log"       // Files ending with suffix
  }
}

Date-based Filters

{
  "filters": {
    "minDate": "2024-01-01T00:00:00Z",  // RFC3339 format
    "maxDate": "2024-12-31T23:59:59Z",  // RFC3339 format
    "lastDays": 30,                     // Files from last 30 days
    "latest": true                      // Files from last 24 hours
  }
}

Size-based Filters

{
  "filters": {
    "minSize": 1024,      // Minimum size in bytes (1KB)
    "maxSize": 10485760   // Maximum size in bytes (10MB)
  }
}

Method 2: Direct File Key Specification

When you know the exact file keys you want to download, use the fileKeys parameter for better performance:

{
  "fileKeys": [
    "documents/report-2024-q1.pdf",
    "images/logo.png",
    "data/export-20240115.csv",
    "configs/app-settings.json"
  ]
}

Benefits of using fileKeys:

  • Performance: Skips listing and filtering operations
  • Precision: Downloads only the exact files you specify
  • Efficiency: Reduces API calls and processing time
  • Reliability: No dependency on S3 listing permissions

S3-specific Options

{
  "options": {
    "prefix": "folder/subfolder/",   // Server-side prefix filtering
    "delimiter": "/",                // Key grouping delimiter
    "startAfter": "folder/file.txt"  // Pagination cursor
  }
}

Note: When fileKeys is provided, filtering options and S3-specific options are ignored since the integration downloads the specified files directly.

Wildcard Pattern Support

The integration supports glob-style wildcard patterns for file selection when using filtering:

Pattern Syntax

  • * : Matches any sequence of characters (*.txt matches file.txt, document.txt)
  • ? : Matches a single character (test?.log matches test1.log, testa.log)
  • [abc] : Matches a character in the set (file[123].txt matches file1.txt, file2.txt)
  • [a-z] : Matches a character in the range (report[a-c].pdf matches reporta.pdf, reportb.pdf)

Pattern Examples

  • *.pdf - All PDF files
  • photo*.jpg - JPG files starting with "photo"
  • report-202?-*.csv - CSV reports from any year in the 2020s
  • data/[0-9]*.json - JSON files starting with a digit in the data folder
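
These are standard glob semantics. If you want to preview which keys a pattern would select, Python's fnmatch module follows the same conventions; this is a rough approximation for testing patterns, not the integration's actual implementation:

from fnmatch import fnmatch

keys = ["photo1.jpg", "photo-archive.jpg", "report-2024-01.csv", "data/42.json"]

# Keys matching JPG files that start with "photo"
selected = [k for k in keys if fnmatch(k, "photo*.jpg")]
print(selected)  # ['photo1.jpg', 'photo-archive.jpg']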

Response Structure

List Buckets Response

The bucket getMany operation returns a list of all accessible buckets:

{
  "success": true,
  "buckets": [
    {
      "Name": "my-production-bucket",
      "CreationDate": "2023-01-15T10:30:00Z"
    },
    {
      "Name": "my-backup-bucket",
      "CreationDate": "2023-02-20T14:45:00Z"
    }
  ]
}

Search Bucket Response

The bucket search operation returns objects matching the search criteria:

{
  "success": true,
  "objects": [
    {
      "Key": "logs/2024/app.log",
      "Size": 2048,
      "LastModified": "2024-01-15T10:30:00Z",
      "ETag": "\"abc123def456\"",
      "StorageClass": "STANDARD"
    }
  ],
  "truncated": false,
  "nextContinuationToken": null
}

File Upload Response

File upload operations return confirmation details:

{
  "success": true,
  "bucketName": "my-data-bucket",
  "fileKey": "reports/quarterly-report.txt",
  "ETag": "\"def789ghi012\"",
  "location": "https://my-data-bucket.s3.amazonaws.com/reports/quarterly-report.txt"
}

Bulk Upload Response

Bulk upload operations return details for all uploaded files:

{
  "success": true,
  "bucketName": "my-documents-bucket",
  "uploads": [
    {
      "fileKey": "reports/q1-sales.pdf",
      "ETag": "\"abc123def456\"",
      "location": "https://my-documents-bucket.s3.amazonaws.com/reports/q1-sales.pdf",
      "size": 245760
    },
    {
      "fileKey": "configs/settings.json",
      "ETag": "\"ghi789jkl012\"",
      "location": "https://my-documents-bucket.s3.amazonaws.com/configs/settings.json",
      "size": 156
    },
    {
      "fileKey": "images/logo.png",
      "ETag": "\"mno345pqr678\"",
      "location": "https://my-documents-bucket.s3.amazonaws.com/images/logo.png",
      "size": 15420
    }
  ],
  "totalFiles": 3,
  "totalSize": 261336,
  "failedUploads": []
}

File Download Response

File download operations return the file content and metadata:

{
  "success": true,
  "bucketName": "my-documents-bucket",
  "fileKey": "contracts/agreement-2024.pdf",
  "data": "base64-encoded-file-content",
  "contentType": "application/pdf",
  "contentLength": 245760,
  "lastModified": "2024-01-15T10:30:00Z",
  "ETag": "\"jkl345mno678\""
}

Bulk Download Response

Bulk download operations return multiple files with summary information:

{
  "success": true,
  "bucketName": "my-logs-bucket",
  "downloads": [
    {
      "key": "app.log",
      "data": "base64-encoded-log-content",
      "size": 4096,
      "contentType": "text/plain",
      "lastModified": "2024-01-15T10:30:00Z"
    }
  ],
  "totalFiles": 25,
  "filteredCount": 1,
  "totalSize": 4096,
  "method": "filtering"  // or "fileKeys" when using direct file specification
}
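
File content in download responses is base64 encoded, so a downstream script step typically decodes it before writing to disk. A minimal sketch based only on the response fields shown above (the sample data value is a placeholder):

import base64
import os

# "response" is assumed to be the bulk download output shown above.
response = {
    "downloads": [
        {"key": "app.log", "data": "aGVsbG8gd29ybGQ=", "contentType": "text/plain"},
    ]
}

for item in response["downloads"]:
    path = os.path.join("downloads", item["key"])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(base64.b64decode(item["data"]))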

Integration in Workflow Context

The AWS S3 integration is particularly effective for data storage, backup, and file processing workflows:

Common Workflow Patterns:

  1. Data Backup and Archival:

    • Database Export Node → Script Node (format data) → S3 Node (upload) → Email Node (confirmation)
  2. Batch File Processing:

    • File Processing Node → Script Node (prepare batch) → S3 Node (bulk upload) → Notification Node (completion alert)
  3. Log Processing and Analysis:

    • Schedule Node → S3 Node (bulk download logs) → Script Node (parse logs) → Database Node (store metrics)
  4. File Synchronization:

    • FTP Node (download) → S3 Node (upload) → S3 Node (copy to backup bucket) → Cleanup Node
  5. Document Processing Pipeline:

    • Email Node (receive attachments) → S3 Node (upload) → Processing Service Node → S3 Node (upload results)
  6. Multi-Source Data Aggregation:

    • Multiple API Nodes → Script Node (format as files) → S3 Node (bulk upload) → Data Pipeline Node
  7. Content Distribution:

    • CMS Node (export content) → S3 Node (upload) → CDN Integration Node (invalidate cache)
  8. Data Lake Operations:

    • Multiple Data Sources → S3 Node (upload raw data) → ETL Process → S3 Node (upload processed data)
  9. Specific File Retrieval:

    • Database Node (get file list) → Script Node (format file keys) → S3 Node (bulk download with fileKeys) → Processing Node

Best Practices

  1. Use Appropriate Bucket Naming: Follow AWS S3 bucket naming conventions and consider organizational hierarchy.

  2. Implement Proper IAM Policies: Use least-privilege access and specific S3 permissions for your credentials.

  3. Optimize File Keys: Use logical folder structures and consistent naming patterns for better organization.

  4. Choose the Right Download Method:

    • Use fileKeys when you know exactly which files to download for better performance
    • Use filtering when you need to discover files based on criteria
  5. Handle Large Files Carefully: Files larger than 100MB are uploaded using multipart uploads (handled automatically); plan for the additional transfer time.

  6. Monitor Costs: Be aware of S3 storage classes and transfer costs when designing workflows.

  7. Use Bulk Operations Efficiently: Leverage bulk download filtering to minimize API calls and transfer costs.

  8. Implement Error Handling: Add comprehensive error handling for network issues, permission problems, and file conflicts.

  9. Cache Frequently Accessed Data: Consider caching small, frequently accessed files to reduce API calls.

  10. Use Lifecycle Policies: Implement S3 lifecycle policies for automatic data archival and deletion (see the example configuration after this list).

  11. Validate File Content: Always validate uploaded and downloaded file content, especially for binary data.

  12. Optimize Bulk Operations: Use bulk upload for multiple files to reduce API overhead and improve efficiency.

  13. Handle Mixed Content Types: When using bulk upload, properly set binaryData for each file based on its content type.

  14. Batch Size Considerations: For bulk uploads, balance batch size with memory usage and error recovery requirements.

  15. Per-File Configuration: Leverage per-file overrides in bulk upload to set appropriate ACLs and content types for different file types.
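
Lifecycle policies (best practice 10 above) are configured on the bucket in AWS, for example with aws s3api put-bucket-lifecycle-configuration, not through this integration. A minimal sketch, assuming a hypothetical archive/ prefix, that transitions objects to Glacier after 30 days and deletes them after 365 days:

{
  "Rules": [
    {
      "ID": "archive-then-expire",
      "Filter": { "Prefix": "archive/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}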

S3 Storage Classes

AWS S3 offers different storage classes for cost optimization:

  • STANDARD: General-purpose storage for frequently accessed data
  • STANDARD_IA: Infrequent access storage with lower storage cost
  • ONEZONE_IA: Single availability zone storage for infrequent access
  • GLACIER: Archive storage for long-term backup
  • DEEP_ARCHIVE: Lowest-cost storage for long-term archival

Security Considerations

  1. Encrypt Sensitive Data: Use S3 server-side encryption or client-side encryption for sensitive files.

  2. Secure Credentials: Store AWS credentials securely and rotate them regularly.

  3. Bucket Policies: Implement appropriate bucket policies to control access (a sample policy sketch follows this list).

  4. VPC Endpoints: Consider using VPC endpoints for internal AWS communication.

  5. Access Logging: Enable S3 access logging for audit and security monitoring.
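
For item 3 above, bucket policies are attached in AWS rather than through the integration. A minimal sketch that denies non-TLS access to a hypothetical bucket (my-example-bucket is a placeholder):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-example-bucket",
        "arn:aws:s3:::my-example-bucket/*"
      ],
      "Condition": {
        "Bool": { "aws:SecureTransport": "false" }
      }
    }
  ]
}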

Troubleshooting

Common Issues and Solutions

  • Access Denied: Verify IAM permissions include the required S3 actions for the bucket
  • Bucket Not Found: Check the bucket name spelling and ensure the bucket exists in the specified region
  • Invalid Key Name: Ensure file keys follow S3 naming conventions (no leading slashes, etc.)
  • File Too Large: For files over 5GB, ensure multipart upload is supported or split the file
  • Connection Timeout: Check network connectivity and consider increasing timeout values
  • Invalid Region: Verify the bucket region matches your credential configuration
  • Forbidden Operation: Check bucket policies and ensure your IAM user has the necessary permissions
  • Slow Upload/Download: Consider using S3 Transfer Acceleration for better performance
  • File Not Found (bulk download): When using fileKeys, ensure all specified keys exist in the bucket

Error Response Format

The integration returns standardized error responses:

{
  "success": false,
  "error": {
    "code": "NoSuchBucket",
    "message": "The specified bucket does not exist",
    "bucketName": "non-existent-bucket",
    "operation": "upload"
  }
}
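
A downstream step can branch on the success flag and inspect the error code. A minimal sketch, assuming the response has exactly the shape shown above:

# "result" is assumed to be the integration output shown above.
result = {
    "success": False,
    "error": {"code": "NoSuchBucket", "message": "The specified bucket does not exist"},
}

if not result.get("success"):
    code = result["error"]["code"]
    if code in ("NoSuchBucket", "NoSuchKey"):
        # Missing resources are often recoverable: create the bucket or skip the file.
        print(f"Resource problem: {result['error']['message']}")
    elif code == "AccessDenied":
        # Permission problems usually require a credential or IAM policy fix.
        raise PermissionError(result["error"]["message"])
    else:
        raise RuntimeError(f"S3 operation failed: {code}")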

Common S3 Error Codes

  • NoSuchBucket: The specified bucket does not exist
  • NoSuchKey: The specified key does not exist
  • AccessDenied: Access to the resource is denied
  • InvalidBucketName: The bucket name is not valid
  • BucketAlreadyExists: The bucket name is already taken
  • EntityTooLarge: The file size exceeds the maximum allowed

Support

If you encounter issues with the AWS S3 integration, please contact our support team with:

  • The operation you were attempting
  • Your AWS region and bucket information (without sensitive data)
  • Any error messages received
  • The file sizes and types you were working with
  • Whether you were using filtering or direct file keys
  • The workflow context where the issue occurred

This information will help us provide faster and more accurate assistance.