Testing
nat-zero is integration-tested against real AWS infrastructure on pull requests that carry the integration-test label. The test deploys the full module, exercises the complete NAT lifecycle, then tears everything down.
Running Tests
# Unit tests (Lambda logic)
cd cmd/lambda && go test -v -race ./...
# Integration tests (requires AWS credentials)
cd tests/integration && go test -v -timeout 30m
Integration tests require AWS credentials with permissions to manage EC2, IAM, Lambda, EventBridge, and CloudWatch resources.
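One way to avoid confusing failures on machines without credentials is to check for them up front and skip the integration test. The sketch below is an assumption, not part of the nat-zero test suite: `hasAWSCredentials` and `credentialsFromEnv` are hypothetical helpers, and the check only looks at environment variables, so it will miss credentials that the AWS SDK resolves from shared config files or instance roles.

```go
package main

import "os"

// hasAWSCredentials reports whether a common AWS credential source is
// visible through the given environment lookup. This is a heuristic only:
// the AWS SDK can also find credentials in places this function ignores.
func hasAWSCredentials(getenv func(string) string) bool {
	// Static access keys need both halves to be usable.
	if getenv("AWS_ACCESS_KEY_ID") != "" && getenv("AWS_SECRET_ACCESS_KEY") != "" {
		return true
	}
	// A named profile or an OIDC web identity token file also counts.
	return getenv("AWS_PROFILE") != "" || getenv("AWS_WEB_IDENTITY_TOKEN_FILE") != ""
}

// credentialsFromEnv applies the check to the real process environment.
func credentialsFromEnv() bool {
	return hasAWSCredentials(os.Getenv)
}
```

A test could call `credentialsFromEnv()` first and use `t.Skip` when it returns false, so local runs without credentials skip the test instead of failing partway through a deploy.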
Integration Test Lifecycle
The test uses Terratest with a single terraform apply / destroy cycle and four phases:
Phase 1: NAT Creation and Connectivity
- Deploy fixture (private subnet + nat-zero module in default VPC)
- Launch workload instance in private subnet
- Invoke Lambda → creates NAT instance
- Wait for NAT running with EIP attached
- Verify workload's egress IP matches NAT's EIP
Phase 2: Scale-Down
- Terminate workload
- Invoke Lambda → stops NAT
- Wait for NAT stopped
- Invoke Lambda → releases EIP
- Verify no EIPs remain
Phase 3: Restart
- Launch new workload
- Invoke Lambda → restarts stopped NAT
- Wait for NAT running with new EIP
- Verify connectivity
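The transitions exercised in Phases 1 to 3 can be summarized as a small decision table. The sketch below models only what the test expects to observe after each Lambda invocation; it is an assumption for illustration, not the Lambda's actual reconciliation code, and `natState`, `nextAction`, and the action strings are hypothetical names.

```go
package main

// natState is a simplified view of the NAT instance as the test observes it.
type natState string

const (
	natAbsent  natState = "absent"
	natRunning natState = "running"
	natStopped natState = "stopped"
)

// nextAction returns the step the test expects one Lambda invocation to
// take, given the number of workloads, the NAT state, and whether an EIP
// is still allocated.
func nextAction(workloads int, nat natState, eipAllocated bool) string {
	switch {
	case workloads > 0 && nat == natAbsent:
		return "create-nat" // Phase 1
	case workloads > 0 && nat == natStopped:
		return "start-nat" // Phase 3
	case workloads == 0 && nat == natRunning:
		return "stop-nat" // Phase 2, first invocation
	case workloads == 0 && nat == natStopped && eipAllocated:
		return "release-eip" // Phase 2, second invocation
	default:
		return "none" // already in the desired state
	}
}
```

Read this way, Phase 2 needs two invocations because stopping the NAT and releasing its EIP are separate steps.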
Phase 4: Cleanup Action
- Invoke Lambda with {action: "cleanup"}
- Verify all NAT instances terminated and EIPs released
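As a rough illustration of the cleanup invocation, the payload can be built with Go's encoding/json. The `lambdaEvent` struct below is a hypothetical shape that assumes the event has a single `action` field, as the payload shown above suggests; the real event type may carry more fields.

```go
package main

import "encoding/json"

// lambdaEvent is an assumed, minimal shape for the Lambda's input event.
type lambdaEvent struct {
	Action string `json:"action"`
}

// cleanupPayload returns the JSON body for a cleanup invocation.
func cleanupPayload() ([]byte, error) {
	return json.Marshal(lambdaEvent{Action: "cleanup"})
}
```

The resulting bytes are what a test would pass as the invocation payload when calling the Lambda.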
Teardown
terraform destroy removes all Terraform-managed resources. The cleanup action (Phase 4) ensures Lambda-created NAT instances are terminated first, so ENI deletion succeeds.
CI
Integration tests run in GitHub Actions when the integration-test label is added to a PR. They use OIDC to assume an AWS role in a dedicated test account.
- Concurrency: one test at a time (cancel-in-progress: false)
- Timeout: 15 minutes
- Region: us-east-1
Orphan Detection
TestNoOrphanedResources runs after the main test and checks for leftover AWS resources with the nat-test prefix (subnets, ENIs, security groups, Lambda functions, IAM roles, EIPs). If any are found, it fails and lists them for manual cleanup.
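The core of such a check is a prefix filter over resource names grouped by type. The sketch below shows that filter in isolation, assuming the names have already been fetched from AWS; `findOrphans` and the "kind/name" output format are hypothetical and not taken from TestNoOrphanedResources itself.

```go
package main

import (
	"sort"
	"strings"
)

// findOrphans returns a sorted list of "kind/name" entries for every
// resource whose name starts with prefix. The map is keyed by resource
// kind (for example "subnet" or "eip").
func findOrphans(resources map[string][]string, prefix string) []string {
	var orphans []string
	for kind, names := range resources {
		for _, name := range names {
			if strings.HasPrefix(name, prefix) {
				orphans = append(orphans, kind+"/"+name)
			}
		}
	}
	// Sort so the failure message is stable across runs.
	sort.Strings(orphans)
	return orphans
}
```

A test would fail when the returned slice is non-empty and print its contents as the list to clean up by hand.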
Config Version Replacement
The Lambda tags NAT instances with a ConfigVersion hash (AMI + instance type + market type + volume size). When the config changes and a workload triggers reconciliation, the Lambda terminates the outdated NAT and creates a replacement. The integration test doesn't exercise this path directly, but it's covered by unit tests.
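To make the idea concrete, here is a minimal sketch of a config hash and the comparison that would trigger a replacement. The field names, the separator, the choice of SHA-256, and the 16-character truncation are all assumptions for illustration; the Lambda's real hashing scheme may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// natConfig holds the four inputs the ConfigVersion tag is derived from.
type natConfig struct {
	AMI          string
	InstanceType string
	MarketType   string
	VolumeSizeGB int
}

// configVersion returns a short, deterministic hash of the config.
func configVersion(c natConfig) string {
	raw := fmt.Sprintf("%s|%s|%s|%d", c.AMI, c.InstanceType, c.MarketType, c.VolumeSizeGB)
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])[:16]
}

// needsReplacement reports whether a NAT tagged with taggedVersion is
// outdated relative to the current config.
func needsReplacement(taggedVersion string, current natConfig) bool {
	return taggedVersion != configVersion(current)
}
```

Because the hash covers every field, changing any one of them (for example, the instance type) produces a different tag value and marks the existing NAT as outdated.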