Lambda Pricing: What You Actually Pay For
AWS Lambda pricing has two dimensions: the number of requests ($0.20 per 1M requests) and the duration (GB-seconds). Duration is computed as execution time × allocated memory. This means a 256MB function running for 1 second costs the same as a 512MB function running for 500ms — but the 512MB function may run in 400ms thanks to more CPU power, making it actually cheaper.
This counterintuitive relationship is the heart of Lambda cost optimization: more memory often costs less because Lambda allocates CPU proportionally to memory, and faster execution can more than compensate for the higher per-GB-second rate.
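The arithmetic is easy to sanity-check yourself. Below is a back-of-the-envelope model using the published us-east-1 x86_64 rates ($0.20 per 1M requests, $0.0000166667 per GB-second); `invocation_cost` is a name chosen here, not an AWS API:

```python
# Sketch of Lambda cost math (us-east-1 x86_64 public rates at time of writing).
REQUEST_PRICE = 0.20 / 1_000_000   # $ per request
GB_SECOND_PRICE = 0.0000166667     # $ per GB-second (x86_64)

def invocation_cost(memory_mb: int, duration_s: float) -> float:
    """Cost of a single invocation: duration charge + request charge."""
    gb_seconds = (memory_mb / 1024) * duration_s
    return gb_seconds * GB_SECOND_PRICE + REQUEST_PRICE

# 256MB for 1s and 512MB for 500ms bill the same GB-seconds...
cost_256 = invocation_cost(256, 1.0)
cost_512_same_work = invocation_cost(512, 0.5)
# ...but if the doubled CPU finishes in 400ms, 512MB is strictly cheaper:
cost_512_faster = invocation_cost(512, 0.4)
```

Running this shows the 512MB-at-400ms configuration undercutting the 256MB-at-1s one, which is the whole premise of memory tuning.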
AWS Lambda Power Tuning
# Install the AWS Lambda Power Tuning tool
# This runs your Lambda 50 times at each memory setting and finds the optimum
# Option 1: Deploy via SAR (Serverless Application Repository)
aws serverlessrepo create-cloud-formation-change-set \
  --application-id arn:aws:serverlessrepo:us-east-1:451282441545:applications/aws-lambda-power-tuning \
  --stack-name lambda-power-tuning \
  --capabilities CAPABILITY_IAM \
  --parameter-overrides '[{"Name":"lambdaResource","Value":"*"}]'
# Option 2: Deploy the open-source project directly
# (github.com/alexcasalboni/aws-lambda-power-tuning) with SAM, CDK, or Terraform
# Run tuning on a Lambda function
aws stepfunctions start-execution --state-machine-arn arn:aws:states:us-east-1:123456789:stateMachine:powerTuningStateMachine --input '{
"lambdaARN": "arn:aws:lambda:us-east-1:123456789:function:my-api-handler",
"powerValues": [128, 256, 512, 1024, 1769, 3008],
"num": 50,
"payload": {"path": "/api/users", "method": "GET"},
"parallelInvocation": true,
"strategy": "cost"
}'
Sample Power Tuning Results:
Memory (MB) | Avg Duration (ms) | Cost per 1M invocations
------------|-------------------|-----------------------
128         | 3,200             | $6.87
256         | 1,600             | $6.87
512         | 890               | $7.62
1024        | 490               | $8.37
1769        | 320               | $9.41
3008        | 240               | $11.95
(Cost = $0.20 request charge + GB-seconds at $0.0000166667.)
Optimal for cost: 256MB ($6.87/1M — ties 128MB on cost, but runs twice as fast)
Optimal for performance: 3008MB (but 74% more expensive)
Power tuning result: 256MB, saving $1.50/1M vs the previous 1024MB setting
At 100M invocations/month:
Previous (1024MB): $837/month
Optimized (256MB): $687/month
Monthly saving: $150 (18%)
Graviton (arm64) Lambda Functions
# Arm64 (AWS Graviton) Lambda functions:
# - 20% cheaper than x86_64 per GB-second
# - Often 10-20% faster for compute-bound workloads
# - AWS quotes up to 34% better price/performance combined
# CloudFormation / SAM
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: my-api-handler
    Runtime: python3.12
    Architectures:
      - arm64  # run on Graviton
    MemorySize: 256
    Timeout: 30
    Handler: app.handler
---
# Terraform
resource "aws_lambda_function" "api" {
  function_name = "api-handler"
  runtime       = "python3.12"
  architectures = ["arm64"]  # Graviton
  memory_size   = 256
  timeout       = 30
  # ...
}
# Migrate an existing function to arm64 — the architecture is set together
# with the (rebuilt) code package:
aws lambda update-function-code --function-name my-function --architectures arm64 --zip-file fileb://function-arm64.zip
# Must rebuild any native extensions for arm64
# Most pure Python/Node/Java/Ruby functions work without changes
# Cost comparison for 100M invocations/month at 256MB, 100ms avg:
# x86_64: 256MB/1024 × 0.1s × 100M × $0.0000166667 = $41.67
# arm64: 256MB/1024 × 0.1s × 100M × $0.0000133334 = $33.33
# Saving: $8.34/month per function (20%)
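To find migration candidates at scale, you can scan a region for functions still on x86_64. A sketch using boto3's real `list_functions` paginator (the `Architectures` field defaults to `['x86_64']` when absent; the function names here are mine):

```python
def arm64_candidates(functions: list[dict]) -> list[str]:
    """Return names of functions still on x86_64.

    Lambda omits the Architectures field for pre-arm64 functions,
    which implies the x86_64 default."""
    return [
        f["FunctionName"]
        for f in functions
        if f.get("Architectures", ["x86_64"]) == ["x86_64"]
    ]

def scan_region(region: str = "us-east-1") -> list[str]:
    # Imported lazily so the pure helper above is usable without the SDK.
    import boto3

    client = boto3.client("lambda", region_name=region)
    pages = client.get_paginator("list_functions").paginate()
    funcs = [f for page in pages for f in page["Functions"]]
    return arm64_candidates(funcs)
```

`scan_region` needs AWS credentials; `arm64_candidates` is a pure function you can unit-test offline.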
Batching SQS Messages to Lambda
# Without batching: 1 Lambda invocation per SQS message
# With batching: process up to 10,000 messages per invocation
resource "aws_lambda_event_source_mapping" "sqs_trigger" {
  event_source_arn = aws_sqs_queue.orders.arn
  function_name    = aws_lambda_function.order_processor.arn
  enabled          = true

  batch_size                         = 10000  # Max messages per batch (needs a batching window)
  maximum_batching_window_in_seconds = 30     # Wait up to 30s to fill a batch

  # On partial failures: only re-process failed messages
  function_response_types = ["ReportBatchItemFailures"]

  scaling_config {
    maximum_concurrency = 50  # Cap concurrent Lambda executions
  }
}
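`ReportBatchItemFailures` only works if the handler returns the documented partial-batch response shape — `{"batchItemFailures": [{"itemIdentifier": <messageId>}, ...]}` — so SQS redelivers only the failed messages. A sketch (`process_order` stands in for your business logic):

```python
import json

def handler(event, context):
    """SQS batch handler reporting partial failures, so only failed
    messages are retried instead of the whole batch."""
    failures = []
    for record in event["Records"]:
        try:
            order = json.loads(record["body"])
            process_order(order)  # hypothetical business logic
        except Exception:
            # Report this message ID; SQS redelivers only these records.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

def process_order(order: dict) -> None:
    # Placeholder validation standing in for real processing.
    if "order_id" not in order:
        raise ValueError("malformed order")
```

Returning an empty `batchItemFailures` list marks the whole batch as processed; raising out of the handler would instead retry every message in the batch.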
# Cost impact of batching:
# Without batching: 10M messages/day = 10M Lambda invocations
# 10M × $0.0000002 = $2.00/day in request charges
# + Duration costs
#
# With batching (batch size 1000):
# 10M / 1000 = 10,000 Lambda invocations
# 10,000 × $0.0000002 = $0.002/day (1000x fewer requests!)
# + Duration costs (slightly higher per invocation, but far fewer)
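The request-charge arithmetic above fits in a tiny helper (the function name is mine; duration charges are modeled separately):

```python
REQUEST_PRICE = 0.0000002  # $0.20 per 1M requests

def daily_request_cost(messages_per_day: int, batch_size: int = 1) -> float:
    """Request charges only — duration charges come on top."""
    invocations = -(-messages_per_day // batch_size)  # ceiling division
    return invocations * REQUEST_PRICE
```

With 10M messages/day this reproduces the $2.00/day unbatched figure, and $0.002/day at batch size 1000.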
Avoiding Lambda Anti-Patterns
# ANTI-PATTERN: Lambda calling Lambda in a loop
# Each inner Lambda invocation = separate cost + latency
import json
import boto3

lambda_client = boto3.client("lambda")

def handler(event, context):
    user_ids = event['user_ids']  # Could be 10,000 users
    # WRONG: 10,000 synchronous Lambda invocations
    for user_id in user_ids:
        lambda_client.invoke(
            FunctionName='process-user',
            Payload=json.dumps({'user_id': user_id})
        )
# BETTER PATTERN 1: Process in the same Lambda invocation
def handler(event, context):
    user_ids = event['user_ids']
    results = []
    for user_id in user_ids:
        results.append(process_user(user_id))  # Direct function call
    return results
# BETTER PATTERN 2: Fan-out via SQS + batching
import json
import os
import boto3

sqs = boto3.client("sqs")

def handler(event, context):
    user_ids = event['user_ids']
    # Send all to SQS in batches of 10 (SQS SendMessageBatch limit)
    for i in range(0, len(user_ids), 10):
        batch = user_ids[i:i+10]
        sqs.send_message_batch(
            QueueUrl=os.environ['QUEUE_URL'],
            Entries=[
                {'Id': str(j), 'MessageBody': json.dumps({'user_id': uid})}
                for j, uid in enumerate(batch)
            ]
        )

# Worker Lambda processes with batch_size=1000 from SQS
Step Functions vs Lambda for Orchestration
Orchestration Cost Comparison (10M workflow executions/month):
OPTION 1: Lambda polling loop
A Lambda polls every 30 seconds until each job finishes
= 2 invocations/minute per in-flight job, around the clock, across 10M jobs
= Astronomically expensive (don't do this)
OPTION 2: Step Functions Standard Workflow
Cost: $0.025 per 1,000 state transitions
At 10 states per workflow × 10M executions:
= 100M transitions × $0.025/1000 = $2,500/month
OPTION 3: Step Functions Express Workflow
Cost: $1 per 1M workflow executions + duration
At 10M executions, 30s avg, 64MB:
= $10 execution + $0.00001667 × 30s × 64/1024 × 10M = $10 + $312 = $322/month
OPTION 4: EventBridge + SQS (fully async)
EventBridge: $1/1M events × 10M = $10
SQS: $0.40/1M messages × 10M = $4
Lambda: minimal invocations with batching
Total: ~$20/month
Rule: use Express Workflows for high-volume short workflows,
Standard Workflows for long-running complex orchestration,
EventBridge/SQS for simple fan-out patterns.
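The cost models behind Options 2 and 3 can be sketched as two small functions (names are mine; rates are the published Step Functions prices quoted above, with Express billing a 64MB memory floor):

```python
def standard_cost(executions: int, states_per_exec: int) -> float:
    """Step Functions Standard: $0.025 per 1,000 state transitions."""
    return executions * states_per_exec * 0.025 / 1000

def express_cost(executions: int, avg_seconds: float, memory_mb: int) -> float:
    """Step Functions Express: $1 per 1M requests + $0.00001667 per GB-s,
    billed at a minimum of 64MB."""
    requests = executions / 1_000_000 * 1.00
    billed_gb = max(memory_mb, 64) / 1024
    duration = executions * avg_seconds * billed_gb * 0.00001667
    return requests + duration
```

Plugging in 10M executions reproduces the comparison: $2,500/month for Standard at 10 states, roughly $322/month for Express at 30s/64MB — an order-of-magnitude gap for short, high-volume workflows.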
Lambda Storage and Ephemeral Storage
# Default ephemeral storage (/tmp): 512MB free
# Additional storage: $0.0000000309 per GB-second above 512MB
# For ML inference or video processing, you may need several GB
resource "aws_lambda_function" "video_processor" {
  function_name = "video-processor"
  runtime       = "python3.12"
  architectures = ["arm64"]
  memory_size   = 3008  # 3GB (Lambda supports up to 10,240MB)
  timeout       = 900   # 15 minutes max

  ephemeral_storage {
    size = 2048  # 2GB for video files (512MB free + 1.5GB charged)
  }

  # Cost for 1.5GB extra storage over 15min:
  # 1.5GB × 900s × $0.0000000309 = $0.0000417 per invocation
  # At 100K video jobs/month: $4.17/month
  # Compare to: S3 download + EFS mount (much more complex)
}
# Cost optimization: if files < 512MB, use default storage (free)
# Only provision extra storage for actual large-file workloads
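The per-invocation storage charge is a one-liner to model (function name is mine; the free tier is the first 512MB):

```python
STORAGE_GB_SECOND = 0.0000000309  # $ per GB-second above the free 512MB

def ephemeral_storage_cost(size_mb: int, duration_s: float) -> float:
    """Charge for /tmp above the 512MB free tier, for one invocation."""
    billable_gb = max(size_mb - 512, 0) / 1024
    return billable_gb * duration_s * STORAGE_GB_SECOND
```

At the default 512MB this returns zero, and at 2048MB for a full 900s run it matches the ~$0.0000417 figure above.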
Provisioned Concurrency: When It Helps and When It Doesn't
Provisioned Concurrency Cost vs Cold Start Impact:
Without Provisioned Concurrency:
Cold start rate: 1-5% of requests (new container spin-up)
Cold start duration: 500ms-3s (Java/.NET: higher; Python/Node: lower)
User impact: tail latency spikes
With Provisioned Concurrency (10 warm instances, x86_64):
Extra cost: 10 × 256MB × 24hr × 30days × $0.0000041667/GB-s
= 10 × 0.25GB × 86,400s × 30 × $0.0000041667 = $27.00/month
Plus: Application Auto Scaling can adjust provisioned concurrency
on a schedule (cheaper than 24/7 warm instances)
When to use Provisioned Concurrency:
✅ Customer-facing API with p99 SLA (cold starts unacceptable)
✅ Java/.NET Lambda with 2-5s cold starts
✅ ML inference functions (model loading is slow)
❌ Background jobs (cold start latency doesn't matter)
❌ High-volume functions (rarely cold-started anyway)
❌ Functions invoked once per day (1 warm instance = small benefit)
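The warm-instance cost above generalizes to a short helper, which also makes the scheduled-scaling argument concrete (name is mine; rate is the published us-east-1 x86_64 provisioned-concurrency price):

```python
PC_GB_SECOND = 0.0000041667  # provisioned concurrency, x86_64, us-east-1

def provisioned_monthly_cost(instances: int, memory_mb: int,
                             hours_per_day: float = 24, days: int = 30) -> float:
    """Monthly charge for keeping `instances` warm `hours_per_day` a day."""
    gb = memory_mb / 1024
    return instances * gb * hours_per_day * 3600 * days * PC_GB_SECOND
```

Ten 256MB instances warm 24/7 come to about $27/month; scheduling them for a 12-hour business day halves that, which is why Application Auto Scaling schedules pay off.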
Conclusion
Lambda cost optimization has multiple levers, and the best gains come from stacking them. Start with arm64/Graviton (20% off immediately), run Lambda Power Tuning to find the optimal memory setting, implement SQS batching for queue-driven workloads, and avoid Lambda-calling-Lambda loops. For high-volume orchestration, prefer Step Functions Express Workflows or EventBridge/SQS over polling loops.
A fully optimized Lambda stack — ARM64, right-sized memory, batched SQS events, and scheduled provisioned concurrency — typically costs 40-60% less than a default x86 Lambda with manual memory settings and per-message invocations.
Alex Thompson
CEO & Cloud Architecture Expert at ZeonEdge with 15+ years building enterprise infrastructure.