AWS Lambda

AWS Lambda is the serverless compute primitive on AWS. You ship a function, packaged as a .zip archive (optionally extended with layers) or a container image of up to 10 GB, and Lambda runs it on demand in response to events from 200+ AWS services and HTTP triggers, billing per millisecond of execution. SnapStart, ARM/Graviton runtimes, response streaming, function URLs, and Lambda@Edge extend the original event-driven model into low-latency APIs and globally distributed compute.


Key Features:

- No server management: AWS provisions, patches, and scales the fleet.
- Automatic scaling from zero to thousands of concurrent executions.
- Per-millisecond billing; no charge while idle.
- Managed runtimes (Python, Node.js, Java, .NET, Ruby) plus custom runtimes (e.g., Go or Rust on provided.al2023) and container images.
- x86_64 and arm64 (Graviton) architectures.
- Native event integrations: S3, SQS, SNS, Kinesis, DynamoDB Streams, EventBridge, API Gateway, and more.


Common Use Cases:

- HTTP/REST APIs behind API Gateway, ALB, or function URLs.
- Stream and queue processing: SQS, Kinesis, DynamoDB Streams.
- S3 event handling: thumbnailing, validation, ETL on upload.
- Scheduled jobs via EventBridge rules (cron without servers).
- Glue code and infrastructure automation between AWS services.


Service Limits & Quotas:

- Timeout: 900 seconds (15 minutes) maximum.
- Memory: 128 MB to 10,240 MB (CPU scales with memory, up to 6 vCPUs).
- Deployment package: 50 MB zipped (direct upload), 250 MB unzipped including layers; container images up to 10 GB.
- Invocation payload: 6 MB synchronous, 256 KB asynchronous.
- Ephemeral /tmp storage: 512 MB default, configurable up to 10 GB.
- Environment variables: 4 KB total.
- Account concurrency: 1,000 by default (soft limit, raisable via support).


Pricing Model:

- Per request: ~$0.20 per million invocations.
- Per duration: billed per millisecond as GB-seconds (~$0.0000167 per GB-second on x86; arm64 is roughly 20% cheaper).
- Free tier: 1M requests and 400,000 GB-seconds per month.
- Billed separately: provisioned concurrency, ephemeral storage above 512 MB, data transfer.

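A quick worked example of the billing arithmetic, assuming the approximate public us-east-1 x86 rates above (verify current pricing before relying on the numbers):

```python
# Rough monthly cost estimate for a Lambda function. The rates are
# assumptions (us-east-1, x86) and ignore the free tier.
PRICE_PER_REQUEST = 0.20 / 1_000_000      # $ per invocation
PRICE_PER_GB_SECOND = 0.0000166667        # $ per GB-second of duration

def monthly_cost(invocations: int, avg_ms: float, memory_mb: int) -> float:
    """Duration is billed per millisecond, scaled by configured memory."""
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# 10M requests/month, 120 ms average duration at 512 MB:
cost = monthly_cost(10_000_000, 120, 512)
print(f"${cost:.2f}")   # -> $12.00
```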

Code Example:


import json, os, boto3

# Module scope runs once per execution environment; warm invocations
# reuse this table resource instead of re-creating it.
dynamo = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])

def lambda_handler(event, _ctx):
    """API Gateway proxy handler — store an event and return its id."""
    body = json.loads(event.get("body") or "{}")
    if "user_id" not in body or "ts" not in body:
        return {
            "statusCode": 400,
            "headers":    {"content-type": "application/json"},
            "body":       json.dumps({"error": "user_id and ts are required"}),
        }
    item = {
        "pk":      body["user_id"],
        "ts":      body["ts"],
        "payload": body.get("payload", {}),
    }
    dynamo.put_item(Item=item)
    return {
        "statusCode": 201,
        "headers":    {"content-type": "application/json"},
        "body":       json.dumps({"stored": True, "pk": item["pk"]}),
    }
  

Deploy and grant DynamoDB write access via AWS CLI:


zip -r function.zip handler.py

aws iam put-role-policy \
  --role-name lambda-store-event \
  --policy-name dynamo-write \
  --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":"dynamodb:PutItem","Resource":"arn:aws:dynamodb:us-east-1:123456789012:table/events"}]}'

aws lambda create-function \
  --function-name store-event \
  --runtime python3.12 \
  --architectures arm64 \
  --memory-size 512 \
  --timeout 10 \
  --handler handler.lambda_handler \
  --role arn:aws:iam::123456789012:role/lambda-store-event \
  --zip-file fileb://function.zip \
  --environment 'Variables={TABLE_NAME=events}'
  


Common Interview Questions:

What causes Lambda cold starts and how do you mitigate them?

A cold start happens when Lambda spins up a new execution environment — pulling code, initializing the runtime, and running module-level init. Mitigations: provisioned concurrency, SnapStart (Java/Python/.NET), smaller deployment packages, ARM/Graviton, lazy imports, and avoiding heavy init.

How does Lambda concurrency work?

Each concurrent invocation needs its own execution environment. Account-wide concurrency starts at 1,000 (soft limit). Reserved concurrency caps a function's max concurrency (and reserves it). Provisioned concurrency keeps N pre-warmed environments at additional cost.
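Concurrency needs can be estimated with Little's law, concurrency ≈ arrival rate × average duration. A sketch with illustrative numbers:

```python
import math

def required_concurrency(req_per_s: float, avg_duration_ms: float) -> int:
    """Little's law: environments kept busy = rate (req/s) * duration (s)."""
    return math.ceil(req_per_s * avg_duration_ms / 1000)

# 500 req/s at 120 ms each keeps ~60 environments busy -- comfortably
# under the default 1,000 limit. The same traffic against a slow
# 2-second dependency would consume the entire default quota.
print(required_concurrency(500, 120))    # -> 60
print(required_concurrency(500, 2000))   # -> 1000
```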

When would you choose Step Functions over chaining Lambdas directly?

Direct chaining works for simple flows but lacks built-in retries, error handling, observability, long-running waits, and parallelism. Step Functions adds explicit state, durable execution (up to a year), and visual debugging.
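A minimal Amazon States Language sketch of the retry and error handling that direct chaining would have to hand-roll (the function ARN is a placeholder):

```python
import json

# Minimal ASL state machine: retry with backoff, then a catch-all failure
# state. The Lambda ARN is a placeholder, not a real resource.
state_machine = {
    "StartAt": "StoreEvent",
    "States": {
        "StoreEvent": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:store-event",
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
            "End": True,
        },
        "HandleFailure": {"Type": "Fail", "Cause": "store-event exhausted retries"},
    },
}
print(json.dumps(state_machine, indent=2))
```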

How does Lambda integrate with SQS, Kinesis, and DynamoDB Streams?

Via Event Source Mappings (ESM). Lambda polls the source on your behalf and invokes the function with batches. Tunables include batch size, batch window, parallelization factor (Kinesis/DynamoDB), and on-failure destinations.
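One important tunable for SQS is partial batch responses (the ReportBatchItemFailures setting on the mapping), so one bad message doesn't force a retry of the whole batch. A sketch with a fake event and hypothetical business logic:

```python
import json

def lambda_handler(event, _ctx):
    """SQS batch handler using partial batch responses (requires
    ReportBatchItemFailures on the event source mapping)."""
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only the failed messages return to the queue for retry;
            # the rest of the batch is deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

def process(msg):
    # Hypothetical business logic: reject messages without a user_id.
    if "user_id" not in msg:
        raise ValueError("missing user_id")

# Local smoke test with a fake two-record SQS event:
event = {"Records": [
    {"messageId": "m1", "body": json.dumps({"user_id": "u1"})},
    {"messageId": "m2", "body": json.dumps({})},
]}
print(lambda_handler(event, None))   # -> {'batchItemFailures': [{'itemIdentifier': 'm2'}]}
```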

How does memory affect Lambda performance and cost?

CPU and network bandwidth scale linearly with memory. A function may run faster (and cheaper overall) at 1024 MB than at 256 MB if it is CPU-bound. Use Lambda Power Tuning to find the right setting empirically.
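The "faster and cheaper" effect falls straight out of the billing formula. A sketch with hypothetical durations for a CPU-bound function (the rate is an assumed us-east-1 x86 figure):

```python
import math

PRICE_PER_GB_SECOND = 0.0000166667   # assumed us-east-1 x86 rate

def invocation_cost(duration_ms: float, memory_mb: int) -> float:
    return (duration_ms / 1000) * (memory_mb / 1024) * PRICE_PER_GB_SECOND

# Hypothetical CPU-bound function: 4x the memory buys ~4x the CPU, so
# duration drops roughly proportionally -- same cost, 4x lower latency.
slow = invocation_cost(1600, 256)
fast = invocation_cost(400, 1024)
print(math.isclose(slow, fast))   # -> True
```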

What is SnapStart and what are its limitations?

SnapStart takes a Firecracker microVM snapshot of the execution environment after init and restores it on invocation, eliminating most cold-start cost. Available for Java, Python, and .NET. Limitations: it applies only to published versions and aliases, runs on x86_64 only, and is incompatible with provisioned concurrency, Amazon EFS, and ephemeral storage above 512 MB; network connections and anything seeded with randomness during init must be re-established after restore (runtime hooks support this).
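A sketch of the restore-hook pattern for Python. In the Lambda runtime the AWS-provided snapshot_restore_py module registers the hook; locally this sketch falls back to a no-op stub and simulates a restore by hand:

```python
# Re-initialize state that is stale after a SnapStart restore.
try:
    from snapshot_restore_py import register_after_restore
except ImportError:
    def register_after_restore(func):   # local stub for running outside Lambda
        return func

CONNECTION = {"id": 0}

def _connect():
    CONNECTION["id"] += 1   # stand-in for opening a real DB/TLS connection

_connect()   # runs during init, i.e. before the snapshot is taken

@register_after_restore
def refresh_connection():
    # Sockets frozen into the snapshot are dead after restore; reopen them.
    _connect()

refresh_connection()        # simulate the restore Lambda would trigger
print(CONNECTION["id"])     # -> 2
```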

How do you keep VPC-attached Lambdas fast?

Lambda uses Hyperplane ENIs that are shared across invocations, so VPC cold-start penalty is small. Use VPC endpoints for AWS service calls and put functions in private subnets with NAT only when necessary.


AWS Lambda is the default serverless compute on AWS — pay-per-millisecond, scales to zero, and integrates with virtually every AWS service. With SnapStart, ARM, and response streaming, modern Lambda is competitive even for low-latency and large-payload workloads that previously needed always-on services.