AWS App Runner in 2026: Deploy Web Apps Without Managing Servers

In This Article

  1. What App Runner Is and the Problem It Solves
  2. App Runner vs ECS vs Lambda vs Elastic Beanstalk vs Render
  3. How to Deploy a Container from ECR or GitHub
  4. Auto Scaling and Concurrency Settings
  5. Custom Domains and HTTPS
  6. Connecting to RDS, DynamoDB, and Secrets Manager
  7. App Runner for AI APIs: Deploying FastAPI with an ML Model
  8. Pricing: When App Runner Saves Money vs When It Does Not
  9. Observability: CloudWatch Logs and Metrics
  10. Real-World Use Case: Node.js REST API in Production
  11. Frequently Asked Questions

Key Takeaways

Most developers building web apps in 2026 do not want to be systems administrators. They want to write code, push it, and have something running in production. That is precisely the gap AWS App Runner was built to fill — and in the years since it reached general availability, it has become one of the most practical container deployment tools on AWS for teams that need production-grade infrastructure without a dedicated DevOps engineer.

This guide covers everything you need to actually use App Runner in production: how it compares to the alternatives, step-by-step deployment from both ECR and GitHub, scaling configuration, database connections, AI API hosting, pricing math, and observability. By the end you will know exactly when App Runner is the right tool and when to reach for something else.

What App Runner Is and the Problem It Solves

AWS App Runner is a fully managed container deployment service: you give it a Docker image or a GitHub repository, and within about two minutes you have a live HTTPS endpoint with load balancing, health checks, and auto scaling. You configure CPU and memory, set minimum and maximum instance counts, and optionally wire up a custom domain; AWS handles everything else, with zero configuration of load balancers, task definitions, security groups, or certificates.

The problem it solves is specific: the gap between "I have a Dockerized app" and "it is running reliably in production" is larger than it looks. Even with managed services like ECS on Fargate, you need to configure task definitions, service auto scaling policies, Application Load Balancers, target groups, security groups, IAM task roles, and CloudWatch alarms. That is eight to twelve distinct AWS resources and a non-trivial amount of configuration for what is, at its core, just "run this container and put it on the internet."

~15 — AWS resources required to manually deploy a container on ECS with an ALB, auto scaling, and HTTPS. App Runner reduces that to a single service configuration, with no ALB, target group, or task definition required.

App Runner is not a replacement for ECS or Kubernetes in all cases. It trades configurability for simplicity. You cannot run scheduled tasks, access GPU instances, use custom VPC routing, or configure layer-4 protocols. But for the most common use case — a stateless HTTP service that needs to handle variable traffic — App Runner removes a significant amount of friction.

~2 min — Average time from ECR push to live HTTPS endpoint
0 — Load balancers, task definitions, or target groups to configure
$0 — Additional cost for TLS, health checks, and automatic deployments

What App Runner Is Not

App Runner is not a general-purpose compute platform. It does not run background workers, cron jobs, or long-running batch processes well. It is stateless by design — each request should be handled independently, and you should not rely on local file system state persisting between requests. If your app writes to local disk and reads it back, App Runner will disappoint you. Put that data in S3 or RDS instead.

It also does not support WebSockets natively at the time of writing. If you need long-lived connections — real-time chat, live data streaming — you need a different solution. For everything else that is an ordinary request-response HTTP service, App Runner is a serious option worth evaluating.

App Runner vs ECS vs Lambda vs Elastic Beanstalk vs Render

App Runner is the right choice over ECS when you do not need GPU, custom VPC routing, or service mesh complexity; over Lambda when your service needs persistent in-memory state (like a loaded ML model) or runs longer than 15 minutes; and over Elastic Beanstalk when you want a modern container workflow without SSH-ing into EC2 instances. Here is an honest comparison of the options developers commonly consider in 2026.

Service | Setup Complexity | Auto Scale to Zero | Container Support | Custom Domain | Cold Start | Best For
App Runner | Very Low | Partial (pause) | Native | Yes | ~5s when paused | APIs, microservices, internal tools
ECS on Fargate | High | No | Native | Via ALB | None | Complex microservices, GPU, batch
AWS Lambda | Medium | Yes | Via image (15 min) | Via API GW | 100ms–3s | Event-driven, low-volume APIs
Elastic Beanstalk | Medium | No | Yes | Yes | None | Legacy apps, EC2-level control
Render | Very Low | Yes (free tier) | Native | Yes | ~30s on free tier | Side projects, prototypes, startups

When to Pick App Runner Over the Alternatives

How to Deploy a Container from ECR or GitHub

App Runner supports two deployment paths: ECR (you build and push the image, App Runner deploys it — with optional automatic re-deploy on every new push) and GitHub (App Runner builds from your Dockerfile directly, triggered on every branch push). The ECR path is recommended for production because your CI pipeline controls image quality before deployment.

Deploying from Amazon ECR
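Everything the console wizard configures can also be expressed as a JSON input to the `create-service` API, for example via `aws apprunner create-service --cli-input-json file://service.json`. Below is a hedged sketch of such a file; the service name, image URI, and access role ARN are placeholders, and field names should be checked against the current App Runner API reference before use:

```json
{
  "ServiceName": "my-api",
  "SourceConfiguration": {
    "ImageRepository": {
      "ImageIdentifier": "123456789.dkr.ecr.us-east-1.amazonaws.com/my-api:latest",
      "ImageRepositoryType": "ECR",
      "ImageConfiguration": { "Port": "8080" }
    },
    "AutoDeploymentsEnabled": true,
    "AuthenticationConfiguration": {
      "AccessRoleArn": "arn:aws:iam::123456789:role/AppRunnerECRAccessRole"
    }
  },
  "InstanceConfiguration": {
    "Cpu": "1 vCPU",
    "Memory": "2 GB"
  }
}
```

With `AutoDeploymentsEnabled` set to true, pushing a new image to the configured tag triggers a redeploy without any further API calls.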

Deploying from GitHub

The GitHub path is useful when you want App Runner to own the build step. Connect your GitHub account, select the repository and branch, and App Runner looks for a Dockerfile at the root. You can also provide an apprunner.yaml configuration file for more control over build commands, runtime environment, and start commands.

apprunner.yaml — GitHub source configuration
version: 1.0
runtime: nodejs18
build:
  commands:
    build:
      - npm ci
      - npm run build
run:
  runtime-version: 18
  command: node dist/server.js
  network:
    port: 3000
    env: PORT
  env:
    - name: NODE_ENV
      value: production

With automatic deployments enabled on the GitHub source, every push to the configured branch triggers a new build and deploy. App Runner runs the old version until the new one passes its health check, then cuts traffic over — zero-downtime deploys with no configuration required.

Auto Scaling and Concurrency Settings

App Runner scales on concurrent requests per instance — set concurrency to 100 for I/O-bound HTTP APIs, 10–25 for CPU-bound inference. Set min instances to 1 (adds ~$50/month) to eliminate cold starts on production services, or min 0 to scale to zero for internal tools and dev environments where 5-second cold starts are acceptable.

Auto scaling configuration — key settings
Min instances:        1    # Keep at least 1 instance warm (no cold starts)
Max instances:        10   # Hard ceiling on scale-out
Concurrency per inst: 100  # Requests per instance before scaling triggers

# With 1 min instance and 100 concurrency:
#   - 1–100 concurrent requests    → 1 instance
#   - 101–200 concurrent requests  → 2 instances
#   - 901–1000 concurrent requests → 10 instances

Min Instances: The Decision That Determines Your Bill

Setting min instances to 0 means App Runner pauses the service when there are no active requests. This eliminates idle compute costs but introduces a ~5 second cold start when the first request arrives after a quiet period. For internal tools or dev environments, this is acceptable. For production APIs with SLA requirements, set min to 1.

Setting min instances to 1 keeps one instance always running. At 1 vCPU / 2 GB, this costs roughly $50–55/month at the current rate. That is the floor for a production App Runner service with no cold starts.

The concurrency setting is where most developers underestimate App Runner's capabilities. A well-written Node.js or Python async service can handle hundreds of concurrent requests on a single vCPU using event-loop concurrency — I/O-bound operations (database queries, external API calls) do not block other requests. Setting concurrency to 100 per instance is reasonable for most HTTP APIs. CPU-bound workloads (ML inference, image processing) should use lower concurrency (10–25) and more instances.
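As a back-of-envelope model of the scale-out rule described above (this mirrors the request-count arithmetic from the configuration example, not AWS's exact scaling algorithm, and the function name is illustrative):

```python
import math

def instances_needed(concurrent_requests: int, concurrency: int = 100,
                     min_instances: int = 1, max_instances: int = 10) -> int:
    """Approximate how many instances App Runner runs for a given load:
    one instance per `concurrency` concurrent requests, clamped between
    the configured min and max instance counts."""
    needed = math.ceil(concurrent_requests / concurrency) if concurrent_requests else 0
    return max(min_instances, min(needed, max_instances))

print(instances_needed(80))    # 1
print(instances_needed(250))   # 3
print(instances_needed(5000))  # 10 (capped at max_instances)
```

Plugging in your own expected peak concurrency is a quick way to sanity-check whether the max-instances ceiling is high enough before traffic proves it is not.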

Custom Domains and HTTPS

Adding a custom domain to App Runner takes under five minutes: navigate to Custom Domains in the console, enter your domain, add the generated CNAME and certificate validation records to your DNS provider, and your custom domain serves HTTPS automatically once DNS propagates — no manual ACM certificate configuration required.

  1. In the App Runner console, navigate to your service and click "Custom domains."
  2. Enter your domain (e.g., api.yourapp.com) and click "Add domain." App Runner generates the CNAME and certificate validation records you need to add to your DNS.
  3. Add those records in your DNS provider (Route 53, Cloudflare, etc.). Once DNS propagates and the certificate validates — usually 5–15 minutes — your custom domain serves traffic over HTTPS automatically.

Cloudflare Users: Set Proxy to DNS-Only

If your domain DNS is managed by Cloudflare, set the CNAME record to "DNS only" (gray cloud, not orange cloud) when pointing to App Runner. Proxying through Cloudflare while App Runner also handles TLS can cause certificate validation failures. Once the App Runner certificate is issued, you can re-enable Cloudflare's proxy if desired — but DNS-only is simpler and still gets you App Runner's built-in TLS.

Connecting to RDS, DynamoDB, and Secrets Manager

Connecting to RDS

RDS instances live inside a VPC. By default, App Runner runs outside your VPC. To give an App Runner service access to RDS, you configure a VPC Connector — essentially a set of subnets and security groups that App Runner uses when making outbound connections.

  1. In the App Runner console under "Networking," select "Custom VPC" and choose or create a VPC Connector. Select the private subnets where your RDS instance is accessible.
  2. Attach a security group to the VPC Connector; this is the group that App Runner's outbound connections will originate from.
  3. Update your RDS security group to allow inbound traffic on port 5432 (PostgreSQL) or 3306 (MySQL) from the VPC Connector's security group.

Your database connection string then uses the private RDS endpoint (e.g., mydb.cluster-xyz.us-east-1.rds.amazonaws.com). Never hardcode credentials — use Secrets Manager (covered below).
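One practical corollary: validate the injected connection string at startup so a mis-wired secret fails fast instead of erroring on the first query. A stdlib-only sketch, where `load_db_config` is a hypothetical helper and the URL is the example endpoint from above:

```python
import os
from urllib.parse import urlparse

def load_db_config() -> dict:
    """Read and sanity-check DATABASE_URL at startup so a bad deploy
    fails early rather than on the first user request."""
    url = os.environ.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set (check Secrets Manager wiring)")
    parsed = urlparse(url)
    if parsed.scheme not in ("postgres", "postgresql", "mysql"):
        raise RuntimeError(f"Unexpected DATABASE_URL scheme: {parsed.scheme!r}")
    return {"host": parsed.hostname,
            "port": parsed.port or 5432,
            "database": parsed.path.lstrip("/")}

# Illustrative value, shaped like the private RDS endpoint above:
os.environ["DATABASE_URL"] = \
    "postgresql://app:secret@mydb.cluster-xyz.us-east-1.rds.amazonaws.com:5432/prod"
print(load_db_config()["host"])  # mydb.cluster-xyz.us-east-1.rds.amazonaws.com
```

Combined with a health check that exercises the connection, this turns a misconfigured secret into a failed deploy rather than a production outage.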

Connecting to DynamoDB

DynamoDB is a public AWS service — no VPC configuration needed. Grant access by attaching an IAM policy to your App Runner service's instance role:

IAM policy — DynamoDB access for App Runner instance role
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem",
        "dynamodb:Query",
        "dynamodb:Scan"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789:table/your-table-name"
    }
  ]
}

The AWS SDK inside your container automatically picks up credentials from the instance metadata service — no access keys in environment variables, no credential files. This is the correct pattern for all AWS service access from App Runner.

Secrets Manager Integration

Store database passwords, API keys, and other secrets in AWS Secrets Manager. In your App Runner service configuration, you can reference a secret as an environment variable — App Runner fetches the secret value at startup and injects it, so your code reads it like any other environment variable and the secret never appears in your container definition.

App Runner environment variable backed by Secrets Manager
# In App Runner service configuration (console or IaC):
Environment variable name: DATABASE_URL
Value source: Secret
Secret ARN: arn:aws:secretsmanager:us-east-1:123456789:secret:prod/db-url-AbCdEf

# In your Node.js app — no changes needed:
const db = new Pool({ connectionString: process.env.DATABASE_URL })

App Runner for AI APIs: Deploying FastAPI with an ML Model

App Runner is better than Lambda for AI inference services that load a model into memory — a 500MB model loaded at startup stays resident across all requests, while Lambda would reload it on every cold start, adding 3–8 seconds of latency. Configure min instances to 1, memory to 2–4 GB, and concurrency to 10–25 for CPU-bound inference. This pattern works well for text classification, embedding generation, and smaller NLP models with variable traffic.

The key advantage: unlike Lambda, App Runner keeps your container running between requests. This means a 500 MB model loaded into memory stays loaded. Lambda would reload it on every cold start. For inference services where load time is measured in seconds, that difference matters enormously.

FastAPI inference service — app.py
from fastapi import FastAPI
from sentence_transformers import SentenceTransformer

app = FastAPI()

# Model loads once at startup — stays in memory across requests
model = SentenceTransformer("all-MiniLM-L6-v2")

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/embed")
def embed(payload: dict):
    texts = payload.get("texts", [])
    embeddings = model.encode(texts)
    return {"embeddings": embeddings.tolist()}

# Dockerfile:
#   FROM python:3.11-slim
#   RUN pip install fastapi uvicorn sentence-transformers
#   CMD uvicorn app:app --host 0.0.0.0 --port 8080

App Runner Configuration for AI Inference Services

For services that proxy calls to OpenAI, Anthropic, or Bedrock rather than running a local model, App Runner is even better suited — these calls are purely I/O-bound, latency is dominated by the upstream API, and you can safely set concurrency to 100+ per instance.

Pricing: When App Runner Saves Money vs When It Does Not

App Runner costs roughly $53/month for one always-on 1 vCPU/2 GB instance (the production default), approximately $13/month for 0.25 vCPU/0.5 GB, and near zero if you set min instances to 0 and accept ~5-second cold starts. It becomes more expensive than ECS Fargate at sustained high traffic — run the numbers if your service handles more than 10 million requests per month.

Configuration | Active Rate | Idle Rate | ~Monthly (always-on) | ~Monthly (8h/day active)
0.25 vCPU / 0.5 GB | $0.016/hr + $0.0018/hr | $0.00125/hr + $0.00014/hr | ~$13 | ~$5
0.5 vCPU / 1 GB | $0.032/hr + $0.0035/hr | $0.0025/hr + $0.00028/hr | ~$26 | ~$10
1 vCPU / 2 GB | $0.064/hr + $0.007/hr | $0.005/hr + $0.00056/hr | ~$53 | ~$20
2 vCPU / 4 GB | $0.128/hr + $0.014/hr | $0.010/hr + $0.00112/hr | ~$105 | ~$40
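The monthly columns in the table can be reproduced with simple arithmetic. A sketch using the table's combined hourly rates and a 31-day month (rates vary by region and change over time; treat this as an estimator, not a quote):

```python
HOURS_PER_MONTH = 744  # 31-day month, matching the table's always-on column

def monthly_cost(active_rate: float, idle_rate: float,
                 active_hours_per_day: float = 24.0) -> float:
    """Estimate one instance's monthly App Runner compute bill.

    active_rate / idle_rate are the combined (vCPU + memory) per-hour
    rates for one instance; with min instances = 1, hours without
    traffic are still billed at the idle (provisioned) rate."""
    days = HOURS_PER_MONTH / 24
    active = active_rate * active_hours_per_day * days
    idle = idle_rate * (24 - active_hours_per_day) * days
    return round(active + idle, 2)

# 1 vCPU / 2 GB, always on — reproduces the ~$53 row:
print(monthly_cost(0.064 + 0.007, 0.005 + 0.00056))      # ~52.8
# Same instance active ~8h/day — reproduces the ~$20 column:
print(monthly_cost(0.064 + 0.007, 0.005 + 0.00056, 8))   # ~20.4
```

Running your own expected active hours through this is the quickest way to see whether min instances = 0 actually saves meaningful money for your workload.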

When App Runner Does NOT Save Money

When App Runner DOES Save Money

Observability: CloudWatch Logs and Metrics

App Runner automatically sends logs to CloudWatch in two groups: application logs (/aws/apprunner/{service}/{id}/application for your container stdout/stderr) and system logs (/aws/apprunner/{service}/{id}/service for deployments and scaling events). Set alarms on 5xx error rate above 1% and P99 latency above your SLA threshold.

CloudWatch log group names
# Application logs (your container output)
/aws/apprunner/{service-name}/{service-id}/application

# System logs (deployments, scaling, health checks)
/aws/apprunner/{service-name}/{service-id}/service

App Runner also publishes metrics to CloudWatch Metrics under the AWS/AppRunner namespace. The most useful ones for building dashboards and alarms:

Recommended Alarms for Any Production App Runner Service

For structured logging, write JSON to stdout and parse it in CloudWatch Logs Insights. App Runner does not add any parsing overhead — it captures whatever your application writes to stdout verbatim. JSON logs let you query fields directly: fields @timestamp, requestId, statusCode, latencyMs | sort @timestamp desc | filter statusCode >= 500.
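A minimal sketch of that pattern (the `log` helper and its field names are illustrative, not a library API):

```python
import json
import sys
import time
import uuid

def log(level: str, message: str, **fields) -> str:
    """Serialize one structured log record as a single JSON line and
    write it to stdout. App Runner ships stdout to CloudWatch verbatim,
    so each line arrives ready for Logs Insights field queries."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "level": level,
        "message": message,
        **fields,
    }
    line = json.dumps(record)
    sys.stdout.write(line + "\n")
    return line

log("info", "request completed",
    requestId=str(uuid.uuid4()), statusCode=200, latencyMs=42)
```

With log lines in this shape, queries can filter and aggregate on statusCode and latencyMs directly instead of grepping free-form text.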

Real-World Use Case: Node.js REST API in Production on App Runner

Here is a concrete example of an architecture that works well on App Runner: a Node.js REST API serving a B2B SaaS product, with PostgreSQL on RDS and DynamoDB for session data.

The Stack

GitHub Actions — build and push to ECR on main branch push
name: Deploy to App Runner

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions-ecr
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push image to ECR
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/my-api:$IMAGE_TAG .
          docker push $ECR_REGISTRY/my-api:$IMAGE_TAG
          docker tag $ECR_REGISTRY/my-api:$IMAGE_TAG $ECR_REGISTRY/my-api:latest
          docker push $ECR_REGISTRY/my-api:latest
          # App Runner detects new :latest push and redeploys automatically

What This Architecture Costs

In practice, this configuration costs approximately $55–70/month for the App Runner service itself, plus RDS costs (a db.t3.micro PostgreSQL instance runs about $15–20/month with minimal traffic). Total infrastructure cost for a production REST API: under $100/month until you need multiple instances. That is a number that makes sense for any early-stage product.

"The right question isn't 'which service is cheapest' — it's 'which service lets two engineers ship a production API without being distracted by infrastructure.' For most teams, that answer is App Runner."

Lessons From Production

A few things worth knowing before you go to production on App Runner. First, container startup time matters. If your Node.js app takes 8 seconds to start (loading config, establishing DB connections, warming up caches), and App Runner's health check timeout is 20 seconds, you have a narrow window. Keep startup time under 10 seconds and set your health check start period to 30 seconds to give the container time to come up before App Runner starts checking it.

Second, Prisma's connection pooling needs attention in a horizontally scaled environment. Each App Runner instance opens its own connection pool. If you scale to 5 instances with a pool size of 10, that is 50 connections to RDS — approaching the limit of a db.t3.micro once other clients connect. Use PgBouncer or Prisma Accelerate for connection pooling in front of RDS when you expect more than 2–3 instances.
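The pool arithmetic generalizes to a one-line guard worth running when choosing your max instance count (the ~100 max_connections figure is an assumption; check your RDS parameter group, since the real value is derived from instance memory):

```python
def max_safe_instances(db_max_connections: int, pool_size: int,
                       reserved: int = 5) -> int:
    """Upper bound on App Runner instances before the database's
    connection limit is exhausted: each instance opens its own pool,
    and a few connections are held back for admin and monitoring."""
    return (db_max_connections - reserved) // pool_size

print(max_safe_instances(100, 10))  # 9
```

If App Runner's max instances exceeds this number, put PgBouncer or another pooler between the service and RDS.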

Third, App Runner rolls back automatically if a new deployment fails health checks. This is genuinely useful — a bad deploy does not take down production. But it also means you need a real health check endpoint that exercises your app meaningfully (verifies DB connectivity, not just returns 200 unconditionally) to catch broken deployments before they go live.
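A framework-agnostic sketch of such a check (the `health_check` helper and its wiring are illustrative; in Express or FastAPI you would mount the equivalent on GET /health):

```python
def health_check(db_ping) -> tuple[int, dict]:
    """Return (status_code, body) for a health endpoint that actually
    exercises a dependency. db_ping is any callable that raises on
    failure, e.g. one that runs SELECT 1 against the database."""
    try:
        db_ping()
        return 200, {"status": "ok"}
    except Exception as exc:
        # A 503 here is what lets App Runner mark the deploy unhealthy
        return 503, {"status": "degraded", "error": str(exc)}

print(health_check(lambda: None)[0])  # 200
```

Returning 503 on dependency failure is what allows App Runner's automatic rollback to catch a deploy whose container starts cleanly but cannot reach its database.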

Deploy Real Apps in the Bootcamp

Reading about App Runner and actually deploying a production service are different experiences. The gap between understanding the concepts and having the muscle memory to debug a failed deployment, configure VPC Connectors correctly, or trace a P99 latency spike through CloudWatch — that gap closes with hands-on practice.

What You Deploy in Three Days at Precision AI Academy

Bootcamp Details

Ship your next API this week.

Three days. Five cities. Real AWS deployments, AI integrations, and the cloud engineering skills that close the gap between developer and architect. Reserve your seat — $1,490, small cohort, hands-on from hour one.


Denver · Los Angeles · New York City · Chicago · Dallas · October 2026

The bottom line: AWS App Runner is the fastest path from a Docker container to a production HTTPS endpoint on AWS — about two minutes from ECR push to live service, with zero load balancer, task definition, or certificate configuration. It costs roughly $50/month per always-on 1 vCPU service, which pays for itself in DevOps time saved on the first incident. Use it for APIs, microservices, and AI inference services with variable traffic; switch to ECS Fargate when sustained high traffic, GPU, or complex networking requirements outgrow what App Runner can configure.

Frequently Asked Questions

What is AWS App Runner and when should I use it?

AWS App Runner is a fully managed container deployment service that handles provisioning, load balancing, scaling, and TLS termination automatically. You bring a container image or source code repository, set a few configuration values, and App Runner handles the rest. Use it when you need production-grade deployment for an HTTP service or API without the overhead of managing EC2 instances, ECS clusters, or Kubernetes. It is particularly well-suited for teams without dedicated DevOps resources, internal tools, microservices with variable traffic, and AI inference APIs.

How does AWS App Runner pricing compare to ECS and Lambda?

App Runner charges $0.064 per vCPU-hour and $0.007 per GB-hour for active compute, plus a lower rate for idle (provisioned) instances. For a small always-on 1 vCPU / 2 GB service, expect roughly $50–60 per month. ECS on Fargate has similar per-unit compute costs but adds cluster management overhead. Lambda is cheaper at very low request volumes but becomes expensive with sustained traffic — for services handling 100+ concurrent requests continuously, App Runner's flat-rate model often wins on total cost including engineering time.

Can AWS App Runner connect to RDS and DynamoDB?

Yes. App Runner connects to RDS through a VPC Connector, which places your service inside the VPC where RDS lives. DynamoDB requires only that your App Runner instance role has the appropriate DynamoDB IAM permissions — no VPC configuration needed. Secrets Manager integration injects database credentials as environment variables at startup, so no credentials appear in your source code or container images. This is the correct and secure pattern for all database connectivity from App Runner.

Is AWS App Runner suitable for hosting AI and ML APIs?

App Runner is well-suited for AI inference APIs built with FastAPI, Flask, or Express — particularly for models that fit in memory and have variable traffic. The key advantage over Lambda: your container stays running between requests, so a loaded ML model stays in memory. For CPU-based inference (embedding models, text classification, smaller NLP models under ~1 GB), App Runner is practical and cost-effective. For GPU-dependent or very large model workloads, use SageMaker inference endpoints or EC2 G instances instead.

Sources: AWS Documentation, Gartner Cloud Strategy, CNCF Annual Survey


Bo Peng

AI Instructor & Founder, Precision AI Academy

Bo has trained 400+ professionals in applied AI across federal agencies and Fortune 500 companies. Former university instructor specializing in practical AI tools for non-programmers. Kaggle competitor and builder of production AI systems. He founded Precision AI Academy to bridge the gap between AI theory and real-world professional application.
