AWS S3 Complete Guide 2026: Storage for Every Use Case

In This Guide

What Is S3 and Why Does Everyone Use It
Core Concepts: Buckets, Objects, Keys, and Versions
S3 Storage Classes: Choosing the Right One
S3 Security: Bucket Policies, ACLs, and Block Public Access
Lifecycle Policies: Automate Cost Savings
Real Use Cases: What Developers Actually Store in S3
Static Website Hosting with S3
Frequently Asked Questions

Key Takeaways

Infinitely scalable: S3 stores any amount of data — from a single file to exabytes. There is no capacity planning, no storage provisioning, and no size limits per bucket.
11 nines durability: S3 Standard is designed for 99.999999999% durability (a design target, not an SLA-backed guarantee). AWS automatically stores each object redundantly across at least three Availability Zones.
Storage classes matter: Moving infrequently accessed data from S3 Standard ($0.023/GB/month) to S3 Glacier Instant Retrieval ($0.004/GB/month) is an 83% cost reduction. Lifecycle policies automate this.
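Security requires action: S3 buckets are private by default, and buckets created since April 2023 have Block Public Access turned on automatically. Older buckets and the account-level setting are not covered by that default, so enable Block Public Access at the account level to prevent accidental exposure.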

S3 is the most used service in all of AWS, and it is the one most developers encounter first. It stores the images your web app serves, the backups your RDS database creates, the deployment artifacts your CI/CD pipeline produces, the access logs your API Gateway generates, and the static files your React app needs to load.

S3 looks simple — it is a bucket, you put files in it — but there are enough features, storage classes, security settings, and pricing nuances that a guide is worth reading before you touch production. Getting S3 wrong means either paying too much (wrong storage class, no lifecycle policies) or getting breached (misconfigured permissions, public buckets).

What Is S3 and Why Does Everyone Use It

Amazon S3 (Simple Storage Service) is an object storage service that stores any type of file, called an object, at any scale. It is designed for 99.999999999% durability and starts at $0.023 per GB per month. It is the foundation of the AWS ecosystem because a large share of other AWS services read from or write to S3.

Unlike a file system (where you navigate a directory tree) or a block storage device (where you have a disk mounted to a server), S3 is an object store. Each object is identified by a unique key (effectively a file path), stored in a bucket (a named container), and accessible via an HTTPS URL or the AWS API.

Unlimited Scale: No capacity limits. One bucket can hold one file or one trillion files. You never provision storage.
Durability by Default: AWS automatically stores each object redundantly across at least three Availability Zones (the One Zone classes are the exception). No replication setup required.
Native AWS Integration: Lambda, RDS, Athena, CloudFront, CodeBuild, and dozens of other services have native S3 integration built in.
Cost-Effective: $0.023/GB/month for Standard. Glacier Deep Archive is $0.00099/GB/month, roughly $1 per terabyte.

Core Concepts: Buckets, Objects, Keys, and Versions

Buckets are the top-level containers for S3 objects. Bucket names must be globally unique across all AWS accounts. Bucket names are part of the S3 URL: https://my-bucket.s3.amazonaws.com/. Each bucket lives in one AWS region.

Objects are the individual files stored in S3. Each object can be up to 5 TB. Objects consist of the data itself plus metadata — the object's key, size, last modified timestamp, storage class, and any custom metadata you add.

Keys are the unique identifiers for objects within a bucket. Keys look like file paths (uploads/2026/03/photo.jpg) but S3 is a flat namespace — there are no real directories. The "/" character is a convention that tools use to display objects in a folder-like structure.
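To make the flat-namespace point concrete, here is a minimal sketch of how a folder view can be derived from nothing but key strings. It mimics the idea behind the Prefix and Delimiter parameters of S3's list operations; the function name `list_folder` and the sample keys are illustrative assumptions, not part of any AWS SDK.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Group a flat list of keys the way a folder-style listing does.

    Returns (objects, common_prefixes): the keys that sit directly under
    `prefix`, and the distinct "subfolder" prefixes one level below it.
    """
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is displayed as a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


# Hypothetical keys; S3 itself stores them as plain strings with no hierarchy.
keys = [
    "uploads/2026/03/photo.jpg",
    "uploads/2026/04/scan.pdf",
    "uploads/readme.txt",
    "index.html",
]
print(list_folder(keys, prefix="uploads/"))
```

Nothing in the sketch creates a directory: the "folders" exist only as shared prefixes, which is why renaming a folder in S3 means copying every object under it.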

Versioning is an optional feature that stores every version of every object. When versioning is enabled, deleting an object adds a delete marker rather than removing the data. Enable versioning on buckets that store important data that must be recoverable from accidental deletion or overwrite.
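The delete-marker behavior is easier to see in a toy model. The class below is an in-memory illustration only, under the assumption that a delete marker can be represented as a version with no body; it is not the S3 API, and the names `VersionedBucket` and `delete_version` are hypothetical.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}              # key -> list of (version_id, body)
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # A plain delete only stacks a delete marker (body None) on top.
        return self.put(key, None)

    def get(self, key, version_id=None):
        history = self._versions.get(key, [])
        if version_id is None:
            if not history or history[-1][1] is None:
                raise KeyError(key)      # latest version is a delete marker
            return history[-1][1]
        for vid, body in history:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))

    def delete_version(self, key, version_id):
        # Removing one specific version (such as the marker) is permanent.
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]


bucket = VersionedBucket()
first = bucket.put("report.csv", b"first draft")
bucket.put("report.csv", b"second draft")
marker = bucket.delete("report.csv")          # object now looks deleted
bucket.delete_version("report.csv", marker)   # removing the marker restores it
print(bucket.get("report.csv"))
```

In this model, recovery from an accidental delete is just removal of the marker, and older versions stay readable by ID until they are deleted explicitly.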

S3 URL Patterns

# Path-style URL (legacy; AWS announced its retirement in 2019 and has since delayed it)
https://s3.amazonaws.com/bucket-name/object-key

# Virtual-hosted-style URL (current standard)
https://bucket-name.s3.region.amazonaws.com/object-key

# CloudFront CDN URL (for production web assets)
https://d1234567890.cloudfront.net/object-key
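As a small illustration of the virtual-hosted-style pattern, the helper below assembles a URL from its parts and percent-encodes the key. The function name and sample values are assumptions for this sketch; it does not sign the request, so the resulting URL only works for objects that are already readable by the caller.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL.

    '/' in the key is kept as-is; spaces and other reserved characters
    are percent-encoded.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


print(virtual_hosted_url("my-bucket", "us-east-1", "uploads/2026/03/my photo.jpg"))
```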

S3 Storage Classes: Choosing the Right One

S3 offers a range of storage classes that trade off storage cost, retrieval cost, and retrieval latency; the table below compares the seven most commonly used. Choosing the right class for each type of data can reduce S3 storage costs by 60-90%.

Storage Class	Use Case	Cost/GB/mo	Retrieval	Min Duration
S3 Standard	Frequently accessed data	$0.023	Immediate, free	None
S3 Intelligent-Tiering	Unknown access patterns	$0.023 + monitoring	Immediate	None
S3 Standard-IA	Infrequently accessed, fast retrieval	$0.0125	Immediate, $0.01/GB	30 days
S3 One Zone-IA	Infrequent, non-critical, single AZ	$0.01	Immediate, $0.01/GB	30 days
S3 Glacier Instant	Archive with millisecond retrieval	$0.004	Milliseconds, $0.03/GB	90 days
S3 Glacier Flexible	Archive, 1-12 hour retrieval	$0.0036	Minutes to hours	90 days
S3 Glacier Deep Archive	Long-term archive	$0.00099	12-48 hours	180 days
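The percentage savings implied by the table can be checked with a few lines of arithmetic. This sketch copies the per-GB prices from the table and compares the storage line item only; retrieval fees, request charges, and minimum-duration charges are deliberately left out, so treat the results as an upper bound on real savings.

```python
# Storage prices in USD per GB-month, copied from the table above.
PRICE_PER_GB = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "ONEZONE_IA": 0.01,
    "GLACIER_IR": 0.004,
    "GLACIER": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}


def storage_saving_pct(from_class: str, to_class: str) -> float:
    """Percentage reduction in the storage line item (no retrieval fees)."""
    old, new = PRICE_PER_GB[from_class], PRICE_PER_GB[to_class]
    return (old - new) / old * 100


for target in ("STANDARD_IA", "GLACIER_IR", "DEEP_ARCHIVE"):
    pct = storage_saving_pct("STANDARD", target)
    print(f"STANDARD -> {target}: {pct:.1f}% lower storage cost")
```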

"Moving infrequently accessed data from S3 Standard to S3 Glacier Instant Retrieval is an 83% cost reduction on storage — with millisecond retrieval still available when you need it."

S3 Cost Optimization Pattern

S3 Intelligent-Tiering monitors access patterns and automatically moves objects between its Frequent Access, Infrequent Access, and Archive Instant Access tiers based on actual usage. It adds a small per-object monitoring fee ($0.0025 per 1,000 objects per month), and objects smaller than 128 KB are never auto-tiered, so it is the right choice when access patterns are genuinely unpredictable and the objects are not tiny.
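Whether the monitoring fee is worth paying depends mostly on object size. The sketch below nets the fee against the storage saved when a fraction of objects goes cold. It assumes the infrequent tier is priced like Standard-IA ($0.0125/GB) and ignores the deeper archive tiers; the function name and the sample workload are hypothetical.

```python
MONITORING_FEE_PER_OBJECT = 0.0025 / 1000   # USD per object-month (from the text)
STANDARD_PER_GB = 0.023                     # USD per GB-month
INFREQUENT_PER_GB = 0.0125                  # assumption: same as Standard-IA


def monthly_net_saving(object_count: int, avg_object_gb: float,
                       infrequent_fraction: float) -> float:
    """Storage saved by tiering minus the monitoring fee, in USD per month."""
    moved_gb = object_count * avg_object_gb * infrequent_fraction
    saving = moved_gb * (STANDARD_PER_GB - INFREQUENT_PER_GB)
    fee = object_count * MONITORING_FEE_PER_OBJECT
    return saving - fee


# One million objects with 60% of them cold, at two different object sizes.
print(round(monthly_net_saving(1_000_000, 0.001, 0.6), 2))    # ~1 MB objects
print(round(monthly_net_saving(1_000_000, 0.0001, 0.6), 2))   # ~100 KB objects
```

With these inputs the 1 MB case comes out ahead by a few dollars a month while the 100 KB case loses money, which matches the intuition that many small objects make the per-object fee dominate.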

S3 Security: Bucket Policies, ACLs, and Block Public Access

Every S3 bucket should have Block Public Access enabled at the account level unless you are intentionally serving public static content. From there, use bucket policies to grant specific cross-account or service access, and IAM roles to grant application access. Never make individual objects public unless they are intended to be publicly accessible.

What Block Public Access Does

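Four Settings That Override ACLs and Policies

Blocks new public ACLs from being set (BlockPublicAcls)
Ignores existing public ACLs on buckets and objects (IgnorePublicAcls)
Blocks new bucket policies that allow public access (BlockPublicPolicy)
Restricts buckets that already have public policies to AWS services and authorized users in the owning account (RestrictPublicBuckets)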
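The four settings correspond to four boolean fields in S3's PublicAccessBlockConfiguration structure. The sketch below builds the "everything on" configuration and includes a small check that a given configuration enables all four; the helper name `is_fully_blocked` is an assumption for illustration, and applying the configuration to an account or bucket is left to whichever tool you use.

```python
import json

# Field names follow the S3 PublicAccessBlockConfiguration structure.
BLOCK_ALL_PUBLIC_ACCESS = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # treat existing public ACLs as absent
    "BlockPublicPolicy": True,      # reject new public bucket policies
    "RestrictPublicBuckets": True,  # limit buckets with public policies to
                                    # AWS services and the owning account
}


def is_fully_blocked(config: dict) -> bool:
    """True only when all four settings are present and enabled."""
    required = ("BlockPublicAcls", "IgnorePublicAcls",
                "BlockPublicPolicy", "RestrictPublicBuckets")
    return all(config.get(name) is True for name in required)


print(json.dumps(BLOCK_ALL_PUBLIC_ACCESS, indent=2))
print(is_fully_blocked(BLOCK_ALL_PUBLIC_ACCESS))
```

A check like this can run in CI against exported configuration so that a partially disabled setting is caught before deployment.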

Recommended Architecture

S3 Private + CloudFront OAC

S3 bucket stays private (Block Public Access ON)
CloudFront distribution with Origin Access Control
Public web traffic through CloudFront only
Prevents direct S3 URL access bypassing CDN

Bucket policies are JSON documents attached to a bucket that grant or deny access to principals (AWS accounts, IAM roles, AWS services). This policy allows CloudFront (with OAC) to read objects:

JSON — S3 Bucket Policy (CloudFront OAC)

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Service": "cloudfront.amazonaws.com"
    },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": {
      "StringEquals": {
        "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/DIST_ID"
      }
    }
  }]
}
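If you manage several distributions, generating this policy is less error-prone than editing it by hand. The following is a minimal sketch that renders the same statement from three inputs and rejects account IDs that are not 12 digits long. The function name and the sample distribution ID are placeholders, not values from a real account.

```python
import json
import re


def cloudfront_oac_policy(bucket: str, account_id: str, distribution_id: str) -> str:
    """Render a read-only bucket policy scoped to one CloudFront distribution."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are exactly 12 digits")
    source_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Only requests made on behalf of this distribution are allowed.
            "Condition": {"StringEquals": {"AWS:SourceArn": source_arn}},
        }],
    }
    return json.dumps(policy, indent=2)


# Placeholder values for illustration only.
print(cloudfront_oac_policy("my-bucket", "123456789012", "EDFDVBD6EXAMPLE"))
```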

Lifecycle Policies: Automate Cost Savings

S3 Lifecycle policies automatically transition objects between storage classes or delete them after a defined period. A well-configured lifecycle policy is the easiest way to reduce S3 costs without any ongoing maintenance.

Example lifecycle policy for an application log bucket:

After 30 days: transition to S3 Standard-IA (access is infrequent after the first month)
After 90 days: transition to S3 Glacier Flexible Retrieval (logs over 90 days old are rarely accessed)
After 365 days: delete (logs over 1 year old have no business value)

For a bucket storing 1 TB of logs that grows by 50 GB/month, this policy can reduce annual S3 costs by 60-70% compared to leaving all logs in Standard storage.
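The savings estimate can be sanity-checked with a rough steady-state model. The sketch assumes 50 GB of new logs per month, treats each month as one batch, and counts how many months a batch spends in each class before deletion. It ignores transition request fees, retrieval charges, and minimum object-size billing, all of which pull the real-world figure down somewhat.

```python
PRICE_PER_GB = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.0036}

# Months each monthly batch spends in each class before deletion at 365 days:
# days 0-29 in Standard, days 30-89 in Standard-IA, days 90-364 in Glacier.
MONTHS_IN_CLASS = {"STANDARD": 1, "STANDARD_IA": 2, "GLACIER": 9}


def steady_state_monthly_cost(gb_per_month: float, months_in_class: dict) -> float:
    """Monthly storage bill once a full year of batches is in the bucket."""
    return sum(gb_per_month * months * PRICE_PER_GB[storage_class]
               for storage_class, months in months_in_class.items())


tiered = steady_state_monthly_cost(50, MONTHS_IN_CLASS)
flat = steady_state_monthly_cost(50, {"STANDARD": 12})
print(f"tiered ${tiered:.2f}/mo vs all-Standard ${flat:.2f}/mo "
      f"({1 - tiered / flat:.0%} lower)")
```

Under these simplifications the reduction lands at roughly 70%, at the top of the range quoted above; the ignored fees explain why a real bill would usually show a little less.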

HCL — Terraform S3 Lifecycle Rule

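resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "log_rotation"
    status = "Enabled"

    # Empty filter: apply the rule to every object in the bucket
    filter {}

    # Day 30: move to Standard-IA
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    # Day 90: move to Glacier Flexible Retrieval ("GLACIER" is its API name)
    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    # Day 365: delete
    expiration {
      days = 365
    }
  }
}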
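To reason about what a rule like this does to any given object, it can help to model it as a function of object age. The sketch below mirrors the thresholds in the Terraform rule; it is a simplified model, since S3 evaluates lifecycle rules asynchronously and an actual transition can lag the threshold.

```python
# (minimum age in days, storage class) pairs mirroring the rule above.
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER")]
EXPIRATION_DAYS = 365


def class_for_age(age_days: int):
    """Storage class an object should be in at a given age, or None if expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            current = storage_class
    return current


for age in (0, 29, 30, 90, 364, 365):
    print(age, class_for_age(age))
```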

Real Use Cases: What Developers Actually Store in S3

User-Uploaded Media: Images, videos, documents. S3 + CloudFront is the standard architecture. Use pre-signed URLs for direct upload from the browser.
Database Backups: RDS automated backups are kept in AWS-managed S3 storage rather than in a bucket you own, so for backups you control, export snapshots or database dumps to your own bucket. Lifecycle policies can then archive old backups to Glacier, and cross-region replication adds disaster recovery.
Data Lake Foundation: Raw CSV, JSON, and Parquet files from operational systems. Athena queries them in place with SQL. No managed cluster required.
CI/CD Artifacts: Build artifacts from CodeBuild, compiled Lambda packages, Docker image layers (ECR uses S3 under the hood).

Static Website Hosting with S3

S3 static website hosting serves HTML files directly from a bucket. Combined with CloudFront for HTTPS and CDN caching, and Route 53 for custom domain routing, this is the standard way to host static sites and single-page applications at essentially zero cost for low-traffic sites.

Setup steps:

Create an S3 bucket (with CloudFront in front, the name does not have to match your domain; that is only required when DNS points straight at the S3 website endpoint)
Upload your build output (dist/ or build/ directory) to the bucket
Create a CloudFront distribution with the S3 bucket as the origin, using Origin Access Control
Request an ACM certificate for your domain in us-east-1, the region CloudFront requires (free, auto-renewing)
Attach the certificate to your CloudFront distribution
Create a Route 53 alias record pointing your domain to the CloudFront distribution

Cost for a static site with 10,000 visitors/month: approximately $0.50-$2.00/month total (S3 storage + CloudFront data transfer + Route 53 hosted zone). Compare to $5-25/month for a managed hosting platform doing the same thing.

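For React/Vue/Angular apps, add CloudFront custom error responses that return /index.html with a 200 status for 404 errors, and for 403 errors as well, since a private bucket behind Origin Access Control answers 403 for keys that do not exist. This lets client-side routing work correctly when a user bookmarks a deep link and navigates directly to it.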
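As a rough sketch, the error-page rule can be written as data. The keys below follow CloudFront's CustomErrorResponse structure (ErrorCode, ResponsePagePath, ResponseCode, ErrorCachingMinTTL). Covering 403 alongside 404 is an assumption about the origin setup: S3 answers 403 rather than 404 for a missing key when the caller lacks list permission on the bucket. The function name and the zero-second cache default are illustrative choices.

```python
import json


def spa_error_responses(index_path: str = "/index.html", cache_seconds: int = 0):
    """Custom error responses that hand 403 and 404 to the SPA's index page."""
    return [
        {
            "ErrorCode": code,
            "ResponsePagePath": index_path,
            "ResponseCode": "200",          # serve the app shell as a success
            "ErrorCachingMinTTL": cache_seconds,
        }
        for code in (403, 404)
    ]


print(json.dumps(spa_error_responses(), indent=2))
```

One trade-off to keep in mind: with this rule, genuinely missing assets also come back as the index page with a 200 status, so broken image or script links are harder to spot in monitoring.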

Frequently Asked Questions

How much does S3 storage cost?

S3 Standard costs $0.023 per GB per month in the US East (N. Virginia) region, so storing 100 GB costs $2.30/month. Data transfer out to the internet costs $0.09 per GB after the first 100 GB each month, which is free (an allowance shared across AWS services). PUT, GET, and other request costs are fractions of a cent per 1,000 requests. Most small applications spend under $5/month on S3.
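A back-of-the-envelope estimator makes these numbers easy to adapt. The sketch covers storage and internet egress only and leaves request charges out. The free egress allowance is a parameter whose default of 100 GB reflects AWS's current monthly allowance as an assumption you should verify for your account; the function name is hypothetical.

```python
def monthly_s3_cost(storage_gb: float, egress_gb: float,
                    free_egress_gb: float = 100.0,
                    storage_price: float = 0.023,
                    egress_price: float = 0.09) -> float:
    """Estimated monthly bill in USD for storage plus internet egress."""
    billable_egress = max(0.0, egress_gb - free_egress_gb)
    return storage_gb * storage_price + billable_egress * egress_price


print(f"${monthly_s3_cost(100, 20):.2f}")    # egress within the free allowance
print(f"${monthly_s3_cost(100, 150):.2f}")   # 50 GB of billable egress
```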

Is S3 a database?

No. S3 is an object store, not a database. You cannot query S3 objects with SQL (though Athena can query structured files stored in S3). S3 is optimized for storing and retrieving whole files, not for individual record lookups, updates, or transactions. Use DynamoDB or RDS for application data and S3 for files.

How do I make an S3 object publicly accessible?

First make sure Block Public Access permits it at both the account level and the bucket level. Then add a bucket policy that allows s3:GetObject for all principals (Principal: *), or, if ACLs are enabled on the bucket (they are disabled by default on new buckets), set the object ACL to public-read. For serving web assets publicly, the recommended approach is to keep the bucket private and use CloudFront with Origin Access Control.

What is the maximum file size in S3?

A single S3 object can be up to 5 TB. Files larger than 100 MB should be uploaded using the multipart upload API, which splits the file into parts (5 MB to 5 GB each) and uploads them in parallel. The AWS CLI and SDKs handle multipart uploads automatically for large files.
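The part-size limits translate into a small planning calculation. The sketch below picks a part size that respects the 5 MB minimum and 5 GB maximum mentioned above, plus S3's cap of 10,000 parts per upload. The 8 MiB preferred size and the function name are assumptions for illustration; the SDKs make a similar choice for you.

```python
MIB = 1024 ** 2
MIN_PART = 5 * MIB            # minimum part size (the last part may be smaller)
MAX_PART = 5 * 1024 * MIB     # maximum part size
MAX_PARTS = 10_000            # S3 limit on parts per multipart upload


def plan_parts(object_size: int, preferred_part: int = 8 * MIB):
    """Return (part_size, part_count) that fits within the multipart limits."""
    # Smallest part size that keeps the upload at or under MAX_PARTS parts.
    floor_for_count = -(-object_size // MAX_PARTS)   # integer ceiling division
    part_size = max(preferred_part, MIN_PART, floor_for_count)
    if part_size > MAX_PART:
        raise ValueError("object too large for a single multipart upload")
    part_count = max(1, -(-object_size // part_size))
    return part_size, part_count


print(plan_parts(100 * MIB))      # small file: preferred part size is kept
print(plan_parts(1024 ** 4))      # 1 TiB: part size grows to stay under 10,000 parts
```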

Verdict: Master S3 Early and Everything Else Gets Easier

S3 is the one AWS service that appears in almost every architecture diagram. Master the storage class selection (use Intelligent-Tiering when in doubt), lock down permissions with Block Public Access and IAM roles, set lifecycle policies from day one, and use CloudFront in front of everything you serve publicly. These four habits turn S3 from a potential liability into one of the most cost-efficient and reliable storage systems available anywhere.


Our Take

S3 is the most unsung piece of the AI infrastructure stack — your data pipeline lives or dies here.

Most AI infrastructure discussions focus on compute — which GPU, which inference endpoint, which model. The less glamorous truth is that data ingestion, staging, and movement through S3 is where many production AI pipelines spend most of their wall-clock time and a surprising fraction of their cost. S3 GET request pricing, data transfer costs, and the latency difference between S3 Standard and S3 Express One Zone are often larger line items in an AI application's AWS bill than the inference compute itself, particularly for document-processing and RAG-style workloads that read large volumes of files.

S3 Express One Zone, launched in late 2023, is still underutilized. AWS positions it as offering up to 10x faster access than S3 Standard for hot data, at roughly 5x the storage cost per GB and with data held in a single Availability Zone. For AI applications that repeatedly access the same training datasets, embedding caches, or chunked document stores, that latency reduction compounds through the entire pipeline. The premium is easiest to justify when the workload is read-heavy and latency-bound. Most developers default to Standard S3 for everything without evaluating whether Express One Zone fits their hot-path data patterns.

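The S3 skill that will save you money in an AI context: filter data before it leaves storage, which is what S3 Select and S3 Object Lambda are for. S3 Select queries inside CSV, JSON, or Parquet objects without reading the whole object, which can reduce data transfer by 90% for columnar filtering. Note that AWS closed S3 Select to new customers in July 2024, so confirm that either feature is available to your account before designing around it; Athena, or reading only the needed Parquet columns client-side, gives a similar effect. Either way, it is a meaningful cost reduction for any AI application that reads structured datasets from S3.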

Published By

Precision AI Academy

Practitioner-focused AI education · 2-day in-person bootcamp in 5 U.S. cities

Precision AI Academy publishes deep-dives on applied AI engineering for working professionals. Founded by Bo Peng (Kaggle Top 200) who leads the in-person bootcamp in Denver, NYC, Dallas, LA, and Chicago.
