In This Guide
- What Is S3 and Why Does Everyone Use It
- Core Concepts: Buckets, Objects, Keys, and Versions
- S3 Storage Classes: Choosing the Right One
- S3 Security: Bucket Policies, ACLs, and Block Public Access
- Lifecycle Policies: Automate Cost Savings
- Real Use Cases: What Developers Actually Store in S3
- Static Website Hosting with S3
- Frequently Asked Questions
Key Takeaways
- Infinitely scalable: S3 stores any amount of data — from a single file to exabytes. There is no capacity planning, no storage provisioning, and no size limits per bucket.
- 11 nines durability: S3 Standard is designed for 99.999999999% durability. This is a design target rather than a contractual guarantee. AWS automatically stores each object redundantly across at least three Availability Zones.
- Storage classes matter: Moving infrequently accessed data from S3 Standard ($0.023/GB/month) to S3 Glacier Instant Retrieval ($0.004/GB/month) is an 83% cost reduction. Lifecycle policies automate this.
- Security requires action: S3 buckets are private by default, and since April 2023 AWS turns on Block Public Access for newly created buckets. Older buckets and the account-level setting may still be open, so check both and enable Block Public Access at the account level.
S3 is the most used service in all of AWS, and it is the one most developers encounter first. It stores the images your web app serves, the backups your RDS database creates, the deployment artifacts your CI/CD pipeline produces, the access logs your load balancers generate, and the static files your React app needs to load.
S3 looks simple — it is a bucket, you put files in it — but there are enough features, storage classes, security settings, and pricing nuances that a guide is worth reading before you touch production. Getting S3 wrong means either paying too much (wrong storage class, no lifecycle policies) or getting breached (misconfigured permissions, public buckets).
What Is S3 and Why Does Everyone Use It
Amazon S3 (Simple Storage Service) is an object storage service that stores any type of file, called an object, at any scale. S3 Standard is designed for 99.999999999% durability and starts at $0.023 per GB per month. It is the foundation of the AWS ecosystem because virtually every other AWS service either reads from or writes to S3.
Unlike a file system (where you navigate a directory tree) or a block storage device (where you have a disk mounted to a server), S3 is an object store. Each object is identified by a unique key (effectively a file path), stored in a bucket (a named container), and accessible via an HTTPS URL or the AWS API.
Unlimited Scale
No capacity limits. One bucket can hold one file or one trillion files. You never provision storage.
Durability by Default
AWS automatically stores each object redundantly across multiple AZs. No replication setup required.
Native AWS Integration
Lambda, RDS, Athena, CloudFront, CodeBuild — dozens of services have native S3 integration built in.
Cost-Effective
$0.023/GB/month for Standard. Glacier Deep Archive is $0.00099/GB/month, or about $1 per terabyte per month.
Core Concepts: Buckets, Objects, Keys, and Versions
Buckets are the top-level containers for S3 objects. Bucket names must be globally unique across all AWS accounts. Bucket names are part of the S3 URL: https://my-bucket.s3.amazonaws.com/. Each bucket lives in one AWS region.
Objects are the individual files stored in S3. Each object can be up to 5 TB. Objects consist of the data itself plus metadata — the object's key, size, last modified timestamp, storage class, and any custom metadata you add.
Keys are the unique identifiers for objects within a bucket. Keys look like file paths (uploads/2026/03/photo.jpg) but S3 is a flat namespace — there are no real directories. The "/" character is a convention that tools use to display objects in a folder-like structure.
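To make the flat namespace concrete, here is a minimal sketch of how a listing tool could derive "folders" from plain keys. The function name `list_folder` and the sample keys are hypothetical; the prefix and delimiter arguments mirror the idea behind the `Prefix` and `Delimiter` parameters of S3's list operations, but this is an illustration in plain Python, not the S3 API.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Split flat keys into direct objects and folder-like common prefixes.

    Returns (objects, common_prefixes) for everything that starts with `prefix`.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Another delimiter follows, so a console would show a "folder".
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical keys; the bucket itself stores them as one flat list.
keys = [
    "uploads/2026/03/photo.jpg",
    "uploads/2026/04/scan.pdf",
    "uploads/readme.txt",
    "index.html",
]
print(list_folder(keys))                     # top level
print(list_folder(keys, prefix="uploads/"))  # inside the "uploads" folder
```

Nothing is created or moved when a "folder" appears or disappears; the grouping is computed from the key strings each time.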
Versioning is an optional feature that stores every version of every object. When versioning is enabled, deleting an object adds a delete marker rather than removing the data. Enable versioning on buckets that store important data that must be recoverable from accidental deletion or overwrite.
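The delete-marker behavior is easier to see in a toy model. The class below is a hypothetical in-memory stand-in for a versioning-enabled bucket, written only to illustrate the semantics described above; it is not the S3 API, and the version IDs it produces are made up.

```python
import itertools

_DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (illustration only)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data)
        self._ids = itertools.count(1)

    def _append(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def put(self, key, data):
        # Overwrites never replace data; they add a newer version.
        return self._append(key, data)

    def delete(self, key):
        # A plain delete only adds a marker; older versions stay stored.
        return self._append(key, _DELETE_MARKER)

    def get(self, key, version_id=None):
        history = self._versions.get(key, [])
        if version_id is not None:
            for vid, data in history:
                if vid == version_id and data is not _DELETE_MARKER:
                    return data
            raise KeyError(f"{key}@{version_id}")
        if not history or history[-1][1] is _DELETE_MARKER:
            raise KeyError(key)        # latest version is a delete marker
        return history[-1][1]

    def remove_version(self, key, version_id):
        # Removing a specific version (such as the marker) is permanent.
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]


bucket = VersionedBucket()
first = bucket.put("report.csv", b"first draft")
bucket.put("report.csv", b"second draft")
marker = bucket.delete("report.csv")         # object now looks deleted
print(bucket.get("report.csv", first))       # older version is still readable
bucket.remove_version("report.csv", marker)  # removing the marker "undeletes"
print(bucket.get("report.csv"))
```

In this model, as in the description above, recovery from an accidental delete amounts to removing the marker so the previous version becomes current again.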
# Path-style URL (being deprecated)
https://s3.amazonaws.com/bucket-name/object-key
# Virtual-hosted-style URL (current standard)
https://bucket-name.s3.region.amazonaws.com/object-key
# CloudFront CDN URL (for production web assets)
https://d1234567890.cloudfront.net/object-key
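If you need to build a virtual-hosted-style URL in code, a small helper along these lines may be enough. The function name is hypothetical, and the sketch assumes a standard commercial AWS region and a bucket name that is valid as a DNS label; keys are percent-encoded because they may contain spaces and other characters that are not URL-safe.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket, region, key):
    """Return the virtual-hosted-style HTTPS URL for an object key."""
    # Keep "/" readable so the key still looks like a path; escape the rest.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


print(virtual_hosted_url("bucket-name", "us-east-1", "uploads/2026/03/my photo.jpg"))
# https://bucket-name.s3.us-east-1.amazonaws.com/uploads/2026/03/my%20photo.jpg
```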
S3 Storage Classes: Choosing the Right One
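S3 offers eight storage classes that trade off storage cost, retrieval cost, and retrieval latency. The table below covers the seven general-purpose classes; the eighth, S3 Express One Zone, targets latency-sensitive hot data. Choosing the right class for each type of data can reduce S3 storage costs by 60-90%.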
| Storage Class | Use Case | Cost/GB/mo | Retrieval | Min Duration |
|---|---|---|---|---|
| S3 Standard | Frequently accessed data | $0.023 | Immediate, free | None |
| S3 Intelligent-Tiering | Unknown access patterns | $0.023 + monitoring | Immediate | None |
| S3 Standard-IA | Infrequently accessed, fast retrieval | $0.0125 | Immediate, $0.01/GB | 30 days |
| S3 One Zone-IA | Infrequent, non-critical, single AZ | $0.01 | Immediate, $0.01/GB | 30 days |
| S3 Glacier Instant | Archive with millisecond retrieval | $0.004 | Milliseconds, $0.03/GB | 90 days |
| S3 Glacier Flexible | Archive, retrieval in minutes to 12 hours | $0.0036 | Minutes to hours | 90 days |
| S3 Glacier Deep Archive | Long-term archive | $0.00099 | 12-48 hours | 180 days |
"Moving infrequently accessed data from S3 Standard to S3 Glacier Instant Retrieval is an 83% cost reduction on storage — with millisecond retrieval still available when you need it."
S3 Cost Optimization Pattern
S3 Intelligent-Tiering monitors access patterns and automatically moves objects between Standard and IA tiers based on actual usage. It adds a small per-object monitoring fee ($0.0025 per 1,000 objects per month) but can be the right choice when access patterns are genuinely unpredictable.
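The arithmetic behind these comparisons can be sketched in a few lines. The prices are copied from the table above and are assumptions that will drift over time and differ by region; the dictionary keys and function names are hypothetical, and retrieval and request fees are deliberately left out.

```python
# Storage price per GB per month, taken from the table above (assumed current).
PRICE_PER_GB = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "ONEZONE_IA": 0.01,
    "GLACIER_IR": 0.004,
    "GLACIER": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}
MONITORING_FEE_PER_1000_OBJECTS = 0.0025  # Intelligent-Tiering, per month


def monthly_storage_cost(gb, storage_class):
    """Storage-only cost in USD; ignores requests and retrieval."""
    return gb * PRICE_PER_GB[storage_class]


def savings_percent(from_class, to_class):
    """Percentage saved on storage by moving between two classes."""
    old, new = PRICE_PER_GB[from_class], PRICE_PER_GB[to_class]
    return (old - new) / old * 100


def intelligent_tiering_monitoring_fee(object_count):
    """Monthly monitoring fee in USD for a given number of objects."""
    return object_count / 1000 * MONITORING_FEE_PER_1000_OBJECTS


print(f"{savings_percent('STANDARD', 'GLACIER_IR'):.1f}% saved")    # 82.6% saved
print(f"${monthly_storage_cost(1000, 'STANDARD'):.2f} per month")   # $23.00 per month
print(f"${intelligent_tiering_monitoring_fee(1_000_000):.2f} fee")  # $2.50 fee
```

The monitoring fee scales with object count rather than bytes, so Intelligent-Tiering tends to suit fewer, larger objects better than millions of tiny ones.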
S3 Security: Bucket Policies, ACLs, and Block Public Access
Every S3 bucket should have Block Public Access enabled at the account level unless you are intentionally serving public static content. From there, use bucket policies to grant specific cross-account or service access, and IAM roles to grant application access. Never make individual objects public unless they are intended to be publicly accessible.
Four Settings That Override ACLs
- Blocks new public ACLs from being set
- Ignores existing public ACLs on objects
- Blocks new bucket policies allowing public access
- Ignores public policies on existing buckets
S3 Private + CloudFront OAC
- S3 bucket stays private (Block Public Access ON)
- CloudFront distribution with Origin Access Control
- Public web traffic through CloudFront only
- Prevents direct S3 URL access bypassing CDN
Bucket policies are JSON documents attached to a bucket that grant or deny access to principals (AWS accounts, IAM roles, AWS services). This policy allows CloudFront (with OAC) to read objects:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Service": "cloudfront.amazonaws.com"
    },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": {
      "StringEquals": {
        "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/DIST_ID"
      }
    }
  }]
}
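If you generate this policy for several buckets, a small builder keeps the ARNs consistent. This is a sketch under assumptions: the function name is hypothetical, the bucket name, account ID, and distribution ID are placeholders, and the 12-digit check reflects the fact that AWS account IDs are always 12 digits.

```python
import json


def cloudfront_oac_read_policy(bucket, account_id, distribution_id):
    """Build a bucket policy that lets one CloudFront distribution read objects."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    source_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Restrict access to a single distribution, not all of CloudFront.
            "Condition": {"StringEquals": {"AWS:SourceArn": source_arn}},
        }],
    }


# Placeholder values; substitute your own bucket, account, and distribution.
policy = cloudfront_oac_read_policy("my-bucket", "123456789012", "DIST_ID")
print(json.dumps(policy, indent=2))
```

The condition on `AWS:SourceArn` is what keeps other CloudFront distributions, including ones in other accounts, from reading the bucket.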
Lifecycle Policies: Automate Cost Savings
S3 Lifecycle policies automatically transition objects between storage classes or delete them after a defined period. A well-configured lifecycle policy is the easiest way to reduce S3 costs without any ongoing maintenance.
Example lifecycle policy for an application log bucket:
- After 30 days: transition to S3 Standard-IA (access is infrequent after the first month)
- After 90 days: transition to S3 Glacier Flexible Retrieval (logs over 90 days old are rarely accessed)
- After 365 days: delete (logs over 1 year old have no business value)
For a bucket storing 1 TB of logs that grows by 50 GB/month, this policy can reduce annual S3 costs by 60-70% compared to leaving all logs in Standard storage.
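resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "log_rotation"
    status = "Enabled"

    # An empty filter applies the rule to every object in the bucket
    filter {}

    # Move to Standard-IA after 30 days
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    # Move to Glacier Flexible Retrieval after 90 days
    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    # Delete after one year
    expiration {
      days = 365
    }
  }
}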
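A rough way to sanity-check the savings from the 30/90/365-day schedule is to model the steady state, once a full year of logs has accumulated. The sketch below makes several simplifying assumptions that are not from the original: 30-day months, a constant 50 GB of new logs per month, the same 12-month retention in both scenarios, and storage charges only (no transition request fees, retrieval fees, or minimum object sizes). All names are hypothetical.

```python
PRICES = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.0036}


def storage_class_for_age(age_days):
    """Class an object sits in under the 30/90/365-day schedule (None once expired)."""
    if age_days >= 365:
        return None
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"


def steady_state_monthly_cost(gb_per_month, tiered=True, retention_months=12):
    """Monthly storage bill when each month's batch of logs is billed by its age."""
    total = 0.0
    for month in range(retention_months):
        cls = storage_class_for_age(month * 30) if tiered else "STANDARD"
        if cls is None:
            continue  # batch has already been deleted
        total += gb_per_month * PRICES[cls]
    return total


def reduction_percent(gb_per_month):
    flat = steady_state_monthly_cost(gb_per_month, tiered=False)
    tiered = steady_state_monthly_cost(gb_per_month, tiered=True)
    return (1 - tiered / flat) * 100


print(f"all Standard: ${steady_state_monthly_cost(50, tiered=False):.2f}/month")  # $13.80
print(f"with policy:  ${steady_state_monthly_cost(50, tiered=True):.2f}/month")   # $4.02
print(f"reduction:    {reduction_percent(50):.1f}%")                              # 70.9%
```

Under these assumptions the reduction comes out near 71%, at the top of the quoted range; the fees this model ignores would pull a real bill's savings somewhat lower.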
Real Use Cases: What Developers Actually Store in S3
User-Uploaded Media
Images, videos, documents. S3 + CloudFront is the standard architecture. Use pre-signed URLs for direct-upload from the browser.
Database Backups
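RDS keeps automated backups in S3 storage that AWS manages on your behalf, so they never appear in your own buckets. Snapshots you export to S3 and database dumps you upload yourself do, and lifecycle policies can archive those to Glacier. Cross-region replication adds disaster recovery.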
Data Lake Foundation
Raw CSV, JSON, and Parquet files from operational systems. Athena queries directly with SQL. No managed cluster required.
CI/CD Artifacts
Build artifacts from CodeBuild, compiled Lambda packages, Docker image layers (ECR uses S3 under the hood).
Static Website Hosting with S3
S3 static website hosting serves HTML files directly from a bucket. Combined with CloudFront for HTTPS and CDN caching, and Route 53 for custom domain routing, this is the standard way to host static sites and single-page applications at essentially zero cost for low-traffic sites.
Setup steps:
- Create an S3 bucket with the same name as your domain (e.g., www.example.com)
- Upload your build output (dist/ or build/ directory) to the bucket
- Create a CloudFront distribution with the S3 bucket as the origin, using Origin Access Control
- Request an ACM certificate for your domain (free, auto-renewing)
- Attach the certificate to your CloudFront distribution
- Create a Route 53 alias record pointing your domain to the CloudFront distribution
Cost for a static site with 10,000 visitors/month: approximately $0.50-$2.00/month total (S3 storage + CloudFront data transfer + Route 53 hosted zone). Compare to $5-25/month for a managed hosting platform doing the same thing.
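For React/Vue/Angular apps, add CloudFront custom error responses that return /index.html with a 200 status for both 403 and 404 errors. A private bucket behind Origin Access Control typically answers 403 rather than 404 for a missing key, so handling only 404 can leave deep links broken. This enables client-side routing to work correctly when a user bookmarks a deep link and navigates directly to it.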
Frequently Asked Questions
How much does S3 storage cost?
S3 Standard costs $0.023 per GB per month in the US East (N. Virginia) region. Storing 100 GB costs $2.30/month. Data transfer out to the internet costs $0.09 per GB after the first 100 GB free per month (an allowance shared across AWS services). PUT, GET, and other request costs are fractions of a cent per 1,000 requests. Most small applications spend under $5/month on S3.
Is S3 a database?
No. S3 is an object store, not a database. You cannot query S3 objects with SQL (though Athena can query structured files stored in S3). S3 is optimized for storing and retrieving whole files, not for individual record lookups, updates, or transactions. Use DynamoDB or RDS for application data and S3 for files.
How do I make an S3 object publicly accessible?
Disable Block Public Access for the bucket (and at the account level, if it is enabled there), then add a bucket policy that allows s3:GetObject for all principals (Principal: *). Setting the object ACL to public-read also works, but only if ACLs are enabled on the bucket, and new buckets have them disabled by default. For serving web assets publicly, the recommended approach is to keep the bucket private and use CloudFront with Origin Access Control.
What is the maximum file size in S3?
A single S3 object can be up to 5 TB. Files larger than 100 MB should be uploaded using the multipart upload API, which splits the file into parts (5 MB to 5 GB each) and uploads them in parallel. The AWS CLI and SDKs handle multipart uploads automatically for large files.
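When you do drive multipart uploads yourself, the main decision is the part size. The sketch below picks one that respects the 5 MB to 5 GB part range above plus S3's cap of 10,000 parts per upload. The function name and the 8 MiB preferred size are assumptions for illustration, and the limits are expressed in binary units (MiB, GiB).

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB     # smallest allowed part (the last part may be smaller)
MAX_PART = 5 * GIB     # largest allowed part
MAX_PARTS = 10_000     # maximum number of parts in one multipart upload


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)


def plan_multipart(file_size, preferred_part=8 * MIB):
    """Return (part_size, part_count) in bytes that fits the multipart limits."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Grow the part size when the preferred size would need too many parts.
    part_size = max(preferred_part, MIN_PART, ceil_div(file_size, MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("file is too large for a single multipart upload")
    return part_size, ceil_div(file_size, part_size)


print(plan_multipart(1 * GIB))         # (8388608, 128)
print(plan_multipart(5 * 1024 ** 4))   # a 5 TiB file needs larger parts
```

Larger parts mean fewer requests but more data to resend when a part fails, so the preferred size is a trade-off rather than a fixed rule.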
Verdict: Master S3 Early and Everything Else Gets Easier
S3 is the one AWS service that appears in almost every architecture diagram. Master the storage class selection (use Intelligent-Tiering when in doubt), lock down permissions with Block Public Access and IAM roles, set lifecycle policies from day one, and use CloudFront in front of everything you serve publicly. These four habits turn S3 from a potential liability into one of the most cost-efficient and reliable storage systems available anywhere.
S3 is the most unsung piece of the AI infrastructure stack: your data pipeline lives or dies here.
Most AI infrastructure discussions focus on compute: which GPU, which inference endpoint, which model. The less glamorous truth is that data ingestion, staging, and movement through S3 is where many production AI pipelines spend most of their wall-clock time and a surprising fraction of their cost. S3 GET request charges and data transfer costs can be larger line items in an AI application's AWS bill than the inference compute itself, particularly for document-processing and RAG-style workloads that read large volumes of files. The latency gap between S3 Standard and S3 Express One Zone shows up as wall-clock time in the same workloads.
S3 Express One Zone, launched in late 2023, is still underutilized. It offers up to 10x lower latency than standard S3 for hot-data access at roughly 5x the storage cost per GB. For AI applications that repeatedly access the same training datasets, embedding caches, or chunked document stores, that latency reduction compounds through the entire pipeline. The pricing premium can pay for itself quickly if your workload is read-heavy. Most developers default to Standard S3 for everything without evaluating whether Express One Zone fits their hot-path data patterns.
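The S3 skill that will save you money in an AI context: understand S3 Select and S3 Object Lambda. S3 Select lets you query inside Parquet or CSV files without reading the whole object, which can reduce data transfer by 90% for columnar filtering. That's a meaningful cost reduction for any AI application that reads structured datasets from S3. Check availability before you design around it, though: AWS closed S3 Select to new customers in 2024, so newer accounts may need Athena or client-side reads of columnar formats such as Parquet to get the same effect.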