In This Article
- The Six REST Principles That Actually Matter
- URL Naming Conventions
- HTTP Methods: When to Use Each One
- HTTP Status Codes: The Complete Guide
- Request and Response Design
- API Versioning Strategies
- Authentication: API Keys vs JWT vs OAuth 2.0
- Rate Limiting and Throttling
- OpenAPI and Swagger Documentation
- Designing APIs for AI Services
- Frequently Asked Questions
Key Takeaways
- What is the most important REST API design principle? Statelessness is the foundational REST principle that matters most in practice.
- When should I use PUT vs PATCH? Use PUT when you are replacing an entire resource — the client sends the complete representation and the server overwrites whatever was there.
- What API versioning strategy should I use? URL path versioning (/v1/, /v2/) is the most pragmatic choice for most teams in 2026.
- Should I use API keys, JWT, or OAuth 2.0? Use API keys for server-to-server integrations where a human is not directly in the flow — machine clients, CI/CD pipelines, data pipelines.
REST APIs are the connective tissue of modern software. Every mobile app, every SaaS product, every microservice architecture depends on them. And yet most developers learn REST by imitation — copying patterns from tutorials, inheriting designs from existing codebases, and discovering the mistakes only when the API is already in production and hard to change.
In 2026, good API design matters more than ever. AI services, streaming responses, asynchronous job patterns, and multi-tenant architectures have added new requirements on top of the fundamentals. This guide covers everything: the principles, the naming conventions, the status codes, the versioning debate, the authentication tradeoffs, and the new patterns that AI services demand.
The Six REST Principles That Actually Matter
REST's most consequential constraint is statelessness: every request must carry everything the server needs to process it, with no server-side sessions and no shared state between calls. Statelessness enables horizontal scaling and load balancing without coordination. The other five constraints (client-server separation, cacheability, layered system, uniform interface, and the optional code-on-demand) round out the style, but statelessness is the one whose violation hurts most in production.
REST (Representational State Transfer) was defined by Roy Fielding in his 2000 dissertation. Six architectural constraints define it. In practice, most "REST APIs" only implement some of them — but the ones you skip have real consequences.
Statelessness (the most important one)
Each request must contain everything the server needs to process it. No server-side sessions. No "remember what I asked last time." The server is an amnesiac — and that is exactly the right design. Statelessness is what makes horizontal scaling, load balancing, and caching possible without coordination overhead. If your API stores client context between calls, you have introduced hidden coupling that will hurt you during outages and scaling events.
Uniform Interface
Resources are identified in requests (via URIs), and they are manipulated through representations (JSON, XML). The interface is consistent — the same patterns apply everywhere in the API. This constraint is why REST APIs are so learnable: once you understand one endpoint, you have a mental model for all of them.
Resource-Based Architecture
Everything is a resource — a noun, not a verb. You do not call POST /createUser. You call POST /users. The resource is the center of the design. Actions are expressed through HTTP methods, not URL paths. This separation keeps APIs predictable and self-documenting.
The Three Most Violated REST Constraints
- Statelessness: Storing session state server-side (breaks scaling, causes bugs on load balancer switches)
- Uniform interface: Mixing verbs into URLs (/getUser, /deleteOrder)
- Layered system: Building direct database-to-client coupling with no abstraction layer
URL Naming Conventions
REST URL design has one central rule: use nouns, not verbs, because the HTTP method is the verb. Use lowercase plural nouns for collections (/users, /orders), nest resources to show hierarchy (/users/123/orders), use hyphens rather than underscores for multi-word paths, and keep every path lowercase because URL paths are case-sensitive on most servers. Most other URL design decisions follow from these rules.
URL design is the first thing consumers see. Good URLs are self-documenting. Bad URLs are a permanent source of confusion. These conventions are the closest thing to a universal standard that the REST world has.
Use Nouns, Not Verbs
The HTTP method is the verb. The URL is the noun. This combination gives you a complete action without redundancy.
# Good — resource-oriented
GET /users
GET /users/{id}
POST /users
PUT /users/{id}
DELETE /users/{id}
# Good — nested resources
GET /users/{id}/orders
GET /users/{id}/orders/{orderId}
POST /users/{id}/orders
# Bad — verb-in-URL anti-pattern
GET /getUser
POST /createUser
POST /deleteUser?id=123
GET /user/fetchAllOrders
Plural Nouns for Collections
Use /users, not /user. Use /orders, not /order. Collections are plural. Individual resources within a collection are accessed by ID: /users/42. The consistency matters more than the specific choice — pick one and apply it everywhere.
Lowercase, Hyphens, No Underscores
URLs are case-sensitive on most servers. Keep everything lowercase. Use hyphens to separate words in URL segments: /product-categories, not /productCategories or /product_categories. Hyphens are more readable and less prone to copy/paste issues.
Keep Nesting Shallow
Beyond two levels of nesting, URLs become unwieldy. /users/{id}/orders/{orderId}/items/{itemId} is the edge of acceptable. If you find yourself going deeper, consider flattening by exposing the nested resource directly: /order-items/{itemId}.
URL Naming Quick Reference
- Plural nouns for collections: /users, /products, /orders
- ID-based access: /users/{userId}
- Nested relationships: /users/{userId}/addresses
- Lowercase and hyphenated: /product-categories
- No verbs in URLs: never /getUser or /deleteOrder
- Query strings for filtering/sorting: /products?category=electronics&sort=price
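The conventions above are mechanical enough to check automatically. The following is a minimal sketch of such a check, not a complete linter: the function name `lint_path`, the verb list, and the regular expressions are all assumptions made for illustration.

```python
import re

# Hypothetical verb prefixes to flag; extend the list for your own API.
VERB_RE = re.compile(r"^(get|create|update|delete|fetch)([A-Z_-]|$)")
# A segment is one or more lowercase words joined by single hyphens.
SEGMENT_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def lint_path(path: str) -> list[str]:
    """Return a list of naming-convention problems for a URL path template."""
    problems = []
    for segment in path.strip("/").split("/"):
        if not segment:
            continue  # empty segment, e.g. the root path "/"
        if segment.startswith("{") and segment.endswith("}"):
            continue  # path parameter such as {userId}; not checked here
        if not SEGMENT_RE.match(segment):
            problems.append(f"'{segment}': use lowercase words separated by hyphens")
        if VERB_RE.match(segment):
            problems.append(f"'{segment}': looks like a verb; let the HTTP method carry the action")
    return problems


print(lint_path("/users/{userId}/addresses"))
print(lint_path("/getUser"))
```

A check like this could run in CI against the paths listed in an API specification, so naming drift is caught before an endpoint ships.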
HTTP Methods: When to Use Each One
Each HTTP method carries a semantic contract: GET reads without side effects (safe and idempotent), POST creates and is not idempotent, PUT replaces a full resource and is idempotent, PATCH modifies specific fields and may or may not be idempotent, DELETE removes a resource and is idempotent. Violating these contracts breaks caching, retry logic, and every HTTP-aware proxy in the client's stack.
The five primary HTTP methods map to the five fundamental operations on a resource. Using them correctly — and understanding the semantic guarantees each one carries — is the difference between a predictable API and one that surprises its consumers.
| Method | Purpose | Idempotent? | Safe? | Has Body? |
|---|---|---|---|---|
| GET | Retrieve a resource or collection | Yes | Yes | No |
| POST | Create a new resource | No | No | Yes |
| PUT | Replace an entire resource | Yes | No | Yes |
| PATCH | Partially update a resource | Depends | No | Yes |
| DELETE | Remove a resource | Yes | No | Optional |
Idempotency means calling the same operation multiple times produces the same result as calling it once. GET, PUT, and DELETE are idempotent — retrying them on a network failure is safe. POST is not — retrying a POST might create two records. This semantic difference should drive your retry logic and client error handling.
Safety means the operation does not modify server state. GET, HEAD, and OPTIONS are safe, though HEAD and OPTIONS are less commonly used. Safe methods can be prefetched and retried without side effects, and GET and HEAD responses can be cached.
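One way to let these guarantees drive client retry logic is sketched below. It is an illustration under assumptions, not a prescribed policy: the function name `should_retry`, the set of retryable status codes, and the default of three attempts are all hypothetical choices.

```python
# Methods whose semantics make a repeated call safe to send again.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}
# Assumed set of transient failures worth retrying.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def should_retry(method, status, attempt, max_attempts=3):
    """Decide whether a failed request may be retried.

    status is the HTTP status code, or None when the network failed
    before any response arrived. attempt counts requests already sent.
    """
    if attempt >= max_attempts:
        return False
    if method.upper() not in IDEMPOTENT_METHODS:
        return False  # e.g. POST: a retry might create a second record
    return status is None or status in RETRYABLE_STATUSES


print(should_retry("GET", None, attempt=1))
print(should_retry("POST", None, attempt=1))
```

The key design choice is that the method, not the endpoint, decides retry eligibility, which only works if the API honors the semantic contracts in the table above.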
The PUT vs PATCH Decision
Use PUT when replacing the entire resource state — the client sends a complete representation. Use PATCH for partial updates — only the fields in the request body change. PATCH is more efficient for large resources where you only need to update one or two fields. In most modern APIs, PATCH is the right default for user-initiated edits.
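The difference is easiest to see on a plain dictionary. This is a minimal sketch that assumes resources are stored as dicts and that PATCH uses simple merge semantics; the helper names `apply_put` and `apply_patch` are illustrative only.

```python
def apply_put(stored: dict, body: dict) -> dict:
    """PUT: the body becomes the whole resource; fields it omits are dropped."""
    return {**body, "id": stored["id"]}  # the server-assigned id is preserved


def apply_patch(stored: dict, body: dict) -> dict:
    """PATCH (merge-style): only the fields present in the body change."""
    return {**stored, **body}


user = {"id": "usr_1", "name": "Alice", "email": "alice@example.com"}
print(apply_put(user, {"name": "Alice C."}))    # email is gone
print(apply_patch(user, {"name": "Alice C."}))  # email is kept
```

Sending a partial body to a PUT endpoint silently deletes the omitted fields, which is why PATCH tends to be the safer default for edit forms.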
HTTP Status Codes: The Complete Guide
HTTP status codes communicate outcomes so clients can react without parsing error messages: 2xx means success (200 OK, 201 Created, 204 No Content), 4xx means client error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests), 5xx means server error. Never return 200 OK with an error body — it breaks every HTTP-aware tool in the client's stack.
Status codes are the API's primary mechanism for communicating outcomes. Using them correctly means clients can react intelligently without parsing error messages. Using them wrong — returning 200 OK with an error body, for example — breaks every HTTP-aware tool in the client's stack.
2xx — Success
| Code | Name | When to Use |
|---|---|---|
| 200 | OK | Successful GET, PUT, PATCH — response body contains the resource |
| 201 | Created | Successful POST that created a new resource — include Location header |
| 204 | No Content | Successful DELETE or PUT when no body is returned |
| 202 | Accepted | Request accepted for async processing — job is queued, not complete |
4xx — Client Errors
| Code | Name | When to Use |
|---|---|---|
| 400 | Bad Request | Malformed request syntax, invalid parameters, missing required fields |
| 401 | Unauthorized | Missing or invalid authentication credentials |
| 403 | Forbidden | Authenticated but not authorized — valid token, wrong permissions |
| 404 | Not Found | Resource does not exist at this URI |
| 409 | Conflict | Request conflicts with current state — duplicate email, version mismatch |
| 422 | Unprocessable Entity | Syntactically valid but semantically wrong — well-formed JSON, bad business logic |
| 429 | Too Many Requests | Rate limit exceeded — include Retry-After header |
5xx — Server Errors
| Code | Name | When to Use |
|---|---|---|
| 500 | Internal Server Error | Unexpected server failure — log it, never expose stack traces to clients |
| 502 | Bad Gateway | Upstream service returned invalid response |
| 503 | Service Unavailable | Server temporarily unable to handle requests — include Retry-After |
| 504 | Gateway Timeout | Upstream service did not respond in time |
The Status Code Anti-Pattern That Breaks Everything
Never return 200 OK with an error body like {"success": false, "error": "User not found"}. This breaks HTTP caching, monitoring tools, API gateways, and every client that does standard HTTP error handling. Return the appropriate 4xx or 5xx code. The body can contain error detail — but the status code must reflect the actual outcome.
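A common way to keep status codes honest is to map application errors to codes in one place. The sketch below is framework-independent and assumes a hypothetical exception hierarchy (`ApiError`, `NotFound`, `ValidationFailed`) and a `to_response` helper; none of these names come from a real library.

```python
import json


class ApiError(Exception):
    """Base class for errors that map to a specific HTTP status."""
    status = 500
    code = "INTERNAL_ERROR"


class NotFound(ApiError):
    status = 404
    code = "NOT_FOUND"


class ValidationFailed(ApiError):
    status = 422
    code = "VALIDATION_FAILED"


def to_response(exc: Exception) -> tuple[int, str]:
    """Translate an exception into (status code, JSON error body)."""
    if isinstance(exc, ApiError):
        status, code, message = exc.status, exc.code, str(exc)
    else:
        # Unknown failure: report 500 and hide internal details from the client.
        status, code, message = 500, "INTERNAL_ERROR", "Unexpected server error."
    return status, json.dumps({"error": {"code": code, "message": message}})


print(to_response(NotFound("User not found")))
```

Because the status is derived from the exception type, a handler cannot accidentally return 200 with an error payload.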
Request and Response Design
Use camelCase JSON field names, wrap collections in a consistent envelope with a data array and a meta object for pagination, and keep error responses consistent with a machine-readable code and human-readable message. Inconsistent response shapes — different structures for different endpoints — are the single most common complaint from API consumers.
Beyond the URL and method, the shape of your request and response bodies determines how pleasant your API is to consume. A few patterns have emerged as near-universal best practices in 2026.
Consistent JSON Structure
Use camelCase for JSON field names (matching JavaScript conventions). Return a consistent envelope for collections: a data array, a meta object for pagination, and optionally links for HATEOAS navigation. Keep error responses consistent: always include a machine-readable code and a human-readable message.
{
"data": [
{ "id": "usr_01J3K", "name": "Alice Chen", "email": "alice@example.com" },
{ "id": "usr_02M9P", "name": "Bob Davis", "email": "bob@example.com" }
],
"meta": {
"total": 1482,
"page": 1,
"perPage": 20,
"totalPages": 75
},
"links": {
"self": "/v1/users?page=1",
"next": "/v1/users?page=2",
"last": "/v1/users?page=75"
}
}
{
"error": {
"code": "VALIDATION_FAILED",
"message": "The request body contains invalid fields.",
"details": [
{ "field": "email", "issue": "Must be a valid email address" },
{ "field": "age", "issue": "Must be a positive integer" }
],
"requestId": "req_9xKpL3mNqR"
}
}
Pagination
Never return unbounded collections. Always paginate. Two patterns dominate. Offset pagination (?page=3&perPage=20) is simple and familiar, but inefficient on large datasets because the database must scan and discard every row before the requested page, and results can shift when rows are inserted between requests. Cursor pagination (?after=cursor_abc123) is more efficient and consistent for real-time data where new records are continuously inserted. Choose cursor pagination if you expect large datasets or real-time feeds; use offset pagination for everything else.
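Cursor pagination can be sketched in a few lines. The example below is a simplified in-memory illustration: it assumes rows are dicts sorted by an ascending integer `id`, and the names `encode_cursor`, `decode_cursor`, `paginate`, and the `nextCursor` field are hypothetical. A real service would typically push the `id > last_id` filter and the limit into the database query.

```python
import base64
import json


def encode_cursor(last_id: int) -> str:
    """Pack the last seen id into an opaque, URL-safe cursor string."""
    return base64.urlsafe_b64encode(json.dumps({"id": last_id}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["id"]


def paginate(rows, after=None, limit=20):
    """Return one page of rows that come after the cursor position."""
    last_id = decode_cursor(after) if after else None
    page = [r for r in rows if last_id is None or r["id"] > last_id][:limit]
    # There is another page only if the last row returned is not the final row.
    has_more = bool(page) and page[-1]["id"] < rows[-1]["id"]
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "meta": {"nextCursor": next_cursor}}


rows = [{"id": i} for i in range(1, 6)]
first = paginate(rows, limit=2)
second = paginate(rows, after=first["meta"]["nextCursor"], limit=2)
print([r["id"] for r in second["data"]])
```

Keeping the cursor opaque lets the server change what it encodes later without breaking clients.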
Filtering and Sorting
Keep filtering in query parameters. Keep it readable and predictable:
# Filtering
GET /orders?status=pending&customerId=usr_01J3K
# Sorting (prefix - for descending)
GET /products?sort=-price # price descending
GET /products?sort=name,-createdAt # name asc, date desc
# Field selection (sparse fieldsets)
GET /users?fields=id,name,email
# Search
GET /products?q=bluetooth+speaker
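On the server side, the sort syntax shown above needs to be parsed and validated before it reaches a query. Here is a minimal sketch; the function name `parse_sort` and the idea of an explicit allow-list of sortable fields are assumptions for illustration.

```python
def parse_sort(param: str, allowed: set[str]) -> list[tuple[str, str]]:
    """Parse 'name,-createdAt' into [('name', 'asc'), ('createdAt', 'desc')]."""
    result = []
    for token in param.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if token.startswith("-"):
            field, direction = token[1:], "desc"
        else:
            field, direction = token, "asc"
        if field not in allowed:
            # Reject unknown fields rather than passing them to the database.
            raise ValueError(f"Cannot sort by '{field}'")
        result.append((field, direction))
    return result


print(parse_sort("name,-createdAt", {"name", "createdAt", "price"}))
```

A `ValueError` raised here would normally be translated into a 400 Bad Request for the client.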
API Versioning Strategies
URL versioning (/v1/users) is the most visible and easiest for clients to adopt; Stripe and Twilio both put a version segment in their base paths, as do many other public APIs. Header versioning keeps URLs clean but requires client configuration. Never change an existing versioned endpoint's contract; instead, release a new version and deprecate the old one with clear sunset timelines.
Every API will need to change. The question is how you manage that change for clients already in production. Three versioning strategies are in common use, each with genuine tradeoffs.
| Strategy | Example | Pros | Cons | Best For |
|---|---|---|---|---|
| URL Path | /v1/users | Explicit, cacheable, browser-friendly | URL bloat, copy/paste errors | Most public APIs |
| Request Header | API-Version: 2 | Clean URLs, flexible routing | Invisible, harder to test, CDN complications | Internal APIs |
| Content-Type | Accept: application/vnd.api+json;v=2 | Semantically correct per HTTP spec | Complex, rarely understood by clients | Rarely recommended |
The pragmatic recommendation in 2026 is URL path versioning. It is explicit, works without configuration in every HTTP client, is trivially testable in a browser, and is visible in the base paths of major public APIs such as Stripe and Twilio. (GitHub's REST API, by contrast, selects versions with a request header.) The "clean URL" argument for header versioning is real but rarely worth the operational complexity it introduces.
Versioning Best Practices
- Start at /v1/ even if you think you will never need v2. You will.
- Never make breaking changes within a version: add fields, never remove or rename them.
- Support the previous version for at least 12 months after releasing the new one.
- Communicate deprecation timelines in response headers: Sunset: Sat, 01 Jan 2028 00:00:00 GMT
- Consider a changelog endpoint: GET /changelog that returns machine-readable version history.
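The Sunset header value uses the HTTP date format, which is easy to get subtly wrong by hand. A small sketch using Python's standard library follows; the helper name `deprecation_headers` is hypothetical, and it assumes the caller passes a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(sunset: datetime) -> dict[str, str]:
    """Build response headers announcing when a deprecated version goes away.

    sunset must be timezone-aware; it is converted to UTC and rendered
    as an HTTP date ending in 'GMT'.
    """
    return {"Sunset": format_datetime(sunset.astimezone(timezone.utc), usegmt=True)}


print(deprecation_headers(datetime(2028, 1, 1, tzinfo=timezone.utc)))
```

Attaching these headers in middleware for every deprecated route keeps the timeline consistent across the whole version.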
Authentication: API Keys vs JWT vs OAuth 2.0
Use API keys for server-to-server integrations where a human is not in the auth flow. Use JWTs (15-minute expiry plus refresh tokens) for stateless microservice authentication where you need to pass user identity across services without database lookups. Use OAuth 2.0 for user-delegated authorization — when third-party applications need access to resources on behalf of your users.
Authentication is the most consequential API design decision you make. It determines who can access your API, how that access is granted and revoked, and what your operational attack surface looks like. In 2026, three approaches dominate — and each belongs in a different context.
API Keys
API keys are long-lived secrets passed in request headers (X-API-Key: your_key or Authorization: Bearer your_key). They are simple to implement, simple to use, and appropriate for server-to-server integrations where a human is not in the authentication flow. The weakness is lifecycle management — API keys are effectively permanent until revoked, and they are difficult to scope finely.
JWT (JSON Web Tokens)
JWTs are signed tokens that encode claims (user ID, roles, permissions) and can be verified without a database lookup. The server signs the token at login; subsequent requests carry the token, and the server validates the signature. JWTs are ideal for stateless authentication in microservice architectures where you want to pass user identity across services without coordination. The weakness is revocation — a JWT is valid until expiry, so short expiry times (15 minutes) combined with a refresh token pattern are essential.
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.
eyJzdWIiOiJ1c3JfMDFKM0siLCJuYW1lIjoiQWxpY2UgQ2hlbiIsInJvbGUi
OiJhZG1pbiIsImV4cCI6MTc0NDIyMDAwMH0.
SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
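To make the sign-then-verify flow concrete, here is a minimal HS256 sketch using only the Python standard library. It is for illustration under stated assumptions: the function names `sign_jwt` and `verify_jwt` are hypothetical, only the `exp` claim is checked, and a production service would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Create header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if the signature is valid and the token has not expired."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # A token without an exp claim is treated as expired.
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


token = sign_jwt({"sub": "usr_01J3K", "exp": int(time.time()) + 900}, b"demo-secret")
print(verify_jwt(token, b"demo-secret")["sub"])
```

Note that verification needs only the shared secret and the clock, with no database lookup, which is the property that makes JWTs attractive across services and also why revocation is hard.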
OAuth 2.0
OAuth 2.0 is the standard for delegated authorization — when a third-party application needs to act on behalf of your users. "Sign in with Google," GitHub's API integrations, Slack app permissions — these are all OAuth 2.0. It is more complex to implement correctly than API keys or JWT, but it is the right tool when you are building a platform that other developers will build on top of.
| Method | Best For | Revocable? | Stateless? | Complexity |
|---|---|---|---|---|
| API Keys | Server-to-server, developer integrations | Yes | No (DB lookup) | Low |
| JWT | Microservices, stateless user auth | Complex | Yes | Medium |
| OAuth 2.0 | Third-party app authorization, platforms | Yes | No | High |
"Authentication is not a feature to add later. The shape of your auth design propagates into every endpoint, every permission check, and every security audit. Build it right from the start."
Rate Limiting and Throttling
In 2026, with AI-powered clients capable of generating thousands of requests per second, rate limiting is foundational, not optional. Always communicate limit status via RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers, as proposed in the IETF draft on RateLimit header fields. Return 429 Too Many Requests when limits are exceeded and include a Retry-After header (defined in RFC 9110) with seconds until the window resets.
Rate limiting protects your API from abuse, prevents individual clients from degrading service for others, and gives you control over operational costs. In 2026, with AI-powered clients capable of generating thousands of requests per second, rate limiting is not optional — it is foundational.
Rate Limiting Headers
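Always communicate rate limit status to clients. The emerging convention comes from the IETF HTTPAPI working group's draft on RateLimit header fields (not RFC 9110, which defines Retry-After). Earlier revisions of that draft use three headers, with the reset value expressed as seconds until the window resets:
RateLimit-Limit: 100
RateLimit-Remaining: 43
RateLimit-Reset: 37
# On 429 Too Many Requests:
Retry-After: 37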
Rate Limiting Strategies
Fixed window is the simplest — 100 requests per minute, resetting on the clock. It allows burst spikes at window boundaries. Sliding window smooths this out by tracking requests in a rolling time window. Token bucket is the most flexible — clients accumulate tokens over time and spend them on requests, allowing short bursts while enforcing average rate limits. Token bucket is the right choice for APIs with variable-cost operations, like AI inference endpoints.
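The token bucket described above fits in a small class. This is a single-process sketch under assumptions: the class name `TokenBucket`, the `try_consume` method, and the injectable `clock` parameter are illustrative, and a multi-server deployment would need shared storage for the bucket state.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity` while enforcing an average rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = capacity          # start full so an initial burst is allowed
        self.updated_at = clock()

    def try_consume(self, cost=1.0):
        """Spend `cost` tokens if available; return False when rate limited."""
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=5, refill_per_second=1)
print([bucket.try_consume() for _ in range(6)])
```

Passing a larger `cost` for expensive operations is what makes this approach suit variable-cost endpoints: a heavy inference call can spend many tokens while a cheap lookup spends one.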
Rate Limiting by Tier
- Free tier: 60 requests/minute, 1,000 requests/day
- Developer tier: 300 requests/minute, 10,000 requests/day
- Production tier: 1,000 requests/minute, unlimited daily
- Enterprise: Custom limits negotiated per contract
Rate limit at multiple granularities: per-second burst protection + per-minute sustained limits + per-day quotas.
OpenAPI and Swagger Documentation
OpenAPI 3.1 is the definitive standard for REST API documentation in 2026. A single machine-readable YAML or JSON file generates interactive Swagger UI, client SDKs in any language, test suites, and mock servers automatically. An undocumented API is a liability; a well-documented OpenAPI spec is a force multiplier — consumers can onboard without asking questions.
An undocumented API is not an asset — it is a liability. In 2026, OpenAPI 3.1 is the definitive standard for REST API documentation. It is machine-readable YAML or JSON that generates interactive documentation, client SDKs, test suites, and mock servers automatically.
openapi: 3.1.0
info:
  title: Precision API
  version: 1.0.0
  description: REST API for the Precision platform
paths:
  /v1/users/{userId}:
    get:
      summary: Get a user by ID
      operationId: getUser
      tags: [Users]
      parameters:
        - name: userId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: User found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/User'
        '404':
          description: User not found
      security:
        - BearerAuth: []
The key benefit of OpenAPI is not the documentation output — it is the contract. An OpenAPI spec becomes the single source of truth that development teams, QA, and consumers all reference. Tools like Swagger UI, Redoc, and Stoplight generate interactive documentation from the spec automatically. Prism generates a mock server. Speakeasy and Stainless generate typed client SDKs in multiple languages.
Designing APIs for AI Services
AI services require REST patterns that standard CRUD APIs never need: Server-Sent Events for streaming LLM token output (eliminating 5–30 second blank screens), async job endpoints for long-running inference (POST to create job, GET to poll status), and explicit model version headers for reproducibility. These patterns are now first-class design requirements for any API that wraps AI functionality.
AI services impose new requirements on REST API design. Language model inference, image generation, speech transcription, and embedding generation all share characteristics that do not fit standard request/response patterns cleanly: long processing times, streaming outputs, high per-request costs, and asynchronous job workflows.
Streaming Responses with Server-Sent Events
When a language model generates a response, it produces tokens one at a time over seconds. Waiting for the complete response before returning it creates a terrible user experience — the screen sits blank for 5–30 seconds. Streaming with Server-Sent Events (SSE) pushes each token to the client as it is generated, creating the familiar "typewriter" effect.
# Response headers for streaming
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Accel-Buffering: no
# SSE event stream body (a blank line terminates each event)
data: {"delta": "The ", "index": 0}

data: {"delta": "quick ", "index": 1}

data: {"delta": "brown fox", "index": 2}

data: [DONE]
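On the wire, each SSE event is one or more `data:` lines followed by a blank line. The sketch below shows one way to produce that framing from a stream of tokens; the helper names `sse_event` and `stream_tokens` are hypothetical, and the `[DONE]` sentinel simply mirrors the example above.

```python
import json


def sse_event(payload) -> str:
    """Format one SSE event: a 'data:' line terminated by a blank line."""
    return f"data: {json.dumps(payload)}\n\n"


def stream_tokens(tokens):
    """Yield SSE-framed chunks for each token, then a final [DONE] marker."""
    for index, delta in enumerate(tokens):
        yield sse_event({"delta": delta, "index": index})
    yield "data: [DONE]\n\n"


for chunk in stream_tokens(["The ", "quick ", "brown fox"]):
    print(chunk, end="")
```

Most web frameworks accept a generator like `stream_tokens` as a streaming response body, provided the Content-Type is set to text/event-stream.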
Async Job Pattern for Long Operations
Some AI operations — video generation, document processing, fine-tuning jobs — take minutes or hours. The correct pattern is to immediately return a job ID with 202 Accepted, and provide a status polling endpoint. Better still, accept a webhook URL so the server can push results when complete rather than requiring the client to poll.
# Step 1 — Submit job
POST /v1/jobs
{
"type": "document_analysis",
"input": "s3://bucket/document.pdf",
"webhookUrl": "https://your-app.com/webhook/jobs"
}
# Response: 202 Accepted
{
"jobId": "job_7xKpL3mNqR",
"status": "queued",
"estimatedCompletionSeconds": 45,
"statusUrl": "/v1/jobs/job_7xKpL3mNqR"
}
# Step 2 — Poll status (or wait for webhook)
GET /v1/jobs/job_7xKpL3mNqR
# Response: 200 OK when complete
{
"jobId": "job_7xKpL3mNqR",
"status": "completed",
"result": { "summary": "...", "entities": [...] }
}
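The submit-then-poll flow can be sketched with an in-memory store. Everything here is an assumption for illustration: the `JOBS` dict stands in for a database or queue, `complete_job` stands in for a background worker, and the error code `JOB_NOT_FOUND` is hypothetical.

```python
import uuid

JOBS = {}  # stand-in for a persistent job store


def submit_job(job_type, input_uri):
    """Handle POST /v1/jobs: record the job and return 202 immediately."""
    job_id = f"job_{uuid.uuid4().hex[:10]}"
    JOBS[job_id] = {"type": job_type, "input": input_uri, "status": "queued", "result": None}
    return 202, {"jobId": job_id, "status": "queued", "statusUrl": f"/v1/jobs/{job_id}"}


def complete_job(job_id, result):
    """Called by the (hypothetical) worker when processing finishes."""
    JOBS[job_id].update(status="completed", result=result)


def get_job(job_id):
    """Handle GET /v1/jobs/{jobId}: report status, plus the result once done."""
    job = JOBS.get(job_id)
    if job is None:
        return 404, {"error": {"code": "JOB_NOT_FOUND", "message": "No such job."}}
    body = {"jobId": job_id, "status": job["status"]}
    if job["status"] == "completed":
        body["result"] = job["result"]
    return 200, body


status, accepted = submit_job("document_analysis", "s3://bucket/document.pdf")
print(status, get_job(accepted["jobId"]))
```

The submit handler never waits for the work itself, which is what keeps the request fast regardless of how long the job runs.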
AI API Design Checklist
- Support streaming for any operation that generates text token by token (SSE or WebSocket)
- Use 202 + async job pattern for operations exceeding ~10 seconds
- Expose per-request cost metadata in response headers: X-Tokens-Used: 1842
- Rate limit by cost units (tokens, compute seconds) not just request count
- Provide a cancel endpoint for long-running jobs: DELETE /v1/jobs/{jobId}
- Include model version in responses for reproducibility: X-Model-Version: gpt-4o-2025-11
The bottom line: REST API design in 2026 comes down to four non-negotiable rules — stateless requests, correct HTTP methods with their semantic contracts, meaningful status codes that never lie, and OpenAPI 3.1 documentation that keeps clients unblocked. Get those right and your API is predictable, cacheable, and scalable. Get them wrong and every client integration becomes a debugging session.
Frequently Asked Questions
What is the most important REST API design principle?
Statelessness is the foundational REST principle that matters most in practice. Every request must contain all the information needed to process it — the server holds no session state between calls. This enables horizontal scaling, caching, and resilience that stateful server architectures cannot match. In 2026, with distributed microservices and serverless functions as the norm, designing for statelessness from day one prevents a category of architectural problems that are very painful to refactor away later.
When should I use PUT vs PATCH?
Use PUT when you are replacing an entire resource — the client sends the complete representation and the server overwrites whatever was there. Use PATCH when you are making a partial update — only the fields included in the request body are changed. In practice, PATCH is more common for user-facing APIs because clients rarely need to send every field. PUT is more appropriate for idempotent configuration operations where you want to ensure a resource matches an exact known state.
What API versioning strategy should I use?
URL path versioning (/v1/, /v2/) is the most pragmatic choice for most teams in 2026. It is explicit, easy to test in a browser, works correctly with caching proxies and CDN edge networks, and requires zero special client configuration. Header-based versioning is cleaner in theory but adds complexity for clients and is invisible in browser URL bars. Start with URL versioning and only reconsider if you have a specific technical constraint that forces it.
Should I use API keys, JWT, or OAuth 2.0?
Use API keys for server-to-server integrations where a human is not directly in the flow — machine clients, CI/CD pipelines, data pipelines. Use JWT for APIs where you need to pass user identity and claims without a database lookup on every request. Use OAuth 2.0 when third-party applications need to act on behalf of your users. In practice, many production APIs use all three: OAuth for third-party clients, JWT for internal services, and API keys for developer integrations.