A production-ready ML API that accepts JSON input, runs it through your sklearn pipeline, and returns predictions with confidence scores — deployed and accessible via a public URL.
Wrap the Model in FastAPI
A model sitting on your laptop is worthless. Wrap it in a REST API and it becomes a service that your frontend, other APIs, and automation tools can all use.
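The service below expects a model_pipeline.pkl holding a dict with the model, the scaler, and the feature names. As a reminder, it could have been produced with something like this sketch (assumes the breast cancer dataset and a RandomForest, as trained earlier in the course; adjust to your own model):

```python
# Sketch: how model_pipeline.pkl is assumed to have been produced.
# Dataset and hyperparameters here are illustrative, not prescriptive.
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
scaler = StandardScaler().fit(data.data)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(data.data), data.target)

# The API below expects exactly this dict layout
joblib.dump(
    {"model": model, "scaler": scaler, "features": list(data.feature_names)},
    "model_pipeline.pkl",
)
```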
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np
from typing import List

app = FastAPI(title="Cancer Classifier API")

# Load once on startup
pipeline = joblib.load("model_pipeline.pkl")
model = pipeline["model"]
scaler = pipeline["scaler"]

class PredictRequest(BaseModel):
    features: List[float]  # 30 feature values

class PredictResponse(BaseModel):
    prediction: int  # 0=benign, 1=malignant
    label: str
    confidence: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest):
    if len(req.features) != 30:
        raise HTTPException(400, "Expected 30 features")
    X = np.array(req.features).reshape(1, -1)
    X_scaled = scaler.transform(X)
    pred = int(model.predict(X_scaled)[0])
    prob = float(model.predict_proba(X_scaled)[0, pred])
    return PredictResponse(
        prediction=pred,
        label="Malignant" if pred == 1 else "Benign",
        confidence=prob,
    )
```

Add Health Check and Model Info Endpoints
Production APIs need observability endpoints. A health check lets load balancers verify the service is running. A model info endpoint documents what the API expects.
```python
import datetime

@app.get("/")
def health():
    return {
        "status": "ok",
        "model": "RandomForest v1.0",
        "timestamp": datetime.datetime.now().isoformat(),
    }

@app.get("/model-info")
def model_info():
    return {
        "algorithm": "Random Forest Classifier",
        "n_estimators": model.n_estimators,
        "n_features": model.n_features_in_,
        "classes": ["Benign (0)", "Malignant (1)"],
        "training_features": pipeline["features"],
    }
```
```shell
# Run the server
uvicorn app:app --reload

# Test the endpoint
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"features": [17.99,10.38,122.8,1001,0.1184,...]}'
```

Docker for Reproducible Deployment
Docker packages your app and all its dependencies into a container that runs identically everywhere. This is the standard for deploying ML models.
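The Dockerfile below installs from a requirements.txt; a minimal one for this app might look like the following (unpinned here for brevity; in practice, pin the versions you trained with):

```text
fastapi
uvicorn[standard]
scikit-learn
joblib
numpy
```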
```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy app and model
COPY app.py model_pipeline.pkl ./

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

```shell
# Build the image
docker build -t ml-api .

# Run it locally
docker run -p 8000:8000 ml-api

# Test it
curl http://localhost:8000/
```

Deploy to Railway
Railway deploys Docker containers with one command. The free tier is sufficient for demos and prototypes, and you get a public URL immediately.
```shell
# Install the Railway CLI
brew install railway

# Login and init
railway login
railway init

# Deploy (Railway detects the Dockerfile automatically)
railway up
# Deploying... done!
# https://ml-api-production.up.railway.app

# Test the public URL
curl https://your-url.up.railway.app/
# {"status":"ok","model":"RandomForest v1.0",...}
```

Model file size: If your .pkl file is large (over 100 MB), store it in cloud storage (S3, R2) and download it on startup instead of baking it into the Docker image. Use the huggingface_hub library for large model weights.
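That startup download can be a few lines. A sketch using boto3 (the bucket and key names are placeholders, boto3 is an assumed extra dependency, and AWS credentials are expected to come from the environment):

```python
# Sketch: fetch the model from cloud storage on startup instead of
# baking it into the image. Bucket/key names are placeholders.
import os
import joblib

MODEL_PATH = "model_pipeline.pkl"

def ensure_model(bucket: str = "my-model-bucket", key: str = "model_pipeline.pkl"):
    """Download the pickle once; reuse the local copy on later restarts."""
    if not os.path.exists(MODEL_PATH):
        import boto3  # assumed dependency for S3 access
        boto3.client("s3").download_file(bucket, key, MODEL_PATH)
    return joblib.load(MODEL_PATH)

# In app.py, replace the plain joblib.load(...) with:
# pipeline = ensure_model()
```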
What You Learned Today
- Wrapped a scikit-learn model in a FastAPI service with proper Pydantic request/response models
- Added health check and model info endpoints for production observability
- Containerized the API with Docker for reproducible deployment anywhere
- Deployed to Railway and got a public URL in under 5 minutes
Go Further on Your Own
- Add input validation that checks feature ranges and returns a 400 error for outliers
- Add a /batch endpoint that accepts a list of feature arrays and returns all predictions at once
- Set up a GitHub Actions workflow that builds and pushes the Docker image to Railway on every push to main
Course Complete!
You finished all 5 days. Ready to go deeper?
Reserve Your Bootcamp Seat
Want live instruction and hands-on projects? Join the AI bootcamp: 3 days, 5 cities.