Build intuition for probability: distributions, conditional probability, Bayes' theorem, and how probability underpins every AI model.
Most machine learning models output a probability, or a score that behaves like one, whether they show it to you or not. Classification confidence scores, recommendation rankings, and anomaly detection thresholds can all be read as probability statements. Understanding probability means understanding what your model is actually saying.
Two events A and B:
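For reference, the relations between two events that the coin simulation estimates can be written in conventional notation. This is a sketch of the standard definitions, not necessarily the list the original page showed here; the symbols $H_1$ and $H_2$ (first and second coin landing heads) are introduced only for illustration.

```latex
\begin{align*}
P(B \mid A) &= \frac{P(A \cap B)}{P(A)}, \qquad P(A) > 0
  && \text{(conditional probability)} \\
P(A \cap B) &= P(A)\,P(B \mid A)
  && \text{(product rule)} \\
P(A \cap B) &= P(A)\,P(B)
  && \text{(holds exactly when $A$ and $B$ are independent)} \\[4pt]
P(H_1 \cap H_2) &= 0.5 \times 0.5 = 0.25, \qquad
P(H_2 \mid H_1) = \frac{0.25}{0.5} = 0.5
  && \text{(two fair coins)}
\end{align*}
```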
import numpy as np
np.random.seed(42)
n = 100_000 # simulate many trials
# P(two fair coins both heads)
coins = np.random.choice([0, 1], size=(n, 2))
p_both_heads = np.mean(coins.sum(axis=1) == 2)
print(f"P(HH): {p_both_heads:.4f}") # ~0.25
# Conditional: P(second head | first head)
first_head = coins[coins[:, 0] == 1]
p_second_given_first = np.mean(first_head[:, 1] == 1)
print(f"P(H2|H1): {p_second_given_first:.4f}") # ~0.50 (independent)

Bayes' theorem is how you update a probability estimate when you get new evidence: P(A|B) = P(B|A) × P(A) / P(B)
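It may help to see where the formula comes from. The sketch below derives it from the two ways of writing the joint probability, then expands the denominator with the law of total probability. In the medical-test example, $A$ would be "has the disease" and $B$ "tests positive", and the last line corresponds to the `p_positive` computation.

```latex
\begin{align*}
P(A \cap B) &= P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}, \qquad P(B) > 0 \\
P(B) &= P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)
\end{align*}
```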
# Medical test for rare disease
# Disease prevalence: 1% of population
# Test accuracy: 99% true positive rate, 1% false positive rate
p_disease = 0.01 # prior
p_positive_given_disease = 0.99 # sensitivity
p_positive_given_healthy = 0.01 # false positive rate
# P(positive test)
p_positive = (p_positive_given_disease * p_disease +
              p_positive_given_healthy * (1 - p_disease))
# Bayes: P(disease | positive test)
p_disease_given_positive = (p_positive_given_disease * p_disease) / p_positive
print(f"P(disease | positive test): {p_disease_given_positive:.1%}")
# ~50% — even with a 99% accurate test, only 50% of positives are true
# Because the disease is rare (1%)

Normal: Bell curve, symmetric. Good for height, measurement errors.
Binomial: Count of successes in N trials. Good for click-through rates, pass/fail.
Poisson: Count of events per time period. Good for arrivals, bug rates, rare events.
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)  # reproducible samples
fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Continuous: 1,000 draws from a standard normal
axes[0].hist(np.random.normal(0, 1, 1000), bins=30)
axes[0].set_title('Normal(μ=0, σ=1)')

# Discrete: number of successes in 100 trials with p=0.3
axes[1].hist(np.random.binomial(100, 0.3, 1000), bins=30)
axes[1].set_title('Binomial(n=100, p=0.3)')

# Discrete: event counts per interval with average rate 5
axes[2].hist(np.random.poisson(5, 1000), bins=30)
axes[2].set_title('Poisson(λ=5)')

plt.tight_layout()
plt.savefig('distributions.png')
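Histograms show the shapes, and a numeric check can complement them. The sketch below draws larger samples from the same three distributions and compares the sample mean and variance with the textbook values (mean 0 and variance 1 for the standard normal, mean np and variance np(1-p) for the binomial, mean and variance both λ for the Poisson). The sample size, the `cases` and `summary` names, and the use of `np.random.default_rng` instead of `np.random.seed` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumption: any fixed seed would do
n_samples = 200_000              # assumption: large enough for stable estimates

# name -> (samples, theoretical mean, theoretical variance)
cases = {
    "Normal(0, 1)": (rng.normal(0.0, 1.0, n_samples), 0.0, 1.0),
    "Binomial(100, 0.3)": (rng.binomial(100, 0.3, n_samples),
                           100 * 0.3, 100 * 0.3 * 0.7),
    "Poisson(5)": (rng.poisson(5.0, n_samples), 5.0, 5.0),
}

# name -> (sample mean, sample variance, theoretical mean, theoretical variance)
summary = {}
for name, (samples, mean_theory, var_theory) in cases.items():
    summary[name] = (samples.mean(), samples.var(), mean_theory, var_theory)
    print(f"{name:<20} mean {samples.mean():7.3f} (theory {mean_theory:5.1f})  "
          f"var {samples.var():7.3f} (theory {var_theory:5.1f})")
```

The Poisson row is a useful sanity check in practice: if count data has a variance far above its mean, a plain Poisson model is probably a poor fit.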