Time Series Analysis: Forecasting with Python

Time series analysis guide: trends, seasonality, stationarity, Prophet, ARIMA, and building forecasting models in Python with real code examples.

15 min read · Kaggle Top 200 author · Last updated Apr 2026 · 5 US bootcamp cities

Key Takeaways

Time series forecasting is one of the most valuable skills in data science because almost every business decision involves predicting something that changes over time. This guide covers decomposition, Prophet, ARIMA, and ML-based forecasting with Python.

01

Decomposition

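import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Load the series with the date column as a DatetimeIndex
df = pd.read_csv('sales.csv', parse_dates=['date'], index_col='date')

# period = observations per seasonal cycle: 52 suits weekly data with a yearly cycle;
# for daily data use 7 (weekly cycle) or 365 (yearly cycle)
result = seasonal_decompose(df['revenue'], model='additive', period=52)
result.plot()
plt.show()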

Trend: Long-run direction (growing, declining, flat).
Seasonality: Repeating patterns (weekly, monthly, annual).
Residuals: Remaining random noise after removing trend and seasonality.
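To make the three components concrete, here is a minimal by-hand sketch of a classical additive decomposition using only NumPy and pandas. It is an illustration of the idea, not a description of statsmodels internals. The data is synthetic, and the names weekly_pattern, seasonal_means, and the 7-day period are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (hypothetical data): linear trend + weekly pattern + noise
rng = np.random.default_rng(0)
period = 7
n = period * 20
t = np.arange(n)
weekly_pattern = np.array([5.0, 3.0, 0.0, -2.0, -4.0, -3.0, 1.0])  # sums to zero
series = pd.Series(
    100 + 0.5 * t + weekly_pattern[t % period] + rng.normal(0, 0.5, n),
    index=pd.date_range("2025-01-01", periods=n, freq="D"),
)

# Trend: centered moving average spanning exactly one seasonal cycle
trend = series.rolling(window=period, center=True).mean()

# Seasonality: mean detrended value at each position in the cycle, centered on zero
detrended = series - trend
position = t % period
seasonal_means = detrended.groupby(position).mean()
seasonal_means = seasonal_means - seasonal_means.mean()
seasonal = pd.Series(seasonal_means.to_numpy()[position], index=series.index)

# Residual: whatever the trend and seasonal components do not explain
resid = series - trend - seasonal

print(seasonal_means.round(2))
print(f"residual std: {resid.std():.2f}")
```

The estimated seasonal means should land close to the weekly pattern that was baked into the data, and the residual should look like the injected noise. The first and last three trend values are NaN because the centered window runs off the ends of the series. An even period such as 12 or 52 needs a slightly different moving average, which this sketch leaves out.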

02

Prophet Forecasting

from prophet import Prophet
import pandas as pd

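# Prophet requires columns named 'ds' (date) and 'y' (value).
# 'date' is the index of df, so move it back to a column before renaming.
# A separate frame keeps the original df intact for later sections.
prophet_df = df.reset_index().rename(columns={'date': 'ds', 'revenue': 'y'})

model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    changepoint_prior_scale=0.05  # Prophet's default; larger values allow a more flexible trend
)
model.add_country_holidays(country_name='US')
model.fit(prophet_df)

# Extend the frame 90 days past the last observation, then predict
future = model.make_future_dataframe(periods=90)
forecast = model.predict(future)
fig = model.plot(forecast)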

03

ML Approach: XGBoost with Lagged Features

import pandas as pd
from xgboost import XGBRegressor
from sklearn.metrics import mean_absolute_error

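# Create lagged features (each one may use only values from before the prediction date)
df['lag_7'] = df['revenue'].shift(7)
df['lag_28'] = df['revenue'].shift(28)
# shift(1) first so the rolling window ends at the previous day and excludes the target
df['rolling_7_mean'] = df['revenue'].shift(1).rolling(7).mean()
df['day_of_week'] = df.index.dayofweek
df['month'] = df.index.month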

# Chronological train/test split (NEVER random)
split = '2025-10-01'
train = df[df.index < split].dropna()
test = df[df.index >= split].dropna()

features = ['lag_7', 'lag_28', 'rolling_7_mean', 'day_of_week', 'month']
model = XGBRegressor(n_estimators=300, random_state=42)
model.fit(train[features], train['revenue'])
preds = model.predict(test[features])
print(f"MAE: {mean_absolute_error(test['revenue'], preds):.2f}")

04

Evaluating Forecasts

Always compare against a naive baseline (last value, same period last year). If your model cannot beat the naive forecast, it provides no value.
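A rough sketch of the two baselines on synthetic data, assuming daily observations with a weekly cycle. The season here is "same weekday last week" rather than "same period last year", purely to keep the example short; the series and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily revenue with a weekly cycle (synthetic data)
rng = np.random.default_rng(1)
n = 140
weekly = np.array([20.0, 10.0, 0.0, -5.0, -10.0, -15.0, 0.0])
revenue = pd.Series(
    200 + weekly[np.arange(n) % 7] + rng.normal(0, 2, n),
    index=pd.date_range("2025-01-01", periods=n, freq="D"),
)

# One-step-ahead baselines: each forecast uses only earlier observations
naive = revenue.shift(1)           # yesterday's value
seasonal_naive = revenue.shift(7)  # same weekday last week

# Score both on the final 28 days
actual = revenue.iloc[-28:]
mae_naive = (actual - naive.iloc[-28:]).abs().mean()
mae_seasonal = (actual - seasonal_naive.iloc[-28:]).abs().mean()

print(f"Naive MAE:          {mae_naive:.2f}")
print(f"Seasonal naive MAE: {mae_seasonal:.2f}")
```

On a strongly seasonal series like this one, the seasonal naive forecast is the harder baseline to beat, so it is the one a model should be judged against.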

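Metrics: MAE (mean absolute error, interpretable in original units), MAPE (mean absolute percentage error, easy to communicate but undefined when an actual value is zero and inflated when actuals are close to zero), RMSE (root mean squared error, penalizes large errors more than MAE).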
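The three metrics are short enough to write out directly. This is a plain NumPy sketch with made-up numbers; the function names are chosen for the example and are not a library API.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, in the units of the data."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean squared error; squaring weights large misses more heavily."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error; breaks down if y_true contains zeros."""
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 300.0, 360.0])

print(f"MAE:  {mae(y_true, y_pred):.2f}")    # MAE:  15.00
print(f"RMSE: {rmse(y_true, y_pred):.2f}")   # RMSE: 21.21
print(f"MAPE: {mape(y_true, y_pred):.2f}%")  # MAPE: 6.25%
```

RMSE comes out higher than MAE here because the single 40-unit miss dominates once the errors are squared.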

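Use TimeSeriesSplit from scikit-learn for cross-validation: each fold trains on earlier observations and tests on the block that follows, so chronological order is preserved. Never use regular KFold on time series, because most of its folds train on observations that come after the test block.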
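A small sketch of what the folds look like, using twelve placeholder rows in place of a real feature matrix:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Twelve time-ordered observations (placeholder data, already sorted by date)
X = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
folds = list(tscv.split(X))

for i, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"Fold {i}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

The training window grows with each fold, and every test index is later than every training index in the same fold. The rows must already be in chronological order, since the splitter works on position and knows nothing about dates.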

05

Frequently Asked Questions

What is stationarity in time series?

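A stationary time series has a constant mean and variance over time, and its autocorrelation depends only on the lag between observations. Many classical forecasting models (ARIMA) assume stationarity, at least after differencing. Test with the Augmented Dickey-Fuller (ADF) test (adfuller in statsmodels): its null hypothesis is that the series has a unit root, so a small p-value (commonly below 0.05) is evidence of stationarity. If the series is non-stationary, apply differencing (subtract the previous value from each value) until the ADF test indicates stationarity.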

When should I use Prophet vs ARIMA?

Use Prophet for business time series with multiple seasonalities, holiday effects, and trend changepoints. It is easy to use, and its fitted components (trend, seasonality, holidays) are easy to interpret. Use ARIMA for simpler time series where the statistical model needs to be transparent and well-specified. In practice, try both and compare on a validation set.

How do I avoid data leakage in time series?

Never use random train/test splits. Always split chronologically — earlier dates for training, later dates for testing. Use TimeSeriesSplit from scikit-learn for cross-validation. Ensure lagged features use only past values, not any information that would not be available at prediction time.
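A common way leakage slips in is a rolling feature whose window includes the value being predicted. A tiny sketch with made-up numbers shows the difference one shift makes:

```python
import pandas as pd

revenue = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], name="revenue")

# Leaky: the window ends at the current row, so it contains the target itself
leaky = revenue.rolling(3).mean()

# Safe: shift by one first, so the window ends at the previous row
safe = revenue.shift(1).rolling(3).mean()

print(pd.DataFrame({"revenue": revenue, "leaky": leaky, "safe": safe}))
# At row 3 (revenue 40): leaky = mean(20, 30, 40) = 30, safe = mean(10, 20, 30) = 20
```

A model trained on the leaky column can look excellent in testing and then fail in production, because the current value is not available when the forecast is made.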

What is the naive forecast and why does it matter?

A naive forecast predicts the next value as equal to the last observed value (or the same period last year for seasonal data). It is the simplest possible baseline. Your model must outperform the naive forecast to be useful. A model that cannot beat last week's sales as a prediction for this week provides no value.

Note: Information reflects early 2026.

AI Instructor & Founder, Precision AI Academy

Bo has trained 400+ professionals in applied AI across federal agencies and Fortune 500 companies.

The Bottom Line
You don't need to master everything at once. Start with the fundamentals in Time Series Analysis, apply them to a real project, and iterate. The practitioners who build things always outpace those who just read about building things.

Our Take

LLMs are surprisingly good at time series — but not in the way you'd expect.

The most interesting development in time series analysis over the last 18 months is not a new statistical model — it is the emergence of foundation models for time series. Google's TimesFM and Amazon's Chronos are pre-trained models that can zero-shot forecast on new time series data without fitting a model from scratch. For organizations with limited historical data or many independent time series to forecast simultaneously, these models are meaningfully better than traditional approaches like ARIMA or even Prophet. The mainstream time series curriculum has not caught up to this shift yet.

That said, the foundation model hype requires a reality check. For well-understood time series with clean data, seasonal patterns, and sufficient history — retail sales forecasting, utility demand planning, financial time series — classical approaches like Facebook Prophet, statsmodels SARIMAX, and gradient boosted trees with lag features are still highly competitive and more interpretable. The business cases where interpretability matters (regulated industries, decision-makers who want to understand the model) still favor classical methods. Our bet is that foundation models for time series will dominate novel or data-sparse forecasting use cases, while classical methods hold their ground for high-stakes production forecasting where auditability is required.

If you are learning time series for a data science or ML engineering role: master the fundamentals first — stationarity, autocorrelation, seasonal decomposition. These concepts are prerequisite to understanding why any model works or fails. Then layer in Prophet for rapid prototyping and XGBoost with engineered lag features for production-grade forecasting. Foundation models are worth learning once you have the fundamentals.

Published By

Precision AI Academy

Practitioner-focused AI education · 2-day in-person bootcamp in 5 U.S. cities

Precision AI Academy publishes deep-dives on applied AI engineering for working professionals. Founded by Bo Peng (Kaggle Top 200) who leads the in-person bootcamp in Denver, NYC, Dallas, LA, and Chicago.
