Key Takeaways
- Decompose before forecasting: Split your time series into trend, seasonality, and residuals before choosing a model. Understanding which components are present determines the best approach.
- Prophet is the best starting point: Facebook's Prophet handles seasonality, holidays, and trend changepoints automatically. It is the most productive starting tool for business forecasting.
- Never use random splits for time series: Always split chronologically — earlier dates for training, later for testing. Random splits allow future data to leak into training, producing inflated performance estimates.
- ML with lagged features often beats ARIMA: For complex time series with multiple drivers, XGBoost with lagged features and date features often outperforms ARIMA. Compare both on a proper validation set.
Time series forecasting is one of the most valuable skills in data science because almost every business decision involves predicting something that changes over time. This guide covers decomposition, Prophet, and ML-based forecasting with Python, with guidance on when ARIMA is the better fit.
Decomposition
```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

df = pd.read_csv('sales.csv', parse_dates=['date'], index_col='date')

# period=52 assumes weekly observations with annual seasonality
result = seasonal_decompose(df['revenue'], model='additive', period=52)
result.plot()
```
Trend: Long-run direction (growing, declining, flat).
Seasonality: Repeating patterns (weekly, monthly, annual).
Residuals: Remaining random noise after removing trend and seasonality.
Prophet Forecasting
```python
import pandas as pd
from prophet import Prophet

# Prophet requires columns named 'ds' (date) and 'y' (value);
# move the DatetimeIndex back into a column before renaming
df = df.reset_index().rename(columns={'date': 'ds', 'revenue': 'y'})

model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    changepoint_prior_scale=0.05,
)
model.add_country_holidays(country_name='US')
model.fit(df)

future = model.make_future_dataframe(periods=90)
forecast = model.predict(future)
fig = model.plot(forecast)
```
ML Approach: XGBoost with Lagged Features
```python
import pandas as pd
from xgboost import XGBRegressor
from sklearn.metrics import mean_absolute_error

# Lagged features use only past values, so they are available at prediction time
df['lag_7'] = df['revenue'].shift(7)
df['lag_28'] = df['revenue'].shift(28)
# Shift by 1 before rolling so the window excludes the current day's
# revenue -- including it would leak the target into a feature
df['rolling_7_mean'] = df['revenue'].shift(1).rolling(7).mean()
df['day_of_week'] = df.index.dayofweek
df['month'] = df.index.month

# Chronological train/test split (NEVER random)
split = '2025-10-01'
train = df[df.index < split].dropna()
test = df[df.index >= split].dropna()

features = ['lag_7', 'lag_28', 'rolling_7_mean', 'day_of_week', 'month']
model = XGBRegressor(n_estimators=300, random_state=42)
model.fit(train[features], train['revenue'])

preds = model.predict(test[features])
print(f"MAE: {mean_absolute_error(test['revenue'], preds):.2f}")
```
Evaluating Forecasts
Always compare against a naive baseline (last value, same period last year). If your model cannot beat the naive forecast, it provides no value.
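Both baselines are one line of pandas each. A sketch on invented daily data (the series and dates below are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Assumed daily revenue series purely for illustration
rng = np.random.default_rng(1)
revenue = pd.Series(
    200 + rng.normal(0, 10, 120),
    index=pd.date_range('2025-07-01', periods=120, freq='D'),
)
test = revenue['2025-10-01':]

# Naive baseline 1: carry yesterday's value forward
naive_last = revenue.shift(1)['2025-10-01':]
# Naive baseline 2 (seasonal): same weekday last week
naive_week = revenue.shift(7)['2025-10-01':]

mae_last = (test - naive_last).abs().mean()
mae_week = (test - naive_week).abs().mean()
print(f"last-value MAE: {mae_last:.2f}, seasonal-naive MAE: {mae_week:.2f}")
```

Whichever baseline MAE is lower becomes the bar your real model has to clear.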
Metrics: MAE (mean absolute error, interpretable in original units), MAPE (mean absolute percentage error, easy to communicate), RMSE (penalizes large errors more than MAE).
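All three metrics are available in (or one step from) scikit-learn. A small worked example with made-up numbers:

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
)

# Toy values chosen to make the arithmetic easy to check
y_true = np.array([100.0, 110.0, 95.0, 120.0])
y_pred = np.array([102.0, 108.0, 99.0, 110.0])

mae = mean_absolute_error(y_true, y_pred)              # same units as the data
mape = mean_absolute_percentage_error(y_true, y_pred)  # fraction; x100 for percent
rmse = np.sqrt(mean_squared_error(y_true, y_pred))     # penalizes large errors

print(f"MAE={mae:.2f}  MAPE={mape:.3f}  RMSE={rmse:.2f}")
```

Note that RMSE is at least MAE on the same errors, and the gap widens when a few errors are much larger than the rest.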
Use TimeSeriesSplit from scikit-learn for cross-validation — it creates folds that preserve chronological order. Never use regular KFold on time series.
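A quick sketch of what TimeSeriesSplit produces: every fold trains on an earlier window and tests on the window immediately after it.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Ten observations in chronological order
X = np.arange(10).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X))
for train_idx, test_idx in splits:
    # Training indices always precede test indices
    print(f"train={train_idx}  test={test_idx}")
```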
Frequently Asked Questions
What is stationarity in time series?
A stationary time series has constant mean and variance over time. Many classical forecasting models (ARIMA) require stationarity. Test with the Augmented Dickey-Fuller test (from statsmodels). If non-stationary, apply differencing (subtract the previous value from each value) until the ADF test indicates stationarity.
When should I use Prophet vs ARIMA?
Use Prophet for business time series with multiple seasonalities, holiday effects, and trend changepoints. It is easy to use and its components are easy to interpret. Use ARIMA for simpler time series where the statistical model needs to be transparent and well-specified. In practice, try both and compare on a validation set.
How do I avoid data leakage in time series?
Never use random train/test splits. Always split chronologically — earlier dates for training, later dates for testing. Use TimeSeriesSplit from scikit-learn for cross-validation. Ensure lagged features use only past values, not any information that would not be available at prediction time.
What is the naive forecast and why does it matter?
A naive forecast predicts the next value as equal to the last observed value (or the same period last year for seasonal data). It is the simplest possible baseline. Your model must outperform the naive forecast to be useful. A model that cannot beat last week's sales as a prediction for this week provides no value.
Forecasting turns data into competitive advantage. Get the skills.
Join professionals from Denver, NYC, Dallas, LA, and Chicago for two days of hands-on AI and tech training. $1,490. October 2026. Seats are limited.
Reserve Your Seat
Note: Information reflects early 2026.