Bitcoin Price Prediction Using Machine Learning (VAR, XGBoost, Facebook Prophet) — Python Tutorial

During a wild year in the markets, the riskiest asset is also one of the most popular. After a historic climb and crash three years ago, bitcoin is at it again in 2020. Like last time, a surge of investor enthusiasm is driving it to all-time highs, and bitcoin enthusiasts say this is only the beginning. They see bitcoin as a revolution in financial systems; that has been the message for roughly eleven years, and people are finally starting to listen. But many experts watching bitcoin warn that the asset is too volatile for average investors and that it is a purely speculative bet that won't be able to scale.
Time Series • Forecasting • Python

Forecasting Bitcoin Prices (VAR, XGBoost, Prophet) — A Practical Python Walkthrough

Bitcoin is a high-volatility asset. That volatility is exactly why forecasting is hard—and why the modeling workflow matters more than any single algorithm. This article outlines a clean, reproducible pipeline for forecasting BTC using three complementary approaches: a multivariate VAR model, a feature-based XGBoost regressor, and Prophet for decomposable trend + seasonality baselines.

Scope & assumptions

  • This is forecasting, not trading advice. A model that reduces error does not automatically translate into a profitable strategy.
  • We forecast prices with daily data. If you need intraday predictions, the data engineering and feature design change materially.
  • We compare three modeling philosophies: multivariate time-series dynamics (VAR), supervised regression on engineered lags (XGBoost), and decomposable trend/seasonality baselines (Prophet).
Strong opinion: the highest-leverage improvement in BTC forecasting is not “a fancier model”—it’s tighter data hygiene, leakage prevention, and evaluation discipline.

Data sources

This workflow uses BTC price history and an external macro signal that can plausibly co-move with BTC under certain market regimes (e.g., a dollar index / USD strength proxy). The goal is not to claim a causal story, but to test whether a multivariate signal improves forecast stability.

  • Bitcoin historical data (Open/High/Low/Close/Volume)
  • USD index / macro proxy (to test multivariate relationships)

If you don’t have the macro series, you can still run the full pipeline using BTC-only (Prophet + XGBoost with lags).

Baseline imports
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
      

Preprocessing that actually holds up

Financial time series are typically non-stationary: trends and regime shifts dominate. You can either: (1) model non-stationarity explicitly (Prophet), or (2) transform the series to stabilize it (log + differencing) before fitting models like VAR.

Recommended transformations

  • Sort by time and enforce a daily frequency.
  • Forward-fill only when it’s justified (e.g., market holidays in macro series).
  • Log transform for price-like series, then difference to remove trend.
  • Split chronologically (no random train/test split).
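The checklist above can be sketched end-to-end on a toy frame (synthetic data; the `close` column name is illustrative):

```python
import numpy as np
import pandas as pd

# Toy daily series with a missing calendar day, to illustrate the checklist.
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-04", "2020-01-05"])
toy = pd.DataFrame({"close": [100.0, 101.0, 103.0, 102.0]}, index=idx)

toy = toy.sort_index().asfreq("D")    # enforce daily frequency (inserts 2020-01-03 as NaN)
toy["close"] = toy["close"].ffill()   # forward-fill only the calendar gap

log_close = np.log(toy["close"])      # log transform stabilizes variance
log_ret = log_close.diff().dropna()   # first difference removes trend

# Chronological split: hold out the most recent observation(s), never a random split.
train, test = log_ret.iloc[:-1], log_ret.iloc[-1:]
```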
Load, align, and build a modeling frame
# Example structure: you can adapt this to your data files.
# btc_df columns: Date, Close (and optionally Volume)
# usd_df columns: Date, Close (macro proxy)

btc_df = pd.read_csv("data/BTC-USD.csv", parse_dates=["Date"])
usd_df = pd.read_csv("data/USD.csv", parse_dates=["Date"])

btc = btc_df[["Date", "Close"]].rename(columns={"Close": "btc_close"}).sort_values("Date")
usd = usd_df[["Date", "Close"]].rename(columns={"Close": "usd_index"}).sort_values("Date")

df = (btc.merge(usd, on="Date", how="inner")
        .dropna()
        .set_index("Date")
        .asfreq("D"))

# If your macro series has gaps on weekends/holidays:
df["usd_index"] = df["usd_index"].ffill()

df.head()
      
Log + differencing (for VAR-style stationarity)
# Transform prices to stabilize variance; then difference to reduce trend.
log_df = np.log(df[["btc_close", "usd_index"]])

# First difference is often enough; second difference is sometimes used, but can over-whiten.
diff_df = log_df.diff().dropna()

diff_df.head()
      

Model 1: VAR (Vector Autoregression)

VAR is a multivariate time-series model: each variable is explained by its own lags and the lags of the other variables. It’s a solid choice when you believe your series move together over time and you want an interpretable linear baseline.

What VAR does well

  • Captures lagged cross-effects between BTC and macro signals.
  • Fast to fit and easy to diagnose.
  • Baseline you can trust before moving to heavier models.

Where it struggles: strong nonlinearity, structural breaks, and “shock” regimes (very common in crypto).

Fit VAR on differenced log series
from statsmodels.tsa.api import VAR

# Hold out the last N days for a simple backtest
N_TEST = 14
train = diff_df.iloc[:-N_TEST]
test  = diff_df.iloc[-N_TEST:]

model = VAR(train)

# Let information criteria pick a sensible lag cap
res = model.fit(maxlags=30, ic="aic")

# Forecast in transformed space (diff of log)
fc_diff_log = res.forecast(y=train.values[-res.k_ar:], steps=N_TEST)
fc_diff_log = pd.DataFrame(fc_diff_log, index=test.index, columns=train.columns)

fc_diff_log.head()
      
Invert transforms back to price space
def invert_diff_log_forecast(log_history: pd.DataFrame, fc_diff_log: pd.DataFrame) -> pd.DataFrame:
    """
    Given:
      log_history: historical log series (level), indexed by date
      fc_diff_log: forecasted first-differences of log series
    Returns:
      forecasted log levels, then exponentiated price levels.
    """
    last_log = log_history.iloc[-1]
    fc_log = fc_diff_log.cumsum().add(last_log, axis="columns")
    fc_level = np.exp(fc_log)
    return fc_level

log_history_train = log_df.loc[train.index]
var_forecast_level = invert_diff_log_forecast(log_history_train, fc_diff_log)

var_forecast_level.head()
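To score this forecast, a small helper can align predictions and actuals on dates and report MAE/RMSE. This is a sketch with synthetic numbers; in the pipeline above the call would use `df["btc_close"]` over the test window and `var_forecast_level["btc_close"]`:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error

def score_forecast(actual: pd.Series, forecast: pd.Series) -> dict:
    """MAE/RMSE computed on the overlapping dates of actual vs forecast."""
    joined = pd.concat([actual.rename("y"), forecast.rename("yhat")], axis=1).dropna()
    mae = float(mean_absolute_error(joined["y"], joined["yhat"]))
    rmse = float(np.sqrt(mean_squared_error(joined["y"], joined["yhat"])))
    return {"mae": mae, "rmse": rmse, "n": len(joined)}

# Synthetic usage (illustrative prices, not real BTC data):
idx = pd.date_range("2020-11-01", periods=5, freq="D")
actual = pd.Series([100, 102, 101, 105, 107], index=idx, dtype=float)
fc = pd.Series([101, 101, 103, 104, 108], index=idx, dtype=float)
scores = score_forecast(actual, fc)
```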
      

Model 2: XGBoost on lag features

XGBoost is not a time-series model by default. It becomes one when you build supervised features that encode time: lags, rolling statistics, and cross-series lags (e.g., lagged USD index changes).

Why this helps

  • Nonlinear relationships (common in crypto) are easier to approximate.
  • Flexible features let you incorporate volume, volatility, and macro context.
  • Strong tabular baseline with a known bias-variance profile.
Create lag features (clean, minimal, leakage-safe)
import xgboost as xgb

# Use log returns for more stable modeling
ret = np.log(df[["btc_close", "usd_index"]]).diff().dropna()

def make_lag_features(frame: pd.DataFrame, lags=(1, 2, 3, 7, 14), roll_windows=(7, 14)) -> pd.DataFrame:
    out = frame.copy()
    for col in frame.columns:
        for L in lags:
            out[f"{col}_lag{L}"] = frame[col].shift(L)
        for w in roll_windows:
            # Shift by 1 before rolling so the window never includes the current day:
            # the current-day BTC return is the target, and must not leak into features.
            out[f"{col}_rollmean{w}"] = frame[col].shift(1).rolling(w).mean()
            out[f"{col}_rollstd{w}"] = frame[col].shift(1).rolling(w).std()
    return out.dropna()

feat = make_lag_features(ret)
target = feat["btc_close"]  # current-day log return, predicted from past-only features
# Drop BOTH same-day columns: keeping same-day usd_index would leak contemporaneous info.
X = feat.drop(columns=["btc_close", "usd_index"])
y = target

# Chronological split (no leakage)
N_TEST = 60
X_train, X_test = X.iloc[:-N_TEST], X.iloc[-N_TEST:]
y_train, y_test = y.iloc[:-N_TEST], y.iloc[-N_TEST:]

model = xgb.XGBRegressor(
    objective="reg:squarederror",
    n_estimators=600,
    learning_rate=0.03,
    max_depth=4,
    subsample=0.9,
    colsample_bytree=0.9,
    reg_alpha=0.0,
    reg_lambda=1.0,
    random_state=42
)

model.fit(X_train, y_train)
pred_ret = pd.Series(model.predict(X_test), index=y_test.index, name="pred_btc_logret")

pred_ret.head()
      
Convert predicted returns back to a price path
# Build a forecasted price series from returns
last_price = df["btc_close"].loc[pred_ret.index.min() - pd.Timedelta(days=1)]
pred_price = (np.exp(pred_ret.cumsum()) * last_price).rename("xgb_btc_price")

pred_price.head()
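The `TimeSeriesSplit` imported at the top is the natural next step for validating this model: walk-forward folds that always train strictly on the past. A self-contained sketch with synthetic data, using a naive per-fold mean baseline where the XGBoost fit would go (column names and split sizes are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit

# Synthetic feature frame standing in for X/y from the section above.
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "f3"])
y_demo = pd.Series(rng.normal(size=200))

tscv = TimeSeriesSplit(n_splits=5, test_size=20)
fold_errors = []
for train_idx, test_idx in tscv.split(X_demo):
    # Each fold trains on the past and evaluates on the next window: no look-ahead.
    assert train_idx.max() < test_idx.min()
    # Replace this naive baseline with model.fit(...) / model.predict(...) per fold.
    y_hat = np.full(len(test_idx), y_demo.iloc[train_idx].mean())
    fold_errors.append(float(np.mean(np.abs(y_demo.iloc[test_idx].values - y_hat))))
```

Reporting the mean and spread of `fold_errors` gives a far more honest error estimate than a single chronological split.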
      

Model 3: Prophet baseline

Prophet is useful as a decomposable baseline: it models trend + seasonality + holidays (optional) and provides uncertainty intervals. In crypto, it’s rarely “the best” model in absolute error terms across all regimes, but it is a strong governance baseline: if your fancy model can’t beat Prophet out-of-sample, it’s not production-ready.

Note: older notebooks often use fbprophet. The current package is typically prophet.

Prophet forecast (BTC close)
from prophet import Prophet

btc_prophet = df.reset_index()[["Date", "btc_close"]].rename(columns={"Date":"ds", "btc_close":"y"})

# Optional: log-transform can help for exponential growth regimes
# btc_prophet["y"] = np.log(btc_prophet["y"])

m = Prophet(daily_seasonality=True)
m.fit(btc_prophet)

future = m.make_future_dataframe(periods=60, freq="D")
forecast = m.predict(future)

forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail()
      

Evaluation: keep it honest

With time series, evaluation failures are usually process failures: leakage, bad splits, or comparing models in different spaces (returns vs prices) without translating consistently. The clean approach: evaluate on the same target and horizon using chronological splits.

Metrics that are meaningful

  • MAE (robust, interpretable)
  • RMSE (penalizes big misses—important in crypto)
  • Directional accuracy (optional, if you care about sign of returns)
Example: MAE/RMSE on a held-out window
def rmse(y_true, y_pred):
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))

# Example evaluation for XGBoost price path (pred_price) vs actual btc_close
actual = df["btc_close"].loc[pred_price.index]
mae_val = float(mean_absolute_error(actual, pred_price))
rmse_val = rmse(actual, pred_price)

mae_val, rmse_val
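Directional accuracy, mentioned above, has no standard sklearn metric; a minimal sketch (the price series here are illustrative, not real data):

```python
import numpy as np
import pandas as pd

def directional_accuracy(actual_prices: pd.Series, predicted_prices: pd.Series) -> float:
    """Fraction of days where predicted and actual day-over-day moves share a sign."""
    actual_dir = np.sign(actual_prices.diff().dropna())
    pred_dir = np.sign(predicted_prices.diff().dropna())
    aligned = pd.concat([actual_dir, pred_dir], axis=1).dropna()
    return float((aligned.iloc[:, 0] == aligned.iloc[:, 1]).mean())

idx = pd.date_range("2020-11-01", periods=5, freq="D")
actual = pd.Series([100, 102, 101, 105, 107], index=idx, dtype=float)
pred = pd.Series([100, 103, 104, 103, 108], index=idx, dtype=float)
da = directional_accuracy(actual, pred)
```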
      

Conclusions and recommended next upgrades

  • VAR gives a disciplined multivariate baseline—good for understanding lag relationships, weak in nonlinear shock regimes.
  • XGBoost is the best “tabular” workhorse once you engineer lag/rolling features and enforce leakage-safe splits.
  • Prophet is a governance baseline: if you can’t beat it out-of-sample, your pipeline isn’t ready.
Forward-looking upgrade path: add volatility features (ATR/rolling std), regime detection (simple clustering on volatility), and a rolling-window backtest (walk-forward validation) before you even consider deep learning.
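As a starting point for the volatility-feature and regime-detection upgrades, a minimal sketch on synthetic returns (the 14-day window and the 0.75 quantile threshold are illustrative choices, not recommendations):

```python
import numpy as np
import pandas as pd

# Synthetic daily log returns standing in for the real BTC return series.
rng = np.random.default_rng(7)
ret = pd.Series(rng.normal(scale=0.02, size=365),
                index=pd.date_range("2020-01-01", periods=365, freq="D"))

vol = ret.rolling(14).std()  # rolling volatility feature

# Crude regime flag: top-quartile volatility days count as a "high-vol regime".
high_vol = (vol > vol.quantile(0.75)).astype(int)

features = pd.DataFrame({"ret": ret, "vol_14d": vol, "high_vol": high_vol}).dropna()
```

Feeding `vol_14d` and `high_vol` into the XGBoost feature frame (lagged, as above) is a cheap first pass at regime awareness before reaching for clustering or deep models.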

Notebook

Full notebook (GitHub): crypto_forecast.ipynb

If you want a cleaner rendered view (no GitHub UI friction): View on nbviewer