“Where is the stock going next week?”: Forecasting, Mechanics, and the Volatility Puzzle

Financial Econometrics: Part 15

Prerequisites: ARIMA Framework

Introduction: The “So What?” Moment

We have spent many articles building a rigorous mathematical toolkit. We know how to strip trends using differencing (\(d\)), capture momentum with Autoregression (\(p\)), and model shocks with Moving Averages (\(q\)). We have learned the Box-Jenkins dance of Identification, Estimation, and Diagnostics.

But your boss doesn’t care about AIC scores or Ljung-Box p-values. They care about one thing: “Where is the stock going next week?”

In this final article, we will open the “black box” of the ARIMA model to understand how it actually generates predictions. We will also cover the critical theoretical conditions that keep our models from exploding and conclude with the single biggest limitation of ARIMA in finance: its inability to measure fear (volatility).

Part 1: The Fine Print – Causality and Invertibility

Before we trust a model to predict the future, we must ensure it is mathematically stable. In Time Series theory, stability is governed by twin concepts: Causality and Invertibility. You will often see these terms in software warnings (e.g., “Roots outside the unit circle”), so it is vital to understand them intuitively.

1. Causality (The Stability of AR)

Causality doesn’t just mean “cause and effect.” In this context, it means that the current value depends only on the past, not the future.

For an AR(1) model (\(Y_t = \phi Y_{t-1} + \epsilon_t\)) to be causal and stationary, the coefficient must satisfy \(|\phi| < 1\).

  • If \(|\phi| < 1\): The influence of the past fades away. A shock from 100 days ago matters almost zero today. This is realistic.
  • If \(|\phi| > 1\): The process is Explosive. A shock from 100 days ago has grown exponentially and now dominates the system. This describes a bubble, not a stable market.
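To see this distinction concretely, here is a minimal sketch (using simulated data and hypothetical coefficients) comparing a stable AR(1) with \(\phi = 0.7\) against an explosive one with \(\phi = 1.05\):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_ar1(phi, n=200):
    """Simulate Y_t = phi * Y_{t-1} + eps_t starting from Y_0 = 0."""
    y = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y

stable = simulate_ar1(0.7)      # |phi| < 1: shocks fade away
explosive = simulate_ar1(1.05)  # |phi| > 1: shocks compound exponentially

print(f"Stable series, max |Y|:    {np.max(np.abs(stable)):.1f}")
print(f"Explosive series, max |Y|: {np.max(np.abs(explosive)):.1f}")
```

The stable series wanders within a narrow band, while the explosive one grows without bound because each old shock is multiplied by \(\phi\) again at every step.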

2. Invertibility (The Uniqueness of MA)

Invertibility is a bit more abstract. It applies to Moving Average (MA) models.

Technically, an MA process is not unique. You can produce the exact same Autocorrelation Function (ACF) with \(\theta = 0.5\) as you can with \(\theta = 2.0\).

However, if \(\theta = 2.0\), the weights placed on past errors increase as we go back in time. This implies that a random error from 1950 has more influence on today’s price than an error from yesterday. That makes no sense.

The Rule: We always choose the Invertible solution (where \(|\theta| < 1\)). This ensures that recent information is weighted more heavily than ancient history.
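You can verify this non-uniqueness with a one-line calculation: the lag-1 autocorrelation of an MA(1) process is \(\theta / (1 + \theta^2)\), which returns the same value for \(\theta\) and \(1/\theta\). A quick sketch:

```python
def ma1_lag1_acf(theta):
    """Lag-1 autocorrelation of an MA(1) process: theta / (1 + theta^2)."""
    return theta / (1 + theta**2)

print(ma1_lag1_acf(0.5))  # 0.4
print(ma1_lag1_acf(2.0))  # 0.4 -- identical ACF, but only theta=0.5 is invertible
```

Both parameterizations are observationally equivalent, so we break the tie by always picking the invertible one, \(|\theta| < 1\).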

Part 2: Under the Hood – Estimation Methods

When you type model.fit() in Python, how does the computer actually find the best numbers for \(\phi\) and \(\theta\)? It generally uses one of two methods:

1. Method of Moments (Yule-Walker Equations)

This is the “old school” analytical method. It relies on the fact that there is a direct link between the model parameters (\(\phi\)) and the observed correlations (ACF).

  • Intuition: If we know the correlation between Today and Yesterday is 0.8, we can mathematically solve for \(\phi\).
  • Pros: Fast and computationally cheap.
  • Cons: Not very accurate for MA models; can fail with small sample sizes.
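As a sketch of the idea, for an AR(1) the Yule-Walker solution reduces to reading \(\phi\) off the sample autocorrelations. Here we apply the `yule_walker` helper from statsmodels to a simulated series (the parameter values are illustrative):

```python
import numpy as np
from statsmodels.regression.linear_model import yule_walker

rng = np.random.default_rng(0)

# Simulate an AR(1) with a known coefficient phi = 0.8
n, phi = 5000, 0.8
y = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

# Yule-Walker: solve for phi directly from the sample autocorrelations
rho_hat, sigma_hat = yule_walker(y, order=1)
print(f"Estimated phi: {rho_hat[0]:.3f}")  # close to the true value 0.8
```

No iterative optimization is needed; the estimate comes from solving a small linear system linking the ACF to the parameters.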

2. Maximum Likelihood Estimation (MLE)

This is the modern standard (and the default in most Python libraries).

  • Intuition: The computer guesses a set of parameters (e.g., \(\phi=0.5\)), calculates the probability (likelihood) of observing your specific dataset given those parameters, and then iteratively tweaks the parameters to maximize that probability.
  • Pros: Precise, works for complex models (ARIMA, GARCH), and provides standard errors.
  • Cons: Computationally intensive (requires an iterative solver).
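To demystify the “guess and tweak” loop, here is a toy conditional MLE for an AR(1), minimizing the Gaussian negative log-likelihood with scipy’s generic optimizer. The simulated data, starting values, and parameterization are assumptions for illustration, not what statsmodels does internally:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Simulate an AR(1) with phi = 0.6 and unit-variance shocks
n, true_phi = 3000, 0.6
y = np.zeros(n)
for t in range(1, n):
    y[t] = true_phi * y[t - 1] + rng.standard_normal()

def neg_log_likelihood(params, y):
    """Conditional Gaussian negative log-likelihood of an AR(1)."""
    phi, log_sigma = params  # log-parameterize sigma to keep it positive
    sigma2 = np.exp(log_sigma) ** 2
    resid = y[1:] - phi * y[:-1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + resid**2 / sigma2)

# The solver iteratively tweaks (phi, sigma) to maximize the likelihood
result = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(y,))
phi_hat = result.x[0]
print(f"MLE estimate of phi: {phi_hat:.3f}")  # close to the true value 0.6
```

This is exactly the “propose parameters, score the data, improve” cycle described above, just written out by hand for the simplest possible model.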

Part 3: The Art of Forecasting

Forecasting is not about telling the future; it is about projecting the Expected Value (\(E[Y_{t+h}]\)) and the Uncertainty (\(Var(Y_{t+h})\)).

The Mechanism: Mean Reversion

Most financial ARIMA models (for returns) are stationary. This means they are Mean Reverting.

  • Short Term (1-2 days): The forecast is driven by recent momentum (AR) and recent shocks (MA).
  • Long Term (10+ days): The influence of the past fades. The forecast line will simply converge to the long-term average (usually zero for returns).

This is why you should be skeptical of long-term ARIMA forecasts. They usually just say, “In the long run, things will be average.”
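For a mean-zero AR(1), the h-step-ahead point forecast has a closed form: \(E[Y_{t+h} \mid Y_t] = \phi^h Y_t\). A quick sketch with hypothetical numbers shows the geometric decay toward the long-run mean:

```python
def ar1_forecast(phi, y_last, h):
    """h-step-ahead point forecast of a mean-zero AR(1): phi**h * y_last."""
    return phi**h * y_last

phi, y_last = 0.5, 2.0  # hypothetical coefficient and last observed return (2%)

for h in [1, 2, 5, 10, 20]:
    print(f"h={h:2d}: forecast = {ar1_forecast(phi, y_last, h):.5f}")
# The forecast halves at every step, collapsing toward zero within a few days
```

By h = 10 the forecast is already indistinguishable from the long-run average, which is precisely why multi-week ARIMA point forecasts carry so little information.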

The Fan Chart (Confidence Intervals)

The most valuable output of an ARIMA model is not the line, but the Confidence Interval (the shaded region).

  • As we forecast further into the future, our uncertainty grows.
  • This creates a “Fan Chart” or “Cone of Uncertainty.”
  • If the price stays within this cone, the model is working. If it breaks out, a structural break has likely occurred.
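The widening of the cone also has a closed form for an AR(1): the h-step forecast variance is \(\sigma^2 \sum_{j=0}^{h-1} \phi^{2j}\), which grows with h and levels off at \(\sigma^2 / (1 - \phi^2)\), the unconditional variance. A sketch with hypothetical parameters:

```python
import numpy as np

phi, sigma = 0.5, 1.0  # hypothetical AR coefficient and shock std dev

def forecast_std(h):
    """h-step-ahead forecast standard deviation for an AR(1)."""
    return sigma * np.sqrt(sum(phi ** (2 * j) for j in range(h)))

for h in [1, 2, 5, 10]:
    half_width = 1.96 * forecast_std(h)
    print(f"h={h:2d}: 95% interval half-width = {half_width:.3f}")
# The interval widens with h, then saturates at 1.96 * sigma / sqrt(1 - phi^2)
```

The fan therefore flares out quickly and then stabilizes: for a stationary process, uncertainty grows but does not grow forever.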

Part 4: Practical Lab – Generating the Forecast

Let’s take the Google stock model we identified in the last article and generate a 10-day forecast with confidence intervals.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
import seaborn as sns

sns.set(style="whitegrid")

# 1. Load Data and Prepare Returns
df = pd.read_csv('data2.csv')
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
df['Log_Ret'] = np.log(df['GOOGLE']).diff()
df = df.dropna()  # drop the first row created by differencing

# 2. Fit the Model (Using a simple ARIMA(1,0,1) for demonstration)
# Ideally, use the best order found in your Grid Search from Article 3
model = ARIMA(df['Log_Ret'], order=(1, 0, 1))
results = model.fit()

# 3. Generate Forecasts
# We forecast the next 10 days
forecast_steps = 10
forecast_obj = results.get_forecast(steps=forecast_steps)

# Extract forecast values and confidence intervals
forecast_values = forecast_obj.predicted_mean
conf_int = forecast_obj.conf_int(alpha=0.05)  # 95% Confidence

# 4. Visualization
plt.figure(figsize=(12, 6))

# Plot the last 30 days of actual data
plt.plot(df.index[-30:], df['Log_Ret'].tail(30), label='Historical Returns', color='blue')

# Create a future date range for plotting
last_date = df.index[-1]
future_dates = pd.date_range(start=last_date + pd.Timedelta(days=1), periods=forecast_steps, freq='B')

# Plot the Forecast
plt.plot(future_dates, forecast_values, label='Forecast Mean', color='red', linestyle='--')

# Plot the Confidence Interval (The Cone)
plt.fill_between(future_dates, 
                 conf_int.iloc[:, 0], 
                 conf_int.iloc[:, 1], 
                 color='pink', alpha=0.3, label='95% Confidence Interval')

plt.title('Google Stock Returns: 10-Day ARIMA Forecast', fontsize=14)
plt.legend(loc='upper left')
plt.show()

A graph displaying the 10-day ARIMA forecast for Google stock returns, featuring blue lines for historical returns, a dashed red line for forecast mean, and a shaded pink area representing the 95% confidence interval.

Note:

  • Notice how the red line (forecast) quickly flattens out to zero? That is Mean Reversion in action.
  • The real value of the model is in the pink band, which tells you the expected range of movement implied by the volatility.

Part 5: The Final Limitation – The Volatility Puzzle

As we conclude this article, we must address the elephant in the room. Look at your residuals or returns plot again. You will see Volatility Clustering:

  • Long periods of calm (small changes).
  • Sudden bursts of panic (large positive AND negative changes).

The Failure of ARIMA

ARIMA assumes that the variance (\(\sigma^2\)) is constant (Homoskedasticity). It assumes the “risk” today is the same as the “risk” in 2008. We know this is false. In finance, risk itself is a time series. It trends. It spikes. It clusters.

To model this, we need a model not for the *mean* (\(Y_t\)), but for the *variance* (\(\sigma_t^2\)). This leads us to the Nobel Prize-winning ARCH (Autoregressive Conditional Heteroskedasticity) and GARCH models. Understanding that ARIMA forecasts the direction while GARCH forecasts the risk is the key to becoming a complete Financial Econometrician.
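A quick way to see the failure is to simulate an ARCH(1) process: the returns themselves are uncorrelated (so ARIMA sees nothing left to model), yet the squared returns are strongly autocorrelated, which is volatility clustering in its purest form. A minimal sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulate an ARCH(1): sigma_t^2 = omega + alpha * r_{t-1}^2
n, omega, alpha = 10000, 0.1, 0.5
r = np.zeros(n)
for t in range(1, n):
    sigma2 = omega + alpha * r[t - 1] ** 2
    r[t] = np.sqrt(sigma2) * rng.standard_normal()

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = x - x.mean()
    return np.sum(x[1:] * x[:-1]) / np.sum(x**2)

print(f"ACF(1) of returns:         {lag1_autocorr(r):.3f}")     # near zero
print(f"ACF(1) of squared returns: {lag1_autocorr(r**2):.3f}")  # clearly positive
```

An ARIMA fitted to these returns would correctly report “no structure,” while completely missing the predictable dynamics in the risk. That gap is exactly what ARCH/GARCH fills.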

Conclusion

Congratulations! You have navigated the complex world of Time Series Analysis.

1.  You learned to see time as a dimension and Stationarity as the golden rule.

2.  You mastered the building blocks: White Noise, Random Walks, and MA processes.

3.  You combined them into the ARIMA framework using the Box-Jenkins methodology.

4.  You learned how to forecast, ensure stability, and recognize the limits of your tools.

You now possess the vocabulary and the coding skills to analyze financial data dynamically. The next step in your journey would be Volatility Modeling (GARCH) and Multivariate Time Series (VAR), where assets influence each other.

Keep practicing, keep coding, and remember: In finance, the past is the only guide we have to the future.

