Financial Econometrics: Part 5
Welcome to a new installment in our series on advanced regression! If you’ve ever worked with data, you’ve almost certainly used Ordinary Least Squares (OLS) regression. It’s the undisputed workhorse of econometrics and data science, and the first tool we all learn. We love OLS because it’s simple, interpretable, and, under the right conditions, remarkably powerful.
The magic of OLS is captured in the Gauss-Markov Theorem. This theorem promises that if a set of “commandments”—known as the OLS assumptions—are respected, the OLS estimator is BLUE: the Best Linear Unbiased Estimator.
- Unbiased means that, on average, our estimates of the coefficients will be correct.
- Best means it has the minimum variance among all linear unbiased estimators. In simple terms, our estimates will be the most precise and reliable.
But here’s the catch: the real world is messy. Financial data, in particular, loves to break the rules. When the OLS assumptions crumble, the “BLUE” promise is broken. Our OLS model might still be unbiased, but it’s no longer the “best.” Its standard errors become unreliable, leading us to draw false conclusions—a disastrous outcome in a field where decisions are worth millions.
In this article, we’ll tackle one of the most frequently violated assumptions: homoscedasticity. We’ll learn what it is, why its absence (a condition called heteroscedasticity) is a problem, how to detect it, and, most importantly, how to fix it using a powerful alternative: Weighted Least Squares (WLS).
The OLS “Rulebook”: What Are the Assumptions?
Before we can talk about breaking the rules, we need to know what they are. While different textbooks list them slightly differently, the core assumptions for OLS to be BLUE are:
- Linearity: The model is linear in its parameters. \(Y = \beta_0 + \beta_1 X_1 + … + e\).
- Strict Exogeneity: The error term \(e\) has a conditional mean of zero. This is a strong way of saying the independent variables (X) aren’t correlated with the error term.
- No Perfect Multicollinearity: None of the independent variables is a perfect linear combination of the others. (You can’t have “height in inches” and “height in centimeters” in the same model).
- Spherical Errors: This is the one we’re interested in today. It’s a two-part assumption:
- Homoscedasticity: The variance of the error term (e) is constant for all observations. \(Var(e_i | X) = \sigma^2\).
- No Autocorrelation: The error terms are uncorrelated with each other. \(Cov(e_i, e_j | X) = 0\) for \(i \neq j \).
Today, we’re putting homoscedasticity under the microscope.
Homoscedasticity vs. Heteroscedasticity: “Same Scatter” vs. “Different Scatter”
Let’s break down the jargon.
- Homo = Same
- Hetero = Different
- Scedasticity = Scatter (or Variance)
Homoscedasticity (“Same Scatter”): This is the ideal state. It means that the variability of the error term (the “scatter” of the data points around the regression line) is the same, regardless of the value of the independent variable.
Imagine you’re modeling income based on years of education. Homoscedasticity would imply that the spread (variance) of incomes for people with 10 years of education is the same as the spread of incomes for people with 20 years of education.
Heteroscedasticity (“Different Scatter”): This is the problematic, real-world scenario. It means the variance of the error term changes as the independent variable changes.
In our income-education model, a much more realistic scenario is heteroscedasticity. The variance of income for people with 20 years of education (e.g., doctors, lawyers, engineers) is likely much larger than the variance of income for people with 10 years of education. The data “fans out” as education increases.
This is incredibly common in financial data. The variance of a stock’s return is often not constant; it might be low during stable periods and extremely high during a financial crisis. The variance of a currency’s exchange rate (like the DXY index we’ll see later) might be larger when other economic indicators (like oil prices) are also volatile.
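To make the “fan-out” concrete, here is a small simulation of the income-education example. The numbers and variable names are invented for illustration; the only point is that the error’s standard deviation is built to grow with education, exactly the heteroscedastic pattern described above:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
education = rng.uniform(8, 22, n)      # years of education
sigma_i = 0.5 * education              # error std dev grows with education
income = 10 + 3 * education + rng.normal(0, sigma_i)

# Spread of the errors around the true line, split by education level
errors = income - (10 + 3 * education)
low = errors[education < 15]
high = errors[education >= 15]
print(f"Error std, education < 15:  {low.std():.2f}")
print(f"Error std, education >= 15: {high.std():.2f}")
```

The high-education half shows a clearly larger error spread, which is exactly the widening scatter a residual plot would reveal.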
So What? Why Is Heteroscedasticity a Problem?
When heteroscedasticity is present, the Gauss-Markov theorem is violated. Here’s what happens:
- Coefficient estimates (\(\beta\)) are still unbiased. This is a small comfort. On average, your \(\beta_1\) is still correct.
- The OLS estimator is no longer “Best” (efficient). Another, better estimator exists (WLS!).
- The Standard Errors are biased and incorrect. This is the critical problem.
The standard errors are the “measure of uncertainty” for our coefficients. They are the denominator in the t-statistic \(t = \hat{\beta} / SE_{\hat{\beta}}\), which we use to calculate p-values.
If the standard errors are wrong, our t-statistics and p-values are meaningless.
- We might get a tiny p-value and conclude a variable is “highly significant” when, in reality, its true standard error is large and the variable is not significant at all (a Type I error).
- We might get a large p-value and conclude a variable is “not significant” when its true standard error is small and the variable is, in fact, crucial (a Type II error).
In finance, making a decision based on a faulty p-value is a recipe for disaster. We must know if our standard errors are reliable.
Detecting the Enemy: The Breusch-Pagan Test
How do we know if we have heteroscedasticity? We can (and should) visually inspect plots of the residuals, but a formal statistical test is better. The most common is the Breusch-Pagan Test.
Here’s the intuitive, step-by-step logic:
- Run your OLS regression: Run \(Y = \beta_0 + \beta_1 X_1 + … + e\) and get the residuals, \(e_i\), for each observation.
- Square the residuals: Calculate \(e_i^2\). This \(e_i^2\) is our best proxy for the variance at each data point.
- Ask: “Is the variance related to the X’s?” We can test this by running an auxiliary regression. We try to model the squared residuals (our proxy for variance) using our original independent variables.
- Auxiliary Model: \(e_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + … + v_i\)
- Test the auxiliary model: If our original X variables have no power to explain the variance (the \(e_i^2\)), then all the \(\alpha\) coefficients (except the constant) should be zero. This would mean the variance is just a constant (\(\alpha_0\)), which is exactly the definition of homoscedasticity.
- Null Hypothesis (\(H_0\)): Homoscedasticity is present. (All \(\alpha_1, \alpha_2, …\) are zero).
- Alternative Hypothesis (\(H_A\)): Heteroscedasticity is present. (At least one \(\alpha\) is not zero).
- Get the test statistic: The test calculates an LM statistic (\(n \times R^2\) from the auxiliary regression), which follows a chi-squared distribution under the null hypothesis; an equivalent F-statistic version follows an F distribution.
- Find the p-value: We get a p-value from this test.
- High p-value (> 0.05): We fail to reject the null hypothesis. We conclude that homoscedasticity is present. (Phew!)
- Low p-value (< 0.05): We reject the null hypothesis. We conclude that heteroscedasticity is present. (Houston, we have a problem.)
The Solution: Weighted Least Squares (WLS)
If we detect heteroscedasticity, we can’t trust our OLS results. We need a new tool. Enter Weighted Least Squares (WLS).
The logic is beautifully simple. OLS is problematic because it gives equal weight to every observation. But if our data has heteroscedasticity, some observations are “noisier” or less reliable than others (i.e., they come from a distribution with a higher variance).
WLS’s brilliant idea: What if we give less weight to the noisy, high-variance observations and more weight to the stable, low-variance observations?
- OLS minimizes: \(\sum (Y_i - \hat{Y}_i)^2\)
- WLS minimizes: \(\sum w_i (Y_i - \hat{Y}_i)^2\)
The \(w_i\) is the “weight” for the i-th observation. OLS is just a special case of WLS where \(w_i = 1\) for all i.
The key, then, is finding the right weights. The theory shows that the optimal weight \(w_i\) is the inverse of the variance of the error term for that observation.
\(w_i = \frac{1}{\sigma_i^2}\)
This makes perfect sense: if an observation has high variance (e.g., \(\sigma_i^2 = 10\)), it gets a low weight (\(w_i = 0.1\)). If it has low variance (e.g., \(\sigma_i^2 = 0.5\)), it gets a high weight (\(w_i = 2\)).
A Practical Recipe for WLS
Okay, but we don’t know the true variance \(\sigma_i^2\) for each observation. We have to estimate it! This leads to a common procedure for WLS:
- Run OLS (Model 1): Run \(Y = \beta_0 + \beta_1 X_1 + … + e\).
- Get residuals: \(\hat{e}_i = Y_i - \hat{Y}_i\).
- Model the variance: We need a good estimate of \(\sigma_i^2\). The Breusch-Pagan test already gave us a hint: we can model the variance using our X variables. A common and flexible method is to model the log of the squared residuals (which helps ensure positive variance estimates).
- Run Auxiliary Model: \(\log(\hat{e}_i^2) = \alpha_0 + \alpha_1 X_{1i} + … + v_i\)
- Get fitted values: From this auxiliary model, get the fitted values, let’s call them \(\hat{g}_i\).
- \(\hat{g}_i = \hat{\alpha}_0 + \hat{\alpha}_1 X_{1i} + …\)
- Calculate the weights: Since \(\hat{g}_i\) is our estimate for \(\log(\sigma_i^2)\), our estimate for the variance itself is \(h_i = \exp(\hat{g}_i)\). The weights are the inverse of this.
- \(w_i = 1 / h_i = 1 / \exp(\hat{g}_i)\)
- Run WLS (Model 2): Now, run the final regression using these weights. Most statistical software (including Python’s statsmodels) has a specific WLS function that lets you pass in the Y vector, the X matrix, and your calculated weights vector.
The resulting WLS model will have coefficients that are still unbiased, but are now also efficient (they are the “Best” linear unbiased estimators again!). Most importantly, the standard errors from this model are correct and reliable.
Practical Application: Modeling the U.S. Dollar Index (DXY)
Let’s make this real. We’ll use the data from M2_data.csv to build a model for the U.S. Dollar Index (DXY). We’ll try to explain its daily changes using the changes in:
- METALS (a metals commodity index)
- OIL
- US_STK (U.S. stocks)
- INTL_STK (International stocks)
- X10Y_TBY (10-year Treasury yields)
- EURUSD (Euro/USD exchange rate)
The Python Code Walkthrough
Here is the complete Python code to perform this analysis, from start to finish. I’ve added extensive comments to explain each step, just as a trainer would.
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.stats.api as sms
# 1. LOAD AND PREPARE THE DATA
print("--- 1. Loading Data ---")
# Load the dataset
data = pd.read_csv('M2_data.csv')
# Convert 'Date' to datetime objects
data['Date'] = pd.to_datetime(data['Date'])
# Set the Date as the index
data = data.set_index('Date')
# Define our dependent (Y) and independent (X) variables
# Y = DXY (US Dollar Index)
# X = Everything else (except YEAR)
dependent_var = 'DXY'
independent_vars = ['METALS', 'OIL', 'US_STK', 'INTL_STK', 'X10Y_TBY', 'EURUSD']
# Drop any rows with missing data
data_clean = data[[dependent_var] + independent_vars].dropna()
# Prepare Y and X for statsmodels
Y = data_clean[dependent_var]
X = data_clean[independent_vars]
# Add a constant (intercept) to the X matrix
X = sm.add_constant(X)
print(f"Data loaded. Modeling {dependent_var} with {len(independent_vars)} predictors.")
print(f"Total observations: {len(Y)}")
print("-" * 30 + "\n")
# 2. RUN INITIAL OLS REGRESSION (MODEL 1)
print("--- 2. Running Initial OLS Regression (Model 1) ---")
ols_model = sm.OLS(Y, X)
ols_results = ols_model.fit()
# Print the OLS summary
print(ols_results.summary())
print("-" * 30 + "\n")
# 3. TEST FOR HETEROSCEDASTICITY (BREUSCH-PAGAN TEST)
print("--- 3. Testing for Heteroscedasticity (Breusch-Pagan) ---")
# Get the residuals from our OLS model
ols_residuals = ols_results.resid
# Run the Breusch-Pagan test
# The 'het_breuschpagan' function takes the residuals and the X matrix
# It returns:
# 1. LM statistic
# 2. p-value for the LM statistic
# 3. F-statistic
# 4. p-value for the F-statistic
bp_test = sms.het_breuschpagan(ols_residuals, X)
labels = ['LM Statistic', 'LM Test p-value', 'F-Statistic', 'F-Test p-value']
# Print the results as a clean dictionary
bp_results = dict(zip(labels, bp_test))
print("Breusch-Pagan Test Results:")
print(bp_results)
# Interpret the results
if bp_results['LM Test p-value'] < 0.05:
    print("\nInterpretation: The p-value is less than 0.05.")
    print("We REJECT the null hypothesis of homoscedasticity.")
    print("Conclusion: HETEROSCEDASTICITY IS PRESENT.")
    print("The OLS standard errors are unreliable. We must use WLS.")
else:
    print("\nInterpretation: The p-value is greater than 0.05.")
    print("We FAIL to reject the null hypothesis of homoscedasticity.")
    print("Conclusion: No significant heteroscedasticity detected.")
print("-" * 30 + "\n")
# 4. PERFORM WEIGHTED LEAST SQUARES (WLS) (MODEL 2)
# We only do this if heteroscedasticity was found, but we'll run it
# for demonstration purposes regardless.
print("--- 4. Running Weighted Least Squares (WLS) Regression ---")
# Step 4a: Get squared residuals from OLS
ols_resid_sq = ols_residuals**2
# Step 4b: Run auxiliary regression to model the variance
# We model the log of the squared residuals to ensure positive variance estimates
# Note: We add a small constant (1e-8) to avoid log(0)
aux_Y = np.log(ols_resid_sq + 1e-8)
aux_X = X # Use the same X matrix
aux_model = sm.OLS(aux_Y, aux_X)
aux_results = aux_model.fit()
# Step 4c: Get the fitted values from the auxiliary regression
# These are our estimates for log(variance)
log_variance_hat = aux_results.fittedvalues
# Step 4d: Calculate the weights
# The variance estimate is exp(log_variance_hat)
# The weight is 1 / variance
weights = 1.0 / np.exp(log_variance_hat)
# Step 4e: Run the final WLS regression using these weights
wls_model = sm.WLS(Y, X, weights=weights)
wls_results = wls_model.fit()
print("WLS Model (Model 2) complete. See summary below.")
print("-" * 30 + "\n")
# 5. COMPARE OLS AND WLS RESULTS
print("--- 5. Final Model Comparison ---")
print("\n*** OLS MODEL (MODEL 1) SUMMARY ***")
print("(Unreliable Standard Errors)")
print(ols_results.summary())
print("\n\n*** WLS MODEL (MODEL 2) SUMMARY ***")
print("(Reliable Standard Errors)")
print(wls_results.summary())
print("\n\n--- End of Analysis ---")
Interpreting the Results: OLS vs. WLS
When you run the code, the first thing to note is the Breusch-Pagan test result. The output shows an LM Test p-value of 0.0445 (and an F-Test p-value of 0.0432). Since this is less than our 0.05 significance level, it confirms our fears: the data is heteroscedastic. The OLS results we first looked at are built on a foundation of sand, and their standard errors are unreliable.
Now, let’s compare the OLS and WLS summaries side-by-side, which is where the real story is.
- Coefficients (\(\beta\)): Look at the coef column for both models. As expected, the coefficients themselves are very similar (e.g., METALS changed from -0.0538 to -0.0532; INTL_STK changed from -0.3337 to -0.2846). Both OLS and WLS are unbiased, so they arrive at similar point estimates.
- Standard Errors (std err): This is the key. Compare the std err column. They are different! For example, the standard error for METALS decreased (from 0.009 to 0.008), while the standard error for INTL_STK increased (from 0.045 to 0.046). The WLS standard errors are the correct, reliable estimates of uncertainty. The OLS ones were wrong.
- t-statistics and p-values (P>|t|): This is the “so what?” moment. Because the standard errors changed, the p-values have changed, leading to different conclusions!
- Look at X10Y_TBY (10-year Treasury):
- In the OLS model, its p-value was 0.084. At a 5% significance level, we would conclude this variable is not statistically significant and might drop it from our model.
- In the WLS model, its p-value is 0.014! By correcting for heteroscedasticity, WLS reveals that the 10-year Treasury is a statistically significant predictor of the DXY’s movement. The OLS model committed a Type II error (a false negative), incorrectly dismissing a significant variable.
- Look at EURUSD:
- In the OLS model, its p-value was 0.127 (highly insignificant).
- In the WLS model, its p-value is 0.059. While still just outside the 5% level, it is now significant at the 10% level and much closer to traditional significance. This tells us OLS was overstating the uncertainty (and understating the t-statistic) for this variable as well.
- Look at OIL:
- In the OLS model, its p-value was 0.026 (significant).
- In the WLS model, its p-value is 0.033. It remains significant, but the OLS model was slightly overstating its significance.
The WLS p-values are the ones you should trust and use for making decisions. In this case, WLS saved us from incorrectly throwing away a valuable predictor (X10Y_TBY).
A final, critical warning: Do not compare the R-squared or Adj. R-squared between the OLS and WLS models. They are not comparable. The R-squared in WLS is a “weighted” R-squared, and it doesn’t have the same “percent of variance explained” interpretation. The goal of WLS is not to get a higher R-squared; it’s to get reliable coefficients and standard errors.
Conclusion
We’ve taken a massive step beyond “just run OLS.” We’ve learned that regression models are built on a foundation of assumptions, and when those assumptions are violated, our results can be misleading.
We’ve demystified heteroscedasticity (unequal variance), learned how to detect it with the Breusch-Pagan Test, and mastered a powerful solution: Weighted Least Squares (WLS). By giving more weight to reliable data points and less weight to noisy ones, WLS restores our ability to get efficient estimates and, most importantly, reliable standard errors.
But this is just the first challenge. Heteroscedasticity isn’t the only monster lurking in our data. What happens when our data isn’t just “noisy” but contains “wild” outliers that can single-handedly pull our regression line off-course?
For that, WLS isn’t enough. We’ll need a different, more “robust” tool. And that will be the topic of our next article. Stay tuned!

