Quantitative Finance Series: Predicting Stock Prices with Python and Monte Carlo Simulation

Welcome to the world of quantitative finance, a fascinating field where finance, mathematics, and programming converge. Once a domain reserved for those with access to expensive, proprietary software, it is now more accessible than ever, thanks in large part to a single, powerful tool: Python.

In this article, we’ll work through a practical application of Python in quantitative finance. We won’t just talk about theory; we will roll up our sleeves and build a Monte Carlo simulation from scratch to forecast stock prices. This is a common technique in finance, used for everything from pricing complex derivatives to assessing the risk of an entire investment portfolio.

This step-by-step guide is designed for aspiring quants, data scientists, and anyone curious about the practical intersection of coding and finance.

What is a Monte Carlo Simulation?

Monte Carlo simulation is a way to understand the impact of risk and uncertainty in a prediction model. Rather than giving a single, definitive answer, it runs a large number of random trials to estimate a range of possible outcomes and their probabilities.

In finance, we can’t know for certain how a stock will perform. Its path is influenced by countless unpredictable factors. But, by simulating thousands, or even millions, of potential future price paths for a stock, we can get a statistical sense of where it might end up.
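To make the idea concrete before any finance enters the picture, here is a minimal sketch that estimates a probability purely by repeated random trials. The dice game, the function name estimate_probability, and the seed are illustrative assumptions and are not part of the stock model built later.

```python
import numpy as np

def estimate_probability(num_trials, seed=0):
    """Estimate P(sum of two dice >= 10) by random trials (toy example)."""
    rng = np.random.default_rng(seed)
    # Each row is one trial: two independent dice rolls in 1..6
    rolls = rng.integers(1, 7, size=(num_trials, 2))
    totals = rolls.sum(axis=1)
    # The fraction of trials meeting the condition approximates the probability
    return float(np.mean(totals >= 10))

estimate = estimate_probability(100_000)
exact = 6 / 36  # six of the 36 equally likely outcomes sum to 10 or more
print(f"Monte Carlo estimate: {estimate:.4f}, exact value: {exact:.4f}")
```

The estimate approaches the exact value as the number of trials grows, which is the same principle the stock simulation relies on.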

The Model: Geometric Brownian Motion (GBM)

To simulate stock price movements, we need a mathematical model that can represent this random walk. The most common model used in quantitative finance for this purpose is Geometric Brownian Motion (GBM).

GBM describes a price path in which the percentage change over each time step is random, while the price itself always stays positive. It has two main components:

  1. Drift (μ): This represents the average rate of return of the stock over a period. Think of it as the general direction the stock price is heading, based on its historical performance.
  2. Volatility (σ): This represents the degree of variation in the stock’s returns. A high-volatility stock will have much wilder (less predictable) price swings than a low-volatility one.

The formula for predicting the stock price at the next time step is:

S_t = S_{t-1} * e^((μ - 0.5 * σ^2) * Δt + σ * W_t)

Where:

  • S_t is the stock price at time t.
  • S_{t-1} is the stock price at the previous time step, t-1.
  • e is Euler’s number, the base of the natural logarithm.
  • μ (mu) is the drift.
  • σ (sigma) is the volatility.
  • Δt (delta t) is the time step (e.g., one day).
  • W_t is the “random” part, a random variable drawn from a normal distribution with a mean of 0 and a standard deviation of sqrt(Δt).
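As a minimal sketch of how the formula maps to code, the function below advances a price by a single time step. The name gbm_step, the parameter values, and the seed are assumptions chosen for illustration only.

```python
import numpy as np

def gbm_step(prev_price, mu, sigma, dt, rng):
    """Advance the price by one time step using the GBM formula."""
    # W_t: normal draw with mean 0 and standard deviation sqrt(dt)
    w_t = rng.normal(0.0, np.sqrt(dt))
    return prev_price * np.exp((mu - 0.5 * sigma**2) * dt + sigma * w_t)

rng = np.random.default_rng(42)  # seed chosen arbitrarily for repeatability
next_price = gbm_step(prev_price=100.0, mu=0.0005, sigma=0.02, dt=1, rng=rng)
print(f"Next simulated price: {next_price:.2f}")
```

Because the random term sits inside the exponential, the simulated price can never become negative.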

Don’t worry if the math seems intimidating. The beauty of Python is that we can translate this formula into code quite easily.

Let’s Get Coding: Building Our Simulation

Now, let’s build our simulation step-by-step.

Prerequisites

First, you need to have Python installed on your system. You’ll also need a few key libraries. If you don’t have them, you can install them using pip:

pip install numpy pandas yfinance matplotlib

  • numpy: The fundamental package for numerical computation in Python.
  • pandas: An essential library for data manipulation and analysis, especially for time-series data.
  • yfinance: A handy tool to fetch up-to-date financial data from Yahoo Finance.
  • matplotlib: The most widely used library for creating plots and visualizations in Python.

Step 1: Fetching Historical Stock Data

We need historical data to calculate our drift and volatility. Let’s pick a well-known stock, like Apple Inc. (AAPL), and download its data from Yahoo Finance.

import pandas as pd
import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
from datetime import datetime

# Define the stock ticker and data source
ticker = 'AAPL'

# Define the start and end dates for our historical data
start_date = datetime(2015, 1, 1)
end_date = datetime(2024, 12, 31)

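# Fetch the data (auto_adjust=True puts split- and dividend-adjusted prices in 'Close')
stock_data = yf.download(ticker, start=start_date, end=end_date, auto_adjust=True)

# We only need the adjusted closing price; squeeze() turns a one-column DataFrame into a Series
adj_close = stock_data['Close'].squeeze()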

print("Last 5 days of historical data:")
print(adj_close.tail())

This script will download the daily adjusted closing prices for AAPL from the start of 2015 to the end of 2024.

Step 2: Calculating Drift and Volatility

Next, we use this historical data to estimate the two key parameters of our GBM model: drift and volatility.

# Calculate daily logarithmic returns
log_returns = np.log(1 + adj_close.pct_change())
log_returns.dropna(inplace=True)

# Calculate drift (average daily return)
mu = float(log_returns.mean())
print(f"Drift (μ): {mu:.6f}")

# Calculate volatility (standard deviation of daily returns)
sigma = float(log_returns.std())
print(f"Volatility (σ): {sigma:.6f}")

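Here, we first calculate the daily percentage change in the stock price, then take the natural logarithm of one plus that change to get log returns. Log returns are commonly used in financial modeling because they are time-additive (the log return over several days is the sum of the daily log returns) and because GBM assumes they are normally distributed.

The drift (μ) here is the average of these daily log returns, and the volatility (σ) is their standard deviation, which measures their dispersion around the average. Note that under GBM the average log return per step equals (μ - 0.5 * σ^2) * Δt, so the value estimated here already absorbs the -0.5 * σ^2 correction that appears in the formula.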
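The time-additivity of log returns, and the usual way of converting daily estimates to annual ones, can be checked with a few lines. This is a small sketch on made-up prices; the price values and the 252-trading-day convention used for annualizing are assumptions for illustration.

```python
import numpy as np

# Hypothetical closing prices for five consecutive days
prices = np.array([100.0, 102.0, 101.0, 105.0, 104.0])

# Daily log returns: ln(P_t / P_{t-1})
log_returns = np.diff(np.log(prices))

# Time-additivity: daily log returns sum to the log return of the whole period
total_log_return = np.log(prices[-1] / prices[0])
print(np.isclose(log_returns.sum(), total_log_return))

# Annualize daily estimates, assuming 252 trading days per year
trading_days = 252
annual_drift = log_returns.mean() * trading_days
# ddof=1 matches the default used by pandas' .std()
annual_volatility = log_returns.std(ddof=1) * np.sqrt(trading_days)
print(f"Annualized drift: {annual_drift:.4f}, annualized volatility: {annual_volatility:.4f}")
```

The mean scales with time and the standard deviation with the square root of time, which is why the two parameters are annualized differently.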

Step 3: Building the Monte Carlo Simulation Engine

This is the core of our project. We’ll create a function that takes our parameters and simulates future price paths.

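def monte_carlo_simulation(start_price, days, mu, sigma):
    """Simulate one price path with `days` points, starting at start_price."""
    dt = 1 # Time step is one day

    price_path = np.zeros(days)
    price_path[0] = start_price

    # Generate one random log return per step (days - 1 steps after day 0).
    # mu is the mean daily log return, so no extra -0.5 * sigma**2 term is needed.
    shocks = np.random.normal(loc=mu * dt, scale=sigma * np.sqrt(dt), size=days - 1)

    for t in range(1, days):
        # Apply the GBM update: multiply by the exponentiated log return
        price_path[t] = price_path[t-1] * np.exp(shocks[t-1])

    return price_path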

Let’s break down this function. It draws a series of daily “shocks” from a normal distribution with mean mu and standard deviation sigma, the values estimated from historical log returns. It then steps through each future day, multiplying the previous day’s price by the exponentiated shock. Each shock is therefore one simulated daily log return.

Note: The textbook form of the update is price_path[t-1] * np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.random.normal(0, np.sqrt(dt))), where mu is the drift of the price process itself. Because our mu was estimated as the mean of daily log returns, it already corresponds to the whole (mu - 0.5 * sigma**2) * dt term. The code therefore samples the same distribution, and subtracting 0.5 * sigma**2 a second time would understate the drift.
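The day-by-day loop can also be expressed with array operations. The sketch below is one possible vectorized variant, not the article's own function: the name simulate_paths, the seed argument, and the sample parameter values are assumptions. It treats mu and sigma as the mean and standard deviation of daily log returns, as estimated in Step 2.

```python
import numpy as np

def simulate_paths(start_price, days, mu, sigma, num_paths, seed=None):
    """Simulate GBM paths; mu and sigma are the mean and std of daily log returns."""
    rng = np.random.default_rng(seed)
    # One log-return shock per path per step; the first column is the start price
    shocks = rng.normal(loc=mu, scale=sigma, size=(num_paths, days - 1))
    # Cumulative sums of log returns give the log price relative to the start
    log_paths = np.cumsum(shocks, axis=1)
    paths = np.empty((num_paths, days))
    paths[:, 0] = start_price
    paths[:, 1:] = start_price * np.exp(log_paths)
    return paths

paths = simulate_paths(start_price=100.0, days=252, mu=0.0005, sigma=0.02,
                       num_paths=1000, seed=7)
print(paths.shape)
```

Passing a seed makes a run repeatable, which helps when comparing results across parameter changes.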

Step 4: Running the Simulation Multiple Times

A single simulation is just one possible future. The power of Monte Carlo comes from running many simulations to see the full range of possibilities.

# Simulation parameters
num_simulations = 1000
simulation_days = 252 # One trading year

# Get the last known stock price
last_price = float(adj_close.iloc[-1])

# Store all simulation results
all_simulations = np.zeros((num_simulations, simulation_days))

# Run the simulation
for i in range(num_simulations):
    all_simulations[i, :] = monte_carlo_simulation(last_price, simulation_days, mu, sigma)

Here, we set up our simulation to run 1000 times, each for a period of 252 days (the approximate number of trading days in a year). We create a NumPy array to hold the results of every single simulation path.
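How many simulations are enough? One rough guide is the standard error of the mean final price, which shrinks in proportion to one over the square root of the number of simulations. The sketch below illustrates this with hypothetical parameters; the function name final_price_standard_error and all default values are assumptions, not estimates from the AAPL data.

```python
import numpy as np

def final_price_standard_error(num_simulations, days=252, mu=0.0005,
                               sigma=0.02, start_price=100.0, seed=0):
    """Standard error of the mean final price for a given number of simulations."""
    rng = np.random.default_rng(seed)
    # The final price depends only on the sum of the daily log-return shocks
    total_log_return = rng.normal(mu, sigma, size=(num_simulations, days - 1)).sum(axis=1)
    final_prices = start_price * np.exp(total_log_return)
    return final_prices.std(ddof=1) / np.sqrt(num_simulations)

for n in (100, 1_000, 10_000):
    print(f"{n:>6} simulations: standard error = {final_price_standard_error(n):.3f}")
```

Going from 100 to 10,000 simulations cuts the standard error by roughly a factor of ten, so extra precision becomes progressively more expensive.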

Step 5: Visualizing and Analyzing the Results

Now for the exciting part: seeing what our simulation produced. A picture is worth a thousand words, and in quantitative finance, it’s worth a thousand simulations.

This code will generate a plot showing all 1000 simulated price paths. You’ll see a fan of possible futures, giving you an intuitive feel for the potential upside and downside risk.

# Plot the simulation results
plt.figure(figsize=(12, 7))
plt.plot(all_simulations.T) # Transpose for plotting
plt.title(f'{ticker} Monte Carlo Simulation ({num_simulations} Trials)')
plt.xlabel(f'Trading Days from Today (Starting Price: ${last_price:.2f})')
plt.ylabel('Predicted Stock Price ($)')
plt.grid(True, linestyle='--', alpha=0.6)
plt.show()

Output: [Figure: plot of the 1000 simulated price paths]

While the “spaghetti plot” is visually impressive, a histogram of the final prices is often more useful for analysis. It shows the probability distribution of the stock’s price at the end of our simulation period.

# Get the final prices from each simulation
final_prices = all_simulations[:, -1]

# Plot the distribution of final prices
plt.figure(figsize=(10, 6))
plt.hist(final_prices, bins=50, edgecolor='black')
plt.title(f'Distribution of {ticker} Final Price after {simulation_days} Days')
plt.xlabel('Final Stock Price ($)')
plt.ylabel('Frequency')
plt.axvline(final_prices.mean(), color='r', linestyle='dashed', linewidth=2, label=f'Expected Price: ${final_prices.mean():.2f}')
plt.axvline(last_price, color='g', linestyle='dashed', linewidth=2, label=f'Current Price: ${last_price:.2f}')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)
plt.show()

# Print some key statistics
print("Simulation Analysis:")
print(f"Expected Stock Price after {simulation_days} days: ${final_prices.mean():.2f}")
print(f"Median Stock Price: ${np.median(final_prices):.2f}")
print(f"5% Quantile (Value at Risk): ${np.percentile(final_prices, 5):.2f}")
print(f"95% Quantile: ${np.percentile(final_prices, 95):.2f}")

Output:

Simulation Analysis:
Expected Stock Price after 252 days: $327.66
Median Stock Price: $315.59
5% Quantile (Value at Risk): $203.53
95% Quantile: $493.95

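This histogram tells us a story. The bulk of the distribution shows where outcomes cluster. The “Expected Price” (the mean of all final prices) is the model’s average outcome, but because the distribution is right-skewed, a few very high paths pull the mean above the median ($327.66 versus $315.59 in the run above), so the median is the better guide to a typical outcome.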
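The gap between the mean and the median of the final prices is a property of the model rather than of a particular run. When daily log returns are normal, the final price is lognormal, and both statistics have closed forms. The sketch below computes them; the function name and the parameter values are hypothetical and are not the AAPL estimates.

```python
import numpy as np

def lognormal_mean_and_median(start_price, steps, mu, sigma):
    """Closed-form mean and median of the final price for N(mu, sigma^2) daily log returns."""
    median = start_price * np.exp(mu * steps)
    mean = start_price * np.exp((mu + 0.5 * sigma**2) * steps)
    return mean, median

# Hypothetical parameters; a 252-point path has 251 steps after the starting price
mean, median = lognormal_mean_and_median(start_price=100.0, steps=251, mu=0.0005, sigma=0.02)
print(f"Theoretical mean: {mean:.2f}, theoretical median: {median:.2f}")
```

Comparing these values with the simulated mean and median is a quick sanity check on the simulation code.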

More importantly, the spread of the distribution quantifies risk. We can calculate quantiles to make probabilistic statements. For example:

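  • Value at Risk (VaR): The 5% quantile tells us that, under the model, there is a 5% chance the stock price will finish below this value. Strictly speaking, VaR is quoted as a loss, so the 95% VaR per share is the current price minus this quantile.
  • Prediction Interval: The range between the 5% and 95% quantiles contains 90% of the simulated final prices, giving a 90% interval for the final stock price under the model’s assumptions.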
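These quantile statements can be wrapped in a small helper. The sketch below is one way to do it: the function name summarize_risk and the synthetic final prices are assumptions, and the VaR is expressed as a loss relative to the current price, which is one common convention.

```python
import numpy as np

def summarize_risk(final_prices, current_price, tail=5):
    """Return the lower/upper quantiles and the loss-based VaR per share."""
    lower = np.percentile(final_prices, tail)
    upper = np.percentile(final_prices, 100 - tail)
    # VaR expressed as a loss: how far the lower quantile sits below today's price
    value_at_risk = max(current_price - lower, 0.0)
    return lower, upper, value_at_risk

# Hypothetical final prices standing in for all_simulations[:, -1]
rng = np.random.default_rng(1)
final_prices = 100.0 * np.exp(rng.normal(0.1, 0.3, size=10_000))
lower, upper, var_95 = summarize_risk(final_prices, current_price=100.0)
print(f"90% interval: ${lower:.2f} to ${upper:.2f}, 95% VaR per share: ${var_95:.2f}")
```

With the real simulation, final_prices would be all_simulations[:, -1] and current_price would be last_price.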

Conclusion and Next Steps

Congratulations! You’ve just built a functional Monte Carlo simulation in Python to forecast stock prices. You’ve fetched real financial data, calculated key statistical parameters, built a simulation engine based on a core financial model, and visualized the results to make a probabilistic forecast.

This is more than just an academic exercise. This exact methodology is the foundation for many real-world applications:

  • Option Pricing: Monte Carlo methods are essential for pricing complex “exotic” options, which lack a simple closed-form solution of the kind the Black-Scholes model provides for European options.
  • Risk Management: Banks and hedge funds use these simulations to calculate VaR and other risk metrics for their massive portfolios.
  • Portfolio Optimization: You can extend this simulation to an entire portfolio of assets to understand how they might perform together.

Of course, this model has its limitations. Geometric Brownian Motion assumes that log returns are normally distributed and that drift and volatility are constant, neither of which is strictly true in the real world. Real returns have fatter tails, and volatility tends to cluster over time. More advanced models can account for these complexities: GARCH for time-varying volatility and jump-diffusion models for sudden large moves.
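One quick way to probe the normality assumption is to measure excess kurtosis, which is near zero for normal data and positive for fat-tailed data. The sketch below uses synthetic samples rather than real returns; the function name, the Student-t stand-in for fat tails, and the sample sizes are assumptions for illustration.

```python
import numpy as np

def excess_kurtosis(returns):
    """Sample excess kurtosis; close to 0 for normally distributed data."""
    returns = np.asarray(returns, dtype=float)
    centered = returns - returns.mean()
    return float(np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0)

rng = np.random.default_rng(3)
# What GBM assumes: normally distributed log returns
normal_sample = rng.normal(0.0, 0.02, size=100_000)
# A heavy-tailed stand-in: scaled Student-t draws with 5 degrees of freedom
heavy_tailed_sample = 0.02 * rng.standard_t(df=5, size=100_000)

print(f"Normal sample:       {excess_kurtosis(normal_sample):.2f}")
print(f"Heavy-tailed sample: {excess_kurtosis(heavy_tailed_sample):.2f}")
```

Applying excess_kurtosis to the log_returns series from Step 2 would show how far the historical data departs from the model's assumption.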

But by building this tool, you’ve taken a significant step into the world of quantitative finance. You’ve seen how Python can be used to translate financial theory into a practical, data-driven application. The journey of a quant is one of continuous learning, and this is a fantastic place to start.

From here, you could try:

  • Running the simulation on different stocks or for different time horizons.
  • Comparing the simulated distribution to the actual price movement over the next year.
  • Using the simulation to price a simple European call option.
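As a starting point for the last idea, here is a hedged sketch of pricing a European call by Monte Carlo. It makes assumptions that go beyond the article: pricing uses the risk-free rate as the drift (the risk-neutral convention) instead of the historical drift, all inputs are annualized, and the contract parameters are hypothetical. The Black-Scholes formula is included only as a cross-check.

```python
import math
import numpy as np

def mc_european_call(spot, strike, rate, sigma, maturity, num_paths=200_000, seed=0):
    """Monte Carlo price of a European call under risk-neutral GBM (annualized inputs)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(num_paths)
    # Terminal price: under the pricing measure the drift is the risk-free rate
    terminal = spot * np.exp((rate - 0.5 * sigma**2) * maturity
                             + sigma * math.sqrt(maturity) * z)
    payoff = np.maximum(terminal - strike, 0.0)
    # Discount the average payoff back to today
    return math.exp(-rate * maturity) * payoff.mean()

def black_scholes_call(spot, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes call price, used here only as a cross-check."""
    vol_sqrt_t = sigma * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma**2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return spot * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)

# Hypothetical contract: at-the-money, one year, 5% rate, 20% annual volatility
mc_price = mc_european_call(100.0, 100.0, 0.05, 0.20, 1.0)
bs_price = black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0)
print(f"Monte Carlo: {mc_price:.3f}, Black-Scholes: {bs_price:.3f}")
```

A European payoff depends only on the terminal price, so a single draw per path is enough; path-dependent options would need the full day-by-day simulation.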

The power is now at your fingertips. Happy learning!

