Welcome to the final article in our foundational series, Python for Quants. We’ve come a long way: we set up our environment, learned to think about code like a quant, and mastered the data handling power of Pandas.
Today, we bring it all together. Our goal is to perform Exploratory Data Analysis (EDA) on our financial data. This is the process where we move from just holding data to understanding it. We’ll calculate key financial metrics, analyze relationships between assets, and create the insightful visualizations that are the hallmark of a professional quant.
From Prices to Returns: The First Step of Analysis
Absolute price is interesting, but for most financial analysis, we care about returns. Returns normalize the data and are the primary input for risk and strategy models. The most common type is the daily percentage change. Pandas makes this incredibly easy.
Let’s fetch data for a few tech stocks and the SPY ETF (which tracks the S&P 500) to analyze their relationships.
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
import seaborn as sns
# Define tickers and date range
tickers = ['AAPL', 'MSFT', 'GOOG', 'SPY']
start_date = '2022-01-01'
end_date = '2023-12-31'
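# Fetch close prices adjusted for splits and dividends
# (auto_adjust=True folds the adjustment into the 'Close' column)
adj_close_df = yf.download(tickers, start=start_date, end=end_date, auto_adjust=True)['Close']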
# Calculate daily percentage returns using the built-in pct_change() method
returns_df = adj_close_df.pct_change().dropna()
print("Adjusted Close Prices (last 5 days):")
print(adj_close_df.tail())
print("\nDaily Returns (last 5 days):")
print(returns_df.tail())
Step-by-Step Explanation:
- We select the ‘Close’ column for our list of tickers. Recent yfinance releases adjust this column for splits and dividends by default (auto_adjust=True), while older releases keep the adjusted series in a separate ‘Adj Close’ column. The result is a DataFrame with one column per ticker.
- .pct_change() calculates the percentage change from the previous row for each column. This one command saves us from writing a manual loop.
- .dropna() removes the first row of the returns DataFrame, which will be NaN since there’s no prior day to calculate a return from.
Expected Output:
[*********************100%***********************] 4 of 4 completed
Adjusted Close Prices (last 5 days):
AAPL GOOG MSFT SPY
Date
2023-12-22 193.600006 142.720001 374.579987 473.649994
2023-12-26 193.050003 142.820007 374.660004 475.640015
2023-12-27 193.149994 141.490005 374.070007 476.510010
2023-12-28 193.580002 141.279999 375.279999 476.679993
2023-12-29 192.529999 140.929993 376.040009 475.309998
Daily Returns (last 5 days):
AAPL GOOG MSFT SPY
Date
2023-12-22 0.005505 0.007409 0.002248 0.001938
2023-12-26 -0.002841 0.000701 0.000214 0.004199
2023-12-27 0.000518 -0.009312 -0.001575 0.001827
2023-12-28 0.002226 -0.001484 0.003235 0.000357
2023-12-29 -0.005424 -0.002477 0.002025 -0.002874
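To see what pct_change() is doing under the hood, here is a minimal sketch on made-up numbers. The tickers AAA and BBB and their prices are hypothetical, chosen only so the arithmetic is easy to check by hand. It compares the built-in method with the textbook definition of a simple return, r_t = P_t / P_{t-1} - 1.

```python
import pandas as pd

# Hypothetical toy prices (not real market data) for two made-up tickers
prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 99.96], "BBB": [50.0, 49.0, 49.98]},
    index=pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
)

# Manual definition of a simple return: r_t = P_t / P_{t-1} - 1
manual_returns = (prices / prices.shift(1) - 1).dropna()

# The built-in shortcut used in the article
builtin_returns = prices.pct_change().dropna()

# AAA moves +2% then -2%; BBB moves -2% then +2%
print(builtin_returns.round(4))

# Largest absolute difference between the two approaches (floating-point noise at most)
print((manual_returns - builtin_returns).abs().max().max())
```

Both approaches drop the first row for the same reason: there is no prior price to divide by.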
Visualizing Relationships: The Correlation Heatmap
How do these assets move in relation to one another? Correlation is the statistical measure for this, and it’s the foundation of modern portfolio theory. A heatmap is the most intuitive way to visualize a correlation matrix.
We’ll use Seaborn, a high-level plotting library built on Matplotlib, which makes statistical plots like this a breeze.
# Calculate the correlation matrix
correlation_matrix = returns_df.corr()
# Plot the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Matrix of Daily Returns', fontsize=16)
plt.show()
Step-by-Step Explanation:
- returns_df.corr() calculates the pairwise correlation of columns, returning a new DataFrame (the correlation matrix).
- sns.heatmap(...) is the Seaborn function to create the plot.
- annot=True displays the numerical correlation values on the map.
- cmap='coolwarm' sets the color scheme, where hot colors (red) mean high positive correlation and cool colors (blue) mean low or negative correlation.
Expected Output:

[Image: The correlation heatmap showing high correlation between the tech stocks and between each stock and SPY.]
From the heatmap, we can instantly see that the tech stocks are all highly correlated with each other (values > 0.6) and with the broader market (SPY). This tells a quant that holding only these three stocks doesn’t provide much diversification.
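Each number on the heatmap is a Pearson correlation coefficient. As a rough sketch of what .corr() computes for one pair of columns, the example below uses randomly generated returns, not market data. The column names MKT and STOCK, the 0.8 sensitivity and the noise levels are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: a "market" series and a stock that partly follows it
rng = np.random.default_rng(seed=0)
market = rng.normal(0.0, 0.01, size=250)
returns = pd.DataFrame({
    "MKT": market,
    "STOCK": 0.8 * market + rng.normal(0.0, 0.005, size=250),
})

x, y = returns["MKT"], returns["STOCK"]

# Pearson correlation = sample covariance / (std of x * std of y)
cov_xy = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
manual_corr = cov_xy / (x.std() * y.std())

# The same number taken from the pandas correlation matrix
corr_matrix = returns.corr()
pandas_corr = corr_matrix.loc["MKT", "STOCK"]

print(f"manual: {manual_corr:.6f}  pandas: {pandas_corr:.6f}")
```

The matrix is symmetric with ones on the diagonal, which is why the heatmap mirrors itself across the diagonal.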
Visualizing Trends: Rolling Statistics
To smooth out short-term price volatility and identify underlying trends, quants use rolling statistics. The most common is the Simple Moving Average (SMA). Let’s plot the 50-day and 200-day SMAs for Microsoft, a classic technical indicator setup.
# Isolate MSFT's adjusted close price
msft_price = adj_close_df['MSFT']
# Calculate 50-day and 200-day SMAs
sma_50 = msft_price.rolling(window=50).mean()
sma_200 = msft_price.rolling(window=200).mean()
# Plot the price and the moving averages
plt.figure(figsize=(14, 7))
plt.plot(msft_price, label='MSFT Adj Close', color='skyblue', alpha=0.8)
plt.plot(sma_50, label='50-Day SMA', color='orange', linestyle='--')
plt.plot(sma_200, label='200-Day SMA', color='red', linestyle='--')
plt.title('MSFT Price and Moving Averages', fontsize=16)
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.legend()
plt.show()
Step-by-Step Explanation:
- .rolling(window=50) creates a rolling window object. It doesn’t calculate anything on its own; it just defines the window size (50 rows, i.e. 50 trading days, in this case).
- .mean() is the aggregation function we apply to the window. For each day, it averages the 50 most recent prices, that day’s included. The first 49 days have no full window yet, so they come out as NaN.
- We then plot the original price along with both SMAs.
Expected Output:

[Image: A line chart of MSFT’s stock price with the 50-day and 200-day moving average lines overlaid.]
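A rolling mean only produces a value once its window is full, which is why the 200-day line on a chart like this starts well after the price line: its first 199 rows are NaN. The small sketch below shows the same effect with a 3-row window on made-up prices. The min_periods argument is an optional pandas parameter that the article's code does not use.

```python
import pandas as pd

# Hypothetical toy prices, one per "day"
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Default behavior: NaN until 3 observations are available
sma_3 = prices.rolling(window=3).mean()
print(sma_3.tolist())          # first two entries are NaN, then 11.0, 12.0, 13.0

# min_periods=1 averages whatever is available so far instead of returning NaN
sma_3_partial = prices.rolling(window=3, min_periods=1).mean()
print(sma_3_partial.tolist())  # 10.0, 10.5, 11.0, 12.0, 13.0
```

Whether to allow partial windows is a judgment call: the early values are averages over fewer days, so they are not directly comparable to the later ones.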
The points where the short-term average (50-day) crosses the long-term average (200-day) are often interpreted by technical analysts as significant trading signals (a “Golden Cross” or “Death Cross”).
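One possible way to locate such crossings in code is to track the sign of the difference between the fast and slow averages and look for the rows where that sign flips. The sketch below uses a short made-up price series with 2-row and 4-row windows standing in for the 50-day and 200-day SMAs. It is an illustration of the idea, not a trading rule.

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices: a decline, a rally, then another decline
prices = pd.Series(
    [10, 9, 8, 7, 6, 7, 8, 9, 10, 11, 10, 9, 8, 7, 6], dtype=float
)

fast = prices.rolling(window=2).mean()  # stand-in for the 50-day SMA
slow = prices.rolling(window=4).mean()  # stand-in for the 200-day SMA

# +1 where fast is above slow, -1 where it is below; rows without a full window are dropped
position = np.sign(fast - slow).dropna()

# A jump in the sign marks a crossing
change = position.diff()
golden_cross = change[change > 0].index.tolist()  # fast moves above slow
death_cross = change[change < 0].index.tolist()   # fast moves below slow

print("Golden cross at rows:", golden_cross)  # [6]
print("Death cross at rows:", death_cross)    # [11]
```

Applied to the MSFT series, the same logic would run on sma_50 and sma_200, with dates rather than integers as the index.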
The code above calculates Simple Moving Averages. Another popular type is the Exponential Moving Average (EMA), which gives more weight to recent prices. Pandas has a method for this: .ewm(). Can you modify the code to plot the 50-day and 200-day EMAs instead? Share your code and plot in the comments!
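As a starting hint, here is a small sketch of how .ewm() weights observations, using made-up prices and a span of 3. With adjust=False, pandas applies the familiar recursion EMA_t = alpha * P_t + (1 - alpha) * EMA_{t-1}, where alpha = 2 / (span + 1). The pandas default is adjust=True, which gives slightly different values at the start of the series.

```python
import pandas as pd

# Hypothetical toy prices
prices = pd.Series([10.0, 11.0, 12.0, 13.0])

span = 3
alpha = 2 / (span + 1)  # 0.5 for span=3

# EMA via pandas, using the recursive form
ema = prices.ewm(span=span, adjust=False).mean()

# The same recursion written out by hand, seeded with the first price
manual = [prices.iloc[0]]
for price in prices.iloc[1:]:
    manual.append(alpha * price + (1 - alpha) * manual[-1])

print(ema.tolist())  # [10.0, 10.5, 11.25, 12.125]
print(manual)        # [10.0, 10.5, 11.25, 12.125]
```

Unlike a rolling mean, the EMA has a value from the first row onward, so it produces no leading NaN gap.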
Conclusion and Your Next Steps as a Quant
This is a huge milestone. You have successfully completed the entire quantitative analysis workflow: from setting up your environment and learning the language to fetching, cleaning, analyzing, and visualizing real financial data. You now possess the core, practical Python skills required of any junior quant.
But this is just the beginning of your journey. Where do you go from here?
- Algorithmic Trading: Use a library like backtrader or vectorbt to backtest the moving average crossover strategy we visualized.
- Portfolio Optimization: Use the expected returns and the covariance matrix (the unscaled counterpart of our correlation matrix) to find the “optimal” portfolio allocation using Modern Portfolio Theory.
- Financial Machine Learning: Apply machine learning models from scikit-learn to try and predict the direction of the next day’s return.
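For the portfolio optimization direction, it helps to see how the covariance matrix relates to the correlation matrix from earlier, and how it turns a set of weights into a portfolio volatility. The sketch below uses randomly generated returns for three hypothetical assets A, B and C with equal weights. It illustrates the arithmetic only and is not an optimizer.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns for three made-up assets (not market data)
rng = np.random.default_rng(seed=1)
returns = pd.DataFrame(
    rng.normal(0.0, 0.01, size=(250, 3)), columns=["A", "B", "C"]
)

cov = returns.cov()
corr = returns.corr()
vol = returns.std()

# Covariance is correlation rescaled by each pair's volatilities:
# cov_ij = corr_ij * vol_i * vol_j
rebuilt_cov = corr * np.outer(vol, vol)

# Portfolio variance for weights w is w' * Cov * w
weights = np.array([1 / 3, 1 / 3, 1 / 3])
portfolio_vol = np.sqrt(weights @ cov.values @ weights)

# Cross-check: volatility of the weighted return series itself
direct_vol = (returns @ weights).std()

print(f"portfolio volatility: {portfolio_vol:.6f} (direct: {direct_vol:.6f})")
```

An optimizer would search over the weights to trade this volatility off against expected return. The libraries and constraints involved are beyond this sketch.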
The world of quantitative finance is vast and exciting. You now have the foundational key to unlock it. Keep experimenting, keep learning, and keep coding.

