Machine Learning for Quants Series with Python (Part 14)
Introduction
We have learned the theory, architecture, and optimization of Deep Learning. Now, it is time to apply these neural engines to high-stakes quantitative problems.
In this tutorial, we will explore two distinct applications. First, we will briefly discuss how Deep Learning models non-linear structures in Fixed Income (Yield Curves), moving beyond the linear limitations of PCA (from Part 6).
Second, and more importantly, we will tackle Credit Risk Modeling. Credit default prediction is notoriously difficult due to Class Imbalance (e.g., 99% of people pay their loans, 1% default). A neural network trained on this data can simply predict “No Default” every time and achieve 99% accuracy while entirely failing its business purpose. We will address this using SMOTE (Synthetic Minority Over-sampling Technique).
Learning Objectives
By the end of this tutorial, you will be able to:
- Understand the application of Deep Learning to complex cross-sectional modeling, such as Yield Curves.
- Define Expected Loss and the critical role of Probability of Default (PD).
- Explain the algorithmic mechanics of SMOTE for handling imbalanced datasets.
- Implement SMOTE alongside a Keras Neural Network to build a highly sensitive Credit Default predictor.
Prerequisites
- Prior Knowledge: Neural Network Architecture, ROC/AUC metrics, Confusion Matrices.
- Libraries: scikit-learn, numpy, pandas, matplotlib, seaborn, tensorflow, keras, imbalanced-learn (imported as imblearn).
Core Concepts
1. DL and the Yield Curve (Beyond PCA)
In Part 6, we used Principal Component Analysis (PCA) to extract the Level, Slope, and Curvature of the Yield Curve. While brilliant, PCA is strictly linear. If short-term rates and long-term rates interact in complex, non-linear ways (especially near the Zero Lower Bound or during inversions), PCA misses the nuance. Deep Neural Networks, with their non-linear activation functions (such as ReLU), can capture higher-order relationships across bond maturities that a linear decomposition cannot represent.
2. Credit Risk and Expected Loss
In banking, Risk is quantified as:
Expected Loss (EL) = Probability of Default (PD) × Loss Given Default (LGD) × Exposure at Default (EAD)
Machine Learning is primarily deployed to calculate the PD.
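To make the formula concrete, here is a minimal sketch in plain Python. The three loans and their PD, LGD, and EAD values are made-up numbers for illustration only, and the helper name expected_loss is our own, not part of any library.

```python
def expected_loss(pd_, lgd, ead):
    """Expected Loss = PD x LGD x EAD for a single exposure."""
    if not (0.0 <= pd_ <= 1.0 and 0.0 <= lgd <= 1.0):
        raise ValueError("PD and LGD must be in [0, 1]")
    if ead < 0:
        raise ValueError("EAD must be non-negative")
    return pd_ * lgd * ead

# Hypothetical loan book: (PD, LGD, EAD in dollars)
loans = [
    (0.02, 0.40, 100_000),
    (0.10, 0.60, 50_000),
    (0.005, 0.25, 250_000),
]

# Expected loss per loan, then summed across the book
loan_els = [expected_loss(p, l, e) for p, l, e in loans]
portfolio_el = sum(loan_els)

print([round(x, 2) for x in loan_els])
print(round(portfolio_el, 2))
```

Note that the second loan has the smallest exposure but the largest expected loss, because its PD is five times higher than the first loan's. This is why an accurate PD estimate matters so much.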
3. The Imbalanced Data Crisis and SMOTE
If you feed a Neural Network a dataset with 9,900 good loans and 100 bad loans, the gradient descent optimizer will quickly realize that the easiest way to minimize error is to entirely ignore the complex features and just guess “Good Loan” every time.
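The arithmetic behind this failure mode is easy to check. The sketch below uses the 9,900/100 split from the paragraph above and a stand-in "model" that always answers Good Loan. The helper functions are illustrative and written in plain Python rather than taken from a library.

```python
# 9,900 good loans (label 0) and 100 defaults (label 1)
y_true = [0] * 9900 + [1] * 100

# A degenerate predictor that always says "Good Loan"
y_pred = [0] * len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives (defaults) that were caught."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"Accuracy: {acc:.2%}, Recall on defaults: {rec:.2%}")
```

Accuracy looks excellent while recall on the default class is zero, which is why accuracy alone is a misleading metric on imbalanced data.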
To force the network to learn the characteristics of a default, we use SMOTE (Synthetic Minority Over-sampling Technique).
- What it does: It creates fake, but mathematically realistic, examples of the minority class (Defaults).
- How it works: It treats each existing Default as a point in multidimensional feature space. It selects a Default, finds its K nearest neighbors (other Defaults), picks one of those neighbors at random, and draws a line segment between the two. It then places a brand new, synthetic Default data point at a random position along that segment.
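The interpolation step described above can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the imbalanced-learn implementation. The function name smote_sketch, the toy 2-D points, and the brute-force neighbor search are all assumptions made for the example.

```python
import math
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to it
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbors at random
        neighbor = rng.choice(others[:k])
        # Random position along the segment base -> neighbor, in [0, 1)
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic

# Toy 2-D "defaults" (hypothetical feature values)
defaults = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(defaults, n_new=6, k=2)
for point in new_points:
    print(point)
```

Because every synthetic point is a blend of two real defaults, it always falls inside the region already occupied by the minority class. SMOTE fills in that region more densely; it does not discover new kinds of defaults.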
Trainer’s Tip: Never apply SMOTE to your Test Set or Validation Set! You only synthesize data to help the model train. You must always test the model on the harsh, imbalanced reality of true market data to get an honest evaluation.
The Hands-On Practice
Let’s build a Credit Risk model. We will simulate an imbalanced dataset, apply SMOTE strictly to the training data, and then train a deep neural network to predict the Probability of Default.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, confusion_matrix
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from imblearn.over_sampling import SMOTE # pip install imbalanced-learn
# 1. Simulate Highly Imbalanced Credit Data (10,000 samples)
# 98% Good Loans (0), 2% Defaults (1)
X, y = make_classification(n_samples=10000, n_features=10, n_informative=5,
                           weights=[0.98, 0.02], random_state=42)
# 2. Split Data Randomly (stratified so train and test keep the same default rate)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
print(f"Original Training Defaults: {sum(y_train == 1)} out of {len(y_train)}")
# 3. Apply SMOTE strictly to the Training Data
smote = SMOTE(random_state=42)
X_train_smote, y_train_smote = smote.fit_resample(X_train, y_train)
print(f"SMOTE Training Defaults: {sum(y_train_smote == 1)} out of {len(y_train_smote)}")
# 4. Scale the Data (Fit on SMOTE train, transform both)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_smote)
X_test_scaled = scaler.transform(X_test) # Transform real test data
# 5. Build the Neural Network Classifier
model = Sequential([
    Dense(32, activation='relu', input_shape=(10,)),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')  # Sigmoid forces output between 0 and 1 (Probability)
])
# Use Binary Crossentropy for binary classification
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# 6. Train the Model on the SMOTE balanced data
print("\nTraining Neural Network on SMOTE data...")
model.fit(X_train_scaled, y_train_smote, epochs=20, batch_size=64, verbose=0)
# 7. Evaluate on the REAL, Imbalanced Test Data
y_pred_prob = model.predict(X_test_scaled).ravel()  # Flatten the (n, 1) output to shape (n,)
# Convert probabilities to hard classes (Threshold = 0.5)
y_pred = (y_pred_prob > 0.5).astype(int)
print("\n--- Credit Default Prediction Performance ---")
print(classification_report(y_test, y_pred))
import seaborn as sns
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6,4))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title('Confusion Matrix: Predicting Defaults')
plt.xlabel('Predicted (0=Good, 1=Default)')
plt.ylabel('Actual (0=Good, 1=Default)')
plt.show()


Check Your Work:
- Analyze the Confusion Matrix: Look at the bottom row (Actual Defaults). Thanks to SMOTE, your Neural Network likely caught a significant portion of them! If you run this exact code without step 3 (SMOTE), the model will likely predict 0 for everything, completely missing all actual defaults.
- Threshold Adjustment: Banks typically don’t use a 50% cutoff for defaults. If a loan has even a 15% probability of default, they might reject it. Change the threshold logic to y_pred = (y_pred_prob > 0.15).astype(int) and see how Recall increases (you catch more bad guys) but Precision decreases (you reject more good guys).
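The threshold trade-off can be seen without retraining anything. The sketch below uses ten made-up labels and predicted probabilities (not output from the model above) and a hand-written precision_recall helper to compare the 0.5 and 0.15 cutoffs.

```python
def precision_recall(y_true, probs, threshold):
    """Precision and recall for the default class at a given cutoff."""
    preds = [1 if p > threshold else 0 for p in probs]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, preds))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, preds))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, preds))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical true labels (1 = default) and predicted PDs
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.02, 0.05, 0.10, 0.20, 0.30, 0.12, 0.18, 0.40, 0.55, 0.80]

for threshold in (0.5, 0.15):
    precision, recall = precision_recall(y_true, probs, threshold)
    print(f"threshold={threshold:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```

On this toy data the 0.5 cutoff misses the default scored at 0.18, while the 0.15 cutoff catches all three defaults at the cost of flagging three good loans. Where to set the cutoff depends on how costly a missed default is relative to a rejected good borrower.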
Conclusion
In this lesson, we bridged the gap between pure computer science and quantitative finance. We saw how Deep Learning can map the non-linear intricacies of the Yield Curve. More critically, we tackled the fundamental issue of banking data: Class Imbalance.
By utilizing SMOTE to synthesize minority class data, we provided our Neural Network’s gradient descent optimizer with a fair, balanced landscape to learn from. We then unleashed that trained model back onto the harsh reality of the real-world test set, successfully building a sensitive, modern Credit Risk engine.

