Chapter 9: Credit Risk Modeling in Practice
Learning Objectives
- Build credit rating transition matrices
- Implement Markov models for default probability
- Calculate credit risk VaR
- Perform credit portfolio risk analysis
Knowledge Summary
1. Basic Concepts of Credit Risk
Credit risk refers to the risk of loss due to a borrower or counterparty failing to fulfill contractual obligations. Within the Markov framework, we can view credit ratings as states and analyze their transition patterns.
Credit Rating State Space:

$$S = \{AAA,\ AA,\ A,\ BBB,\ BB,\ B,\ CCC,\ D\}$$

Where $D$ represents the default state (an absorbing state).
2. Credit Rating Transition Matrix
Definition: The transition matrix $P$ gives the probability of moving from the current rating $i$ to rating $j$ over one period:

$$P_{ij} = \Pr(R_{t+1} = j \mid R_t = i)$$

Properties:
- Row sums equal 1: $\sum_{j} P_{ij} = 1$ for every rating $i$
- The default state is absorbing: $P_{DD} = 1$ and $P_{Dj} = 0$ for $j \neq D$ (a quick numerical check follows below)
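To make these properties concrete, here is a minimal sketch using a hypothetical two-state chain with one performing state G and the absorbing default state D; the numbers are illustrative only:

import numpy as np

# Hypothetical two-state chain {G, D} with a 5% one-period default probability
P = np.array([[0.95, 0.05],
              [0.00, 1.00]])  # row D: P_DD = 1, so D is absorbing

assert np.allclose(P.sum(axis=1), 1.0)  # every row sums to 1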
3. Cumulative Default Probability
t-period Cumulative Default Probability: the probability that an obligor currently rated $i$ has defaulted by period $t$:

$$PD_i(t) = \Pr(R_t = D \mid R_0 = i)$$

Calculated through matrix powers:

$$PD_i(t) = (P^t)_{iD}$$

Where $P^t$ is the t-th power of the transition matrix. Because $D$ is absorbing, the $(i, D)$ entry of $P^t$ is automatically a cumulative probability.
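Continuing with the same hypothetical two-state chain, the sketch below reads the cumulative default probability off the default column of $P^t$; for this chain it equals $1 - 0.95^t$, which the loop confirms:

import numpy as np
from numpy.linalg import matrix_power

P = np.array([[0.95, 0.05],
              [0.00, 1.00]])  # hypothetical one-period transition matrix
for t in (1, 2, 5):
    # PD_G(t) = (P^t)_{G,D}: probability of default within t periods
    print(t, matrix_power(P, t)[0, -1])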
4. Credit Portfolio Risk Model
Vasicek Single-Factor Model: the asset value of obligor $i$ is driven by one common factor:

$$V_i = \sqrt{\rho}\,Z + \sqrt{1-\rho}\,\varepsilon_i$$

Where:
- $Z \sim N(0,1)$ is the systematic risk factor
- $\varepsilon_i \sim N(0,1)$ is the idiosyncratic risk factor
- $\rho$ is the asset correlation coefficient

Conditional Default Probability (given a realization of $Z$):

$$PD_i(Z) = \Phi\!\left(\frac{\Phi^{-1}(PD_i) - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$
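As a quick check of the formula, here is a minimal implementation with scipy; the inputs are made-up illustrative values:

import numpy as np
from scipy.stats import norm

def conditional_pd(pd, rho, z):
    """Vasicek conditional default probability given systematic factor z."""
    return norm.cdf((norm.ppf(pd) - np.sqrt(rho) * z) / np.sqrt(1.0 - rho))

# A 2% unconditional PD with 20% asset correlation in a downturn (z = -2)
print(conditional_pd(0.02, 0.20, -2.0))  # roughly 0.10: PD rises about fivefold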
Example Code
Example 1: Building Credit Rating Transition Matrix
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
from numpy.linalg import matrix_power  # matrix_power lives in numpy.linalg
import seaborn as sns
class CreditTransitionMatrix:
"""
Credit rating transition matrix model
"""
def __init__(self):
# Standard & Poor's rating system
self.rating_labels = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']
self.n_ratings = len(self.rating_labels)
self.transition_matrix = None
self.rating_to_index = {rating: i for i, rating in enumerate(self.rating_labels)}
def estimate_transition_matrix(self, rating_data):
"""
Estimate transition matrix from historical data
Parameters:
rating_data: DataFrame with columns ['entity_id', 'date', 'rating']
Returns:
transition_matrix: Transition probability matrix
"""
# Initialize count matrix
transition_counts = np.zeros((self.n_ratings, self.n_ratings))
# Calculate transitions grouped by entity
for entity_id in rating_data['entity_id'].unique():
entity_data = rating_data[rating_data['entity_id'] == entity_id].sort_values('date')
if len(entity_data) < 2:
continue
ratings = entity_data['rating'].values
# Count rating transitions
for t in range(len(ratings) - 1):
from_rating = ratings[t]
to_rating = ratings[t + 1]
# Convert to indices
from_idx = self.rating_to_index[from_rating]
to_idx = self.rating_to_index[to_rating]
transition_counts[from_idx, to_idx] += 1
# Convert to probability matrix (row normalization)
row_sums = transition_counts.sum(axis=1, keepdims=True)
self.transition_matrix = np.divide(
transition_counts,
row_sums,
out=np.zeros_like(transition_counts),
where=row_sums != 0
)
return self.transition_matrix
def calculate_cumulative_default_probabilities(self, horizon=10):
"""
Calculate cumulative default probabilities
Parameters:
horizon: Prediction horizon
Returns:
cumulative_pds: Cumulative default probabilities for each period
"""
if self.transition_matrix is None:
raise ValueError("Need to estimate transition matrix first")
cumulative_pds = {}
default_state_idx = self.n_ratings - 1 # Default state is the last one
for t in range(1, horizon + 1):
# Calculate t-period transition matrix
transition_t = matrix_power(self.transition_matrix, t)
# Extract probability to default state
pd_t = transition_t[:, default_state_idx]
# Store results (excluding default state itself)
cumulative_pds[t] = dict(zip(self.rating_labels[:-1], pd_t[:-1]))
return cumulative_pds
def calculate_marginal_default_probabilities(self, horizon=10):
"""
Calculate marginal default probabilities
"""
cumulative_pds = self.calculate_cumulative_default_probabilities(horizon)
marginal_pds = {}
for rating in self.rating_labels[:-1]: # Exclude default state
marginal_pds[rating] = {}
# First period marginal default probability equals cumulative
marginal_pds[rating][1] = cumulative_pds[1][rating]
# Marginal default probabilities for subsequent periods
for t in range(2, horizon + 1):
marginal_pds[rating][t] = (
cumulative_pds[t][rating] - cumulative_pds[t-1][rating]
)
return marginal_pds
def generate_synthetic_rating_data(n_entities=1000, n_periods=10, seed=42):
"""
Generate synthetic credit rating data
Parameters:
n_entities: Number of entities
n_periods: Number of time periods
seed: Random seed
Returns:
DataFrame: Data containing entity IDs, dates, and ratings
"""
np.random.seed(seed)
# True transition matrix (based on historical data approximation)
true_transition_matrix = np.array([
[0.9207, 0.0707, 0.0068, 0.0017, 0.0001, 0.0000, 0.0000, 0.0000], # AAA
[0.0264, 0.9056, 0.0597, 0.0074, 0.0006, 0.0001, 0.0001, 0.0001], # AA
[0.0027, 0.0274, 0.9105, 0.0552, 0.0031, 0.0009, 0.0001, 0.0001], # A
[0.0003, 0.0043, 0.0595, 0.8693, 0.0618, 0.0044, 0.0002, 0.0002], # BBB
[0.0001, 0.0016, 0.0061, 0.0773, 0.8053, 0.1067, 0.0022, 0.0007], # BB
[0.0000, 0.0010, 0.0023, 0.0043, 0.0648, 0.8346, 0.0884, 0.0046], # B
[0.0000, 0.0000, 0.0011, 0.0024, 0.0043, 0.1722, 0.6486, 0.1714], # CCC
[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 1.0000] # D
])
rating_labels = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']
data = []
for entity_id in range(n_entities):
# Randomly select initial rating (biased towards higher ratings)
initial_rating_probs = [0.05, 0.15, 0.25, 0.30, 0.15, 0.08, 0.02, 0.00]
current_rating_idx = np.random.choice(len(rating_labels), p=initial_rating_probs)
for period in range(n_periods):
# Record current rating
data.append({
'entity_id': entity_id,
'date': period,
'rating': rating_labels[current_rating_idx]
})
# If already defaulted, maintain default state
if current_rating_idx == len(rating_labels) - 1: # Default state
continue
# Determine next period rating based on transition probabilities
transition_probs = true_transition_matrix[current_rating_idx]
current_rating_idx = np.random.choice(len(rating_labels), p=transition_probs)
return pd.DataFrame(data)
# Generate synthetic data
print("Generating synthetic credit rating data...")
rating_data = generate_synthetic_rating_data(n_entities=2000, n_periods=8)
print(f"Data Overview:")
print(f"Number of Entities: {rating_data['entity_id'].nunique()}")
print(f"Number of Periods: {rating_data['date'].nunique()}")
print(f"Rating Distribution:")
print(rating_data['rating'].value_counts().sort_index())
# Create transition matrix model
credit_model = CreditTransitionMatrix()
# Estimate transition matrix
transition_matrix = credit_model.estimate_transition_matrix(rating_data)
print(f"\nEstimated Credit Rating Transition Matrix:")
transition_df = pd.DataFrame(
transition_matrix,
index=credit_model.rating_labels,
columns=credit_model.rating_labels
)
print(transition_df.round(4))
# Calculate cumulative default probabilities
cumulative_pds = credit_model.calculate_cumulative_default_probabilities(horizon=5)
print(f"\nCumulative Default Probabilities (%):")
pd_df = pd.DataFrame(cumulative_pds).T * 100
print(pd_df.round(3))
Example 2: Visualization and Analysis
# Visualize transition matrix and default probabilities
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
# Subplot 1: Transition matrix heatmap
sns.heatmap(transition_matrix,
annot=True,
fmt='.3f',
xticklabels=credit_model.rating_labels,
yticklabels=credit_model.rating_labels,
cmap='Blues',
ax=axes[0, 0])
axes[0, 0].set_title('Credit Rating Transition Matrix')
axes[0, 0].set_xlabel('Target Rating')
axes[0, 0].set_ylabel('Initial Rating')
# Subplot 2: Cumulative default probability curves
for rating in ['AAA', 'A', 'BBB', 'BB', 'B']:
periods = list(range(1, 6))
pds = [cumulative_pds[t][rating] * 100 for t in periods]
axes[0, 1].plot(periods, pds, marker='o', label=rating, linewidth=2)
axes[0, 1].set_title('Cumulative Default Probability Curves')
axes[0, 1].set_xlabel('Period')
axes[0, 1].set_ylabel('Cumulative Default Probability (%)')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3)
# Subplot 3: Rating distribution
rating_counts = rating_data['rating'].value_counts().reindex(credit_model.rating_labels[:-1])
axes[1, 0].bar(range(len(rating_counts)), rating_counts.values,
color='skyblue', alpha=0.7)
axes[1, 0].set_title('Rating Distribution in Data')
axes[1, 0].set_xlabel('Rating')
axes[1, 0].set_ylabel('Count')
axes[1, 0].set_xticks(range(len(rating_counts)))
axes[1, 0].set_xticklabels(rating_counts.index, rotation=45)
# Subplot 4: Term structure comparison
ratings_to_plot = ['A', 'BBB', 'BB']
periods = list(range(1, 6))
for rating in ratings_to_plot:
pds = [cumulative_pds[t][rating] * 100 for t in periods]
axes[1, 1].semilogy(periods, pds, marker='s', label=rating, linewidth=2)
axes[1, 1].set_title('Default Probability Term Structure (Log Scale)')
axes[1, 1].set_xlabel('Period')
axes[1, 1].set_ylabel('Cumulative Default Probability (%, Log Scale)')
axes[1, 1].legend()
axes[1, 1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Calculate rating stability metrics
def calculate_rating_stability(transition_matrix, rating_labels):
"""Calculate rating stability metrics"""
stability_metrics = {}
for i, rating in enumerate(rating_labels[:-1]): # Exclude default state
# Rating persistence probability
stability_metrics[rating] = {
'persistence': transition_matrix[i, i],
'upgrade_prob': np.sum(transition_matrix[i, :i]),
'downgrade_prob': np.sum(transition_matrix[i, i+1:]),
'default_prob': transition_matrix[i, -1]
}
return stability_metrics
stability_metrics = calculate_rating_stability(transition_matrix, credit_model.rating_labels)
print(f"\nRating Stability Analysis:")
stability_df = pd.DataFrame(stability_metrics).T
print(stability_df.round(4))
Example 3: Credit Portfolio Risk Modeling
class CreditPortfolioModel:
"""
Credit portfolio risk model (based on Vasicek model)
"""
def __init__(self, correlation_matrix=None):
self.correlation_matrix = correlation_matrix
def calculate_portfolio_var(self, exposures, default_probs, lgds,
correlation=0.2, confidence=0.99, n_simulations=100000):
"""
Calculate portfolio credit risk VaR
Parameters:
exposures: Exposure array
default_probs: Default probability array
lgds: Loss given default array
correlation: Asset correlation coefficient
confidence: Confidence level
n_simulations: Number of Monte Carlo simulations
Returns:
var: Value at Risk
es: Expected Shortfall
loss_distribution: Loss distribution
"""
n_assets = len(exposures)
exposures = np.array(exposures)
default_probs = np.array(default_probs)
lgds = np.array(lgds)
# Calculate default thresholds
default_thresholds = stats.norm.ppf(default_probs)
# Monte Carlo simulation
portfolio_losses = []
for _ in range(n_simulations):
# Generate systematic risk factor
systematic_factor = np.random.normal(0, 1)
# Generate idiosyncratic risk factors
idiosyncratic_factors = np.random.normal(0, 1, n_assets)
# Calculate asset values
asset_values = (np.sqrt(correlation) * systematic_factor +
np.sqrt(1 - correlation) * idiosyncratic_factors)
# Determine defaults
defaults = asset_values < default_thresholds
# Calculate portfolio loss
portfolio_loss = np.sum(exposures * defaults * lgds)
portfolio_losses.append(portfolio_loss)
portfolio_losses = np.array(portfolio_losses)
# Calculate risk metrics
var = np.percentile(portfolio_losses, confidence * 100)
tail_losses = portfolio_losses[portfolio_losses >= var]
es = np.mean(tail_losses) if len(tail_losses) > 0 else var
expected_loss = np.mean(portfolio_losses)
return {
'var': var,
'expected_shortfall': es,
'expected_loss': expected_loss,
'loss_distribution': portfolio_losses,
'max_loss': np.max(portfolio_losses)
}
def calculate_economic_capital(self, var, expected_loss):
"""Calculate economic capital"""
return var - expected_loss
def perform_stress_testing(self, base_scenario, stress_scenarios):
"""Stress testing"""
results = {}
# Base scenario
results['base'] = self.calculate_portfolio_var(**base_scenario)
# Stress scenarios
for scenario_name, scenario_params in stress_scenarios.items():
results[scenario_name] = self.calculate_portfolio_var(**scenario_params)
return results
# Build example credit portfolio
np.random.seed(42)
# Portfolio parameters
n_assets = 100
exposures = np.random.lognormal(mean=15, sigma=1, size=n_assets) # Exposures
default_probs = np.random.beta(0.5, 20, size=n_assets) # Default probabilities
lgds = np.random.beta(2, 2, size=n_assets) * 0.6 + 0.2 # Loss given default (20%-80%)
print(f"Credit Portfolio Basic Information:")
print(f"Number of Assets: {n_assets}")
print(f"Total Exposure: {np.sum(exposures):,.0f}")
print(f"Average Default Probability: {np.mean(default_probs):.2%}")
print(f"Average Loss Given Default: {np.mean(lgds):.2%}")
# Create portfolio risk model
portfolio_model = CreditPortfolioModel()
# Base scenario
base_scenario = {
'exposures': exposures,
'default_probs': default_probs,
'lgds': lgds,
'correlation': 0.2,
'confidence': 0.99,
'n_simulations': 50000
}
# Stress scenarios
stress_scenarios = {
'high_correlation': {
**base_scenario,
'correlation': 0.5 # High correlation
},
'recession': {
**base_scenario,
'default_probs': default_probs * 2.5, # 150% increase in default probability
'correlation': 0.3
},
'severe_recession': {
**base_scenario,
'default_probs': default_probs * 4, # 300% increase in default probability
'lgds': np.minimum(lgds * 1.3, 0.9), # 30% increase in loss given default
'correlation': 0.4
}
}
# Perform risk calculation
print(f"\nPerforming credit portfolio risk analysis...")
risk_results = portfolio_model.perform_stress_testing(base_scenario, stress_scenarios)
# Display results
print(f"\nCredit Portfolio Risk Analysis Results:")
print("=" * 60)
for scenario_name, result in risk_results.items():
var = result['var']
es = result['expected_shortfall']
el = result['expected_loss']
ec = portfolio_model.calculate_economic_capital(var, el)
print(f"\n{scenario_name.upper()} Scenario:")
print(f" 99% VaR: {var:,.0f}")
print(f" Expected Loss: {el:,.0f}")
print(f" Expected Shortfall: {es:,.0f}")
print(f" Economic Capital: {ec:,.0f}")
print(f" Maximum Loss: {result['max_loss']:,.0f}")
print(f" VaR/Total Exposure: {var/np.sum(exposures):.2%}")
Example 4: Loss Distribution Visualization and Analysis
# Visualize loss distribution
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
# Subplot 1: Base scenario loss distribution
base_losses = risk_results['base']['loss_distribution']
axes[0, 0].hist(base_losses, bins=100, density=True, alpha=0.7, color='blue')
axes[0, 0].axvline(risk_results['base']['var'], color='red', linestyle='--',
label=f"99% VaR: {risk_results['base']['var']:,.0f}")
axes[0, 0].axvline(risk_results['base']['expected_loss'], color='green', linestyle='-',
label=f"Expected Loss: {risk_results['base']['expected_loss']:,.0f}")
axes[0, 0].set_title('Base Scenario Loss Distribution')
axes[0, 0].set_xlabel('Loss')
axes[0, 0].set_ylabel('Density')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)
# Subplot 2: VaR comparison across scenarios
scenarios = list(risk_results.keys())
vars = [risk_results[s]['var'] for s in scenarios]
colors = ['blue', 'orange', 'red', 'darkred']
bars = axes[0, 1].bar(scenarios, vars, color=colors, alpha=0.7)
axes[0, 1].set_title('99% VaR Across Different Scenarios')
axes[0, 1].set_ylabel('VaR')
axes[0, 1].tick_params(axis='x', rotation=45)
# Add value labels on bar chart
for bar, var in zip(bars, vars):
height = bar.get_height()
axes[0, 1].text(bar.get_x() + bar.get_width()/2., height,
f'{var:,.0f}', ha='center', va='bottom')
axes[0, 1].grid(True, alpha=0.3, axis='y')
# Subplot 3: Loss distribution comparison (log scale)
scenarios_to_plot = ['base', 'recession']
for scenario in scenarios_to_plot:
losses = risk_results[scenario]['loss_distribution']
axes[1, 0].hist(losses, bins=100, density=True, alpha=0.6,
label=f'{scenario} scenario')
axes[1, 0].set_yscale('log')
axes[1, 0].set_title('Loss Distribution Comparison (Log Scale)')
axes[1, 0].set_xlabel('Loss')
axes[1, 0].set_ylabel('Density (Log Scale)')
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)
# Subplot 4: Risk decomposition
scenarios = list(risk_results.keys())
el_values = [risk_results[s]['expected_loss'] for s in scenarios]
ul_values = [risk_results[s]['var'] - risk_results[s]['expected_loss'] for s in scenarios]
width = 0.35
x = np.arange(len(scenarios))
axes[1, 1].bar(x, el_values, width, label='Expected Loss', color='lightblue', alpha=0.8)
axes[1, 1].bar(x, ul_values, width, bottom=el_values, label='Unexpected Loss',
color='lightcoral', alpha=0.8)
axes[1, 1].set_title('Risk Decomposition: Expected vs Unexpected Loss')
axes[1, 1].set_ylabel('Loss')
axes[1, 1].set_xticks(x)
axes[1, 1].set_xticklabels(scenarios, rotation=45)
axes[1, 1].legend()
axes[1, 1].grid(True, alpha=0.3, axis='y')
plt.tight_layout()
plt.show()
# Calculate portfolio diversification benefit
def calculate_diversification_benefit(exposures, default_probs, lgds):
"""Calculate portfolio diversification benefit"""
    # Stand-alone VaR of each asset, simulated independently.
    # Note: at 99% confidence a single asset's VaR is zero whenever its PD < 1%,
    # so the diversification benefit below can turn negative (VaR is not subadditive).
individual_vars = []
for i in range(len(exposures)):
individual_result = portfolio_model.calculate_portfolio_var(
exposures=[exposures[i]],
default_probs=[default_probs[i]],
lgds=[lgds[i]],
correlation=0, # No correlation for individual asset
confidence=0.99,
n_simulations=10000
)
individual_vars.append(individual_result['var'])
    # VaR of the same sub-portfolio held together (same correlation as the base
    # case), so both sides of the comparison cover identical assets
    portfolio_result = portfolio_model.calculate_portfolio_var(
        exposures=exposures,
        default_probs=default_probs,
        lgds=lgds,
        correlation=0.2,
        confidence=0.99,
        n_simulations=10000
    )
    portfolio_var = portfolio_result['var']
    # Calculate diversification benefit
    sum_individual_vars = np.sum(individual_vars)
    diversification_benefit = sum_individual_vars - portfolio_var
    diversification_ratio = diversification_benefit / sum_individual_vars
return {
'sum_individual_vars': sum_individual_vars,
'portfolio_var': portfolio_var,
'diversification_benefit': diversification_benefit,
'diversification_ratio': diversification_ratio
}
print(f"\nCalculating diversification benefit...")
div_benefit = calculate_diversification_benefit(exposures[:10], default_probs[:10], lgds[:10])
print(f"\nDiversification Benefit Analysis (Sample of first 10 assets):")
print(f"Sum of Individual VaRs: {div_benefit['sum_individual_vars']:,.0f}")
print(f"Portfolio VaR: {div_benefit['portfolio_var']:,.0f}")
print(f"Diversification Benefit: {div_benefit['diversification_benefit']:,.0f}")
print(f"Diversification Ratio: {div_benefit['diversification_ratio']:.2%}")
# Correlation sensitivity analysis
correlations = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
correlation_vars = []
print(f"\nPerforming correlation sensitivity analysis...")
for corr in correlations:
scenario = {**base_scenario, 'correlation': corr, 'n_simulations': 20000}
result = portfolio_model.calculate_portfolio_var(**scenario)
correlation_vars.append(result['var'])
# Visualize correlation sensitivity
plt.figure(figsize=(10, 6))
plt.plot(correlations, correlation_vars, 'bo-', linewidth=2, markersize=8)
plt.title('Correlation Sensitivity of Portfolio VaR')
plt.xlabel('Asset Correlation Coefficient')
plt.ylabel('99% VaR')
plt.grid(True, alpha=0.3)
# Add value labels
for i, (corr, var) in enumerate(zip(correlations, correlation_vars)):
plt.annotate(f'{var:,.0f}', (corr, var), textcoords="offset points",
xytext=(0,10), ha='center')
plt.tight_layout()
plt.show()
print(f"\nCorrelation Sensitivity Analysis Results:")
for corr, var in zip(correlations, correlation_vars):
print(f"Correlation Coefficient {corr:.1f}: VaR = {var:,.0f}")
Theoretical Analysis
Advantages of Markov Chains in Credit Risk Applications
- Natural State Representation: Credit ratings naturally form discrete state spaces
- Stability of Transition Probabilities: Rating agency methodologies are relatively stable
- Sufficiency of Historical Data: long time series of historical rating data are available
- High Regulatory Recognition: Basel Accords recognize rating-based approaches
Mathematical Foundation of Vasicek Model
Latent Variable Model:
Where is the standardized asset value of the i-th borrower.
Default Threshold:
Derivation of Conditional Default Probability Formula:
Economic Capital Calculation
Economic Capital Definition:

$$EC = VaR_\alpha - EL$$

Where:
- $VaR_\alpha$ is Value at Risk at confidence level $\alpha$
- $EL$ is Expected Loss

RAROC (Risk-Adjusted Return on Capital):

$$RAROC = \frac{\text{Net Revenue} - EL}{EC}$$
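A tiny numerical illustration of these two definitions; all figures are made up for the example, and the simple net-revenue RAROC numerator above is one common convention:

# Hypothetical portfolio figures (illustrative only)
var_99 = 12_500_000        # 99% VaR
expected_loss = 1_800_000  # EL
net_revenue = 2_600_000    # net revenue earned on the portfolio

economic_capital = var_99 - expected_loss                # EC = VaR_alpha - EL
raroc = (net_revenue - expected_loss) / economic_capital
print(f"EC = {economic_capital:,.0f}, RAROC = {raroc:.1%}")  # EC = 10,700,000, RAROC ~ 7.5%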
Mathematical Formula Summary
- Transition Probability Estimation: $\hat{P}_{ij} = \dfrac{N_{ij}}{\sum_{k} N_{ik}}$, where $N_{ij}$ counts observed transitions from rating $i$ to rating $j$
- Cumulative Default Probability: $PD_i(t) = (P^t)_{iD}$
- Marginal Default Probability: $pd_i(t) = PD_i(t) - PD_i(t-1)$
- Vasicek Conditional Default Probability: $PD_i(Z) = \Phi\!\left(\frac{\Phi^{-1}(PD_i) - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$
- Portfolio Loss: $L = \sum_{i=1}^{n} EAD_i \cdot LGD_i \cdot \mathbf{1}_{\{V_i < d_i\}}$
- Economic Capital: $EC = VaR_\alpha - EL$
Practical Considerations
- Transition matrices need regular updates to reflect current market conditions
- Correlation parameters have a significant impact on portfolio risk and require careful calibration
- Consider forward-looking rating adjustments, especially around economic cycle turning points
- Model validation should include backtesting and stress testing
- Account for model risk and data quality risk