Chapter 3: Fundamental Principles of Kalman Filtering
Learning Objectives
- Understand the core ideas and working principles of Kalman filtering
- Master the mathematical derivation of prediction and update steps
- Understand optimality proofs and the Bayesian framework
Core Ideas
The Kalman filter is a recursive state estimation algorithm whose core idea is to obtain optimal estimates of system states through a predict-update cycle, utilizing the system’s dynamic model and observation data.
Basic Philosophy
Kalman filtering is based on the following core concepts:
- State-space representation: The system can be fully described by state vectors
- Linear Gaussian assumption: System dynamics and observations are linear, with Gaussian noise
- Bayesian update: Optimal estimation using prior information and observation data
- Recursive processing: Each step builds on the previous result, suitable for real-time processing
Mathematical Framework
State-Space Model
Kalman filtering is based on the following state-space model:
State transition equation:
$$x_k = F_k x_{k-1} + B_k u_k + w_k$$
Observation equation:
$$z_k = H_k x_k + v_k$$
Where:
- $x_k$: State vector at time k
- $z_k$: Observation vector at time k
- $F_k$: State transition matrix
- $H_k$: Observation matrix
- $B_k$: Control input matrix
- $u_k$: Control input vector
- $w_k \sim \mathcal{N}(0, Q_k)$: Process noise
- $v_k \sim \mathcal{N}(0, R_k)$: Observation noise
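As a concrete illustration, a one-dimensional constant-velocity tracking model fits this form; the sampling interval and noise levels below are hypothetical choices, not values prescribed above:
import numpy as np

dt = 1.0  # hypothetical sampling interval

# State vector: [position, velocity]
F = np.array([[1.0, dt],
              [0.0, 1.0]])        # state transition matrix
B = np.array([[0.5 * dt ** 2],
              [dt]])              # control input matrix (acceleration as control)
H = np.array([[1.0, 0.0]])        # observation matrix: only position is measured
Q = 0.01 * np.eye(2)              # process noise covariance (assumed)
R = np.array([[0.1]])             # observation noise covariance (assumed)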
Probabilistic Representation
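Under the linear Gaussian assumptions above, the same model can be written as a pair of conditional Gaussian densities:
$$p(x_k \mid x_{k-1}) = \mathcal{N}\big(x_k;\; F_k x_{k-1} + B_k u_k,\; Q_k\big)$$
$$p(z_k \mid x_k) = \mathcal{N}\big(z_k;\; H_k x_k,\; R_k\big)$$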
Algorithm Derivation
Prediction Step
Based on the posterior estimate from the previous time step, predict the current state:
State prediction:
$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k$$
Covariance prediction:
$$P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k$$
Update Step
Correct the prediction using observation data:
Kalman gain:
$$K_k = P_{k|k-1} H_k^T \left(H_k P_{k|k-1} H_k^T + R_k\right)^{-1}$$
State update:
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left(z_k - H_k \hat{x}_{k|k-1}\right)$$
Covariance update:
$$P_{k|k} = (I - K_k H_k)\, P_{k|k-1}$$
Innovation Sequence
The innovation is the difference between the actual observation and the predicted observation:
$$\tilde{y}_k = z_k - H_k \hat{x}_{k|k-1}$$
Innovation covariance:
$$S_k = H_k P_{k|k-1} H_k^T + R_k$$
Derivation in the Bayesian Framework
Application of Bayes’ Theorem
Kalman filtering is essentially a special case of Bayesian estimation. For linear Gaussian systems, Bayes' theorem gives:
$$p(x_k \mid z_{1:k}) = \frac{p(z_k \mid x_k)\, p(x_k \mid z_{1:k-1})}{p(z_k \mid z_{1:k-1})}$$
Where:
- $p(z_k \mid x_k)$: Likelihood function
- $p(x_k \mid z_{1:k-1})$: Prior (predictive) distribution
- $p(x_k \mid z_{1:k})$: Posterior distribution
Closure Property of Gaussian Distributions
Due to the properties of linear transformations and Gaussian distributions:
- Predictive distribution: $p(x_k \mid z_{1:k-1}) = \mathcal{N}\big(x_k;\ \hat{x}_{k|k-1},\ P_{k|k-1}\big)$
- Likelihood function: $p(z_k \mid x_k) = \mathcal{N}\big(z_k;\ H_k x_k,\ R_k\big)$
- Posterior distribution: $p(x_k \mid z_{1:k}) = \mathcal{N}\big(x_k;\ \hat{x}_{k|k},\ P_{k|k}\big)$
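Multiplying the Gaussian prior by the Gaussian likelihood and completing the square yields the posterior parameters in information (inverse-covariance) form, which is algebraically equivalent to the update equations above:
$$P_{k|k}^{-1} = P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k, \qquad P_{k|k}^{-1}\,\hat{x}_{k|k} = P_{k|k-1}^{-1}\,\hat{x}_{k|k-1} + H_k^T R_k^{-1} z_k$$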
Algorithm Flow
import numpy as np

def kalman_filter_step(x_prev, P_prev, F, B, u, Q, H, R, z):
    """
    Single-step Kalman filter (one predict-update cycle)
    Parameters:
        x_prev: Previous state estimate
        P_prev: Previous covariance matrix
        F: State transition matrix
        B: Control input matrix
        u: Control input
        Q: Process noise covariance
        H: Observation matrix
        R: Observation noise covariance
        z: Current observation
    Returns:
        Updated state, updated covariance, Kalman gain, innovation, innovation covariance
    """
    # Prediction step
    x_pred = F @ x_prev + B @ u                          # State prediction
    P_pred = F @ P_prev @ F.T + Q                        # Covariance prediction
    # Update step
    y = z - H @ x_pred                                   # Innovation (residual)
    S = H @ P_pred @ H.T + R                             # Innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)                  # Kalman gain
    x_updated = x_pred + K @ y                           # State update
    P_updated = (np.eye(len(x_pred)) - K @ H) @ P_pred   # Covariance update
    return x_updated, P_updated, K, y, S
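A minimal usage sketch of this function, reusing the hypothetical constant-velocity model sketched earlier in the chapter with a single noisy position measurement:
import numpy as np

# Hypothetical 1D constant-velocity model (see the state-space example above)
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt ** 2], [dt]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.1]])

x_prev = np.array([0.0, 1.0])   # initial estimate: position 0, velocity 1
P_prev = np.eye(2)              # initial uncertainty
u = np.array([0.0])             # no control input
z = np.array([1.2])             # noisy position measurement

x_new, P_new, K, y, S = kalman_filter_step(x_prev, P_prev, F, B, u, Q, H, R, z)
print(x_new)  # corrected state estimate after one predict-update cycle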
Optimality Proof
Minimum Mean Square Error Criterion
Kalman filtering is optimal in the following sense:
Under linear Gaussian assumptions, Kalman filtering provides:
- Minimum variance estimator: smallest variance among all unbiased estimators
- Maximum a posteriori estimator: under the Gaussian assumption the posterior mode coincides with the posterior mean, so the estimate also maximizes the posterior density
- Minimum mean square error (MMSE) estimator: smallest mean square error among all estimators, linear or nonlinear
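Formally, the MMSE criterion states that the posterior mean minimizes the expected squared error given all observations:
$$\hat{x}_{k|k} = \arg\min_{\hat{x}} \mathbb{E}\!\left[\|x_k - \hat{x}\|^2 \mid z_{1:k}\right] = \mathbb{E}\!\left[x_k \mid z_{1:k}\right]$$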
Orthogonal Projection Interpretation
From a geometric perspective, the Kalman filter estimate is the orthogonal projection of the state vector onto the linear space spanned by the observations:
# Geometric interpretation of orthogonal projection
import numpy as np

def geometric_interpretation(x_pred, P_pred, H, R, z):
    """
    Geometric interpretation of Kalman filtering:
    State estimate = Prior estimate + Kalman gain × Innovation
    This is equivalent to an orthogonal projection in Hilbert space.
    """
    # The innovation is the part of the observation that the prior cannot explain
    innovation = z - H @ x_pred
    # The Kalman gain determines how the innovation is projected back into state space
    kalman_gain = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    # The final estimate is the prior plus the projected innovation
    x_posterior = x_pred + kalman_gain @ innovation
    return x_posterior
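The projection viewpoint is summarized by the orthogonality principle: the estimation error is uncorrelated with (orthogonal to) every observation used to form the estimate:
$$\mathbb{E}\!\left[\big(x_k - \hat{x}_{k|k}\big)\, z_j^T\right] = 0, \qquad j = 1, \dots, k$$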
Information-Theoretic Perspective
Information Fusion
Kalman filtering can be understood as an information fusion process:
- Prior information: Prediction from the system model
- Observation information: Measurements from sensors
- Fusion weights: Determined by their respective uncertainties
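A scalar sketch of this weighting (illustrative only, not part of the formal development): fusing a prior estimate with a measurement weights each source by its precision, so the less uncertain source dominates:
def fuse_scalar(x_prior, var_prior, z, var_meas):
    """Precision-weighted fusion of a prior estimate and a measurement (scalar case)."""
    k = var_prior / (var_prior + var_meas)   # scalar Kalman gain
    x_post = x_prior + k * (z - x_prior)     # fused estimate
    var_post = (1 - k) * var_prior           # fused variance (never larger than var_prior)
    return x_post, var_post

# The less uncertain source receives the larger weight:
print(fuse_scalar(0.0, 4.0, 1.0, 1.0))  # measurement trusted more -> estimate near 1.0
print(fuse_scalar(0.0, 1.0, 1.0, 4.0))  # prior trusted more -> estimate near 0.0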
Entropy Reduction
Each observation reduces (or at least does not increase) the uncertainty of the system state:
$$h\big(x_k \mid z_{1:k}\big) \le h\big(x_k \mid z_{1:k-1}\big)$$
where $h(\cdot)$ denotes differential entropy.
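For the Gaussian distributions involved, differential entropy depends only on the covariance determinant, so the entropy reduction follows directly from the covariance update:
$$h\big(\mathcal{N}(\mu, P)\big) = \tfrac{1}{2}\ln\!\big((2\pi e)^n \det P\big), \qquad \det P_{k|k} \le \det P_{k|k-1}$$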
Properties of the Covariance Matrix
Symmetry and Positive Definiteness
The covariance matrix has the following important properties:
- Symmetry: $P_k = P_k^T$
- Positive semi-definiteness: $x^T P_k x \ge 0$ for all $x$
- Monotonicity: $P_{k|k} \preceq P_{k|k-1}$ (incorporating an observation never increases uncertainty)
Numerical Stability
For numerical stability, the following form is commonly used:
import numpy as np

def joseph_form_update(P_pred, H, K, R):
    """
    Joseph form covariance update; preserves symmetry and positive semi-definiteness
    """
    I_KH = np.eye(len(P_pred)) - K @ H
    P_updated = I_KH @ P_pred @ I_KH.T + K @ R @ K.T
    return P_updated
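Compared with the simple form $(I - K_k H_k) P_{k|k-1}$, the Joseph form remains symmetric and positive semi-definite even when the gain is slightly suboptimal due to rounding errors, at the cost of a few extra matrix multiplications.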
Practical Application Considerations
Initialization
Kalman filter performance largely depends on the choice of the initial state estimate $\hat{x}_{0|0}$ and the initial covariance $P_{0|0}$.
Common initialization strategies:
- State initialization: Based on prior knowledge or first few observations
- Covariance initialization: Reflects initial uncertainty; typically chosen large when prior knowledge is weak
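A minimal initialization sketch, assuming a 1D position observation and hypothetical magnitudes for the initial variance:
import numpy as np

def initialize_filter(z0, n_states=2, large_var=1e3):
    """Illustrative initialization: observable component from the first measurement, large initial uncertainty."""
    x0 = np.zeros(n_states)
    x0[0] = z0                          # take the observed position from the first measurement
    P0 = large_var * np.eye(n_states)   # large diagonal values encode weak prior knowledge
    return x0, P0

x0, P0 = initialize_filter(z0=1.2)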
Filter Divergence
Filter divergence may occur when the model mismatches or parameters are poorly chosen:
- Symptoms: Covariance matrix grows, estimation error increases
- Causes: $Q$ matrix chosen too small, $R$ matrix chosen too large, model errors
- Solutions: Parameter tuning, robustness improvements
Preview of Financial Applications
Typical applications of Kalman filtering in finance include:
- State variables: Implied volatility, risk factors, market trends
- Observation variables: Stock prices, interest rates, option prices
- Application scenarios: Parameter estimation, risk management, portfolio optimization
In the next chapter, we will study the concrete implementation of the linear Kalman filter in detail, including a programming implementation of the five core equations and the associated numerical techniques.