Chapter 5 - Fundamental Matrix Theory and Operations
Haiyue
25min
## Learning Objectives
- Understand the definition and geometric meaning of matrices
- Master basic matrix operations (addition, scalar multiplication, multiplication)
- Understand the geometric interpretation of matrix multiplication
- Master the properties of transpose matrices
- Understand the computation rules for block matrices
## Definition and Representation of Matrices
### Mathematical Definition of Matrices
A matrix is a rectangular array of numbers. An $m \times n$ matrix $A$ (with $m$ rows and $n$ columns) can be represented as:
$$A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}$$

where $a_{ij}$ represents the element in row $i$ and column $j$.

```python
import numpy as np
import matplotlib.pyplot as plt

def matrix_basics():
    """Matrix basic concepts demonstration"""
    print("Matrix Basic Concepts:")
    print("=" * 30)

    # Create different types of matrices
    A = np.array([[1, 2, 3], [4, 5, 6]])  # 2×3 matrix
    B = np.array([[1, 0], [0, 1]])        # 2×2 identity matrix
    C = np.array([[1, 2, 3]])             # 1×3 row matrix
    D = np.array([[1], [2], [3]])         # 3×1 column matrix

    matrices = {
        "Matrix A (2×3)": A,
        "Identity Matrix B (2×2)": B,
        "Row Matrix C (1×3)": C,
        "Column Matrix D (3×1)": D
    }

    for name, matrix in matrices.items():
        print(f"\n{name}:")
        print(matrix)
        print(f"Shape: {matrix.shape}")
        print(f"Total elements: {matrix.size}")

    return matrices

matrices = matrix_basics()
```

### Special Matrix Types

```python
def special_matrices():
    """Special matrix types"""
    print("\nSpecial Matrix Types:")
    print("=" * 25)

    # Zero matrix
    zero_matrix = np.zeros((3, 3))
    print("Zero Matrix:")
    print(zero_matrix)

    # Identity matrix
    identity_matrix = np.eye(3)
    print("\nIdentity Matrix:")
    print(identity_matrix)

    # Diagonal matrix
    diagonal_matrix = np.diag([1, 2, 3])
    print("\nDiagonal Matrix:")
    print(diagonal_matrix)

    # Upper triangular matrix
    upper_triangular = np.array([[1, 2, 3],
                                 [0, 4, 5],
                                 [0, 0, 6]])
    print("\nUpper Triangular Matrix:")
    print(upper_triangular)

    # Lower triangular matrix
    lower_triangular = np.array([[1, 0, 0],
                                 [2, 3, 0],
                                 [4, 5, 6]])
    print("\nLower Triangular Matrix:")
    print(lower_triangular)

    # Symmetric matrix
    symmetric_matrix = np.array([[1, 2, 3],
                                 [2, 4, 5],
                                 [3, 5, 6]])
    print("\nSymmetric Matrix:")
    print(symmetric_matrix)
    print(f"Verify symmetry: {np.allclose(symmetric_matrix, symmetric_matrix.T)}")

    return {
        "Zero Matrix": zero_matrix,
        "Identity Matrix": identity_matrix,
        "Diagonal Matrix": diagonal_matrix,
        "Upper Triangular Matrix": upper_triangular,
        "Lower Triangular Matrix": lower_triangular,
        "Symmetric Matrix": symmetric_matrix
    }

special_mats = special_matrices()
```

## Basic Matrix Operations

### Matrix Addition

Two matrices of the same size are added by adding their corresponding elements:

$$(A + B)_{ij} = a_{ij} + b_{ij}$$

```python
def matrix_addition():
    """Matrix addition demonstration"""
    print("\nMatrix Addition:")
    print("=" * 15)

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[5, 6], [7, 8]])

    print("Matrix A:")
    print(A)
    print("\nMatrix B:")
    print(B)

    # Matrix addition
    C = A + B
    print("\nA + B =")
    print(C)

    # Verify addition properties
    print("\nVerify matrix addition properties:")

    # Commutativity
    print(f"Commutativity A + B = B + A: {np.array_equal(A + B, B + A)}")

    # Associativity
    D = np.array([[1, 1], [1, 1]])
    print(f"Associativity (A + B) + D = A + (B + D): {np.array_equal((A + B) + D, A + (B + D))}")

    # Zero matrix is the additive identity
    zero = np.zeros_like(A)
    print(f"Zero matrix A + 0 = A: {np.array_equal(A + zero, A)}")

    return A, B, C

matrix_addition()
```

### Scalar Multiplication

Multiplying a matrix by a scalar multiplies each element by that scalar:

$$(cA)_{ij} = c \cdot a_{ij}$$

```python
def scalar_multiplication():
    """Scalar multiplication demonstration"""
    print("\nScalar Multiplication:")
    print("=" * 15)

    A = np.array([[1, 2], [3, 4]])
    scalars = [0, 1, 2, -1, 0.5]

    print("Original Matrix A:")
    print(A)

    for c in scalars:
        result = c * A
        print(f"\n{c} * A =")
        print(result)

    # Verify scalar multiplication properties
    print("\nVerify scalar multiplication properties:")
    c1, c2 = 2, 3

    # Distributive property 1: c(A + B) = cA + cB
    B = np.array([[5, 6], [7, 8]])
    left = c1 * (A + B)
    right = c1 * A + c1 * B
    print(f"Distributive property 1 c(A + B) = cA + cB: {np.allclose(left, right)}")

    # Distributive property 2: (c1 + c2)A = c1*A + c2*A
    left = (c1 + c2) * A
    right = c1 * A + c2 * A
    print(f"Distributive property 2 (c1 + c2)A = c1*A + c2*A: {np.allclose(left, right)}")

    # Associativity: c1(c2*A) = (c1*c2)A
    left = c1 * (c2 * A)
    right = (c1 * c2) * A
    print(f"Associativity c1(c2*A) = (c1*c2)A: {np.allclose(left, right)}")

scalar_multiplication()
```

### Matrix Multiplication

Matrix multiplication is one of the most important operations in linear algebra. For an $m \times p$ matrix $A$ and a $p \times n$ matrix $B$, their product is an $m \times n$ matrix $C = AB$, where:

$$(AB)_{ij} = \sum_{k=1}^{p} a_{ik}b_{kj}$$

```python
def matrix_multiplication():
    """Detailed matrix multiplication demonstration"""
    print("\nMatrix Multiplication:")
    print("=" * 15)

    # Define two matrices that can be multiplied
    A = np.array([[1, 2, 3],
                  [4, 5, 6]])    # 2×3
    B = np.array([[7, 8],
                  [9, 10],
                  [11, 12]])     # 3×2

    print("Matrix A (2×3):")
    print(A)
    print("\nMatrix B (3×2):")
    print(B)

    # Matrix multiplication
    C = A @ B  # or np.dot(A, B)
    print("\nA × B (Result: 2×2):")
    print(C)

    # Manually calculate the first element for verification
    c11_manual = A[0, 0] * B[0, 0] + A[0, 1] * B[1, 0] + A[0, 2] * B[2, 0]
    print(f"\nManual calculation C[0,0]: {A[0, 0]}×{B[0, 0]} + {A[0, 1]}×{B[1, 0]} + {A[0, 2]}×{B[2, 0]} = {c11_manual}")
    print(f"Actual result C[0,0]: {C[0, 0]}")

    # Demonstrate non-commutativity of matrix multiplication
    print("\nNon-commutativity of matrix multiplication:")
    print(f"A shape: {A.shape}, B shape: {B.shape}")
    print(f"A × B shape: {C.shape}")

    try:
        BA = B @ A
        print(f"B × A shape: {BA.shape}")
        print("B × A =")
        print(BA)
        # A × B is 2×2 while B × A is 3×3, so the two products cannot be equal
        print(f"A × B ≠ B × A: {not np.array_equal(C, BA)}")
    except ValueError:
        print("B × A cannot be computed due to dimension mismatch")

    return A, B, C

matrix_multiplication()
```

### Geometric Interpretation of Matrix Multiplication

```python
def geometric_interpretation_matrix_mult():
    """Geometric interpretation of matrix multiplication"""
    print("\nGeometric Interpretation of Matrix Multiplication:")
    print("=" * 30)

    # 2D transformation matrix examples
    # Rotation matrix
    theta = np.pi / 4  # 45 degrees
    rotation_matrix = np.array([[np.cos(theta), -np.sin(theta)],
                                [np.sin(theta), np.cos(theta)]])

    # Scaling matrix
    scaling_matrix = np.array([[2, 0],
                               [0, 1.5]])

    print("Rotation Matrix (45 degrees):")
    print(rotation_matrix)
    print("\nScaling Matrix (2x on x-axis, 1.5x on y-axis):")
    print(scaling_matrix)

    # Original vectors, one per column
    original_vectors = np.array([[1, 0, 1, 0],   # x-coordinates
                                 [0, 1, 1, 1]])  # y-coordinates

    print("\nOriginal vector set:")
    print(original_vectors)

    # Apply transformations
    rotated = rotation_matrix @ original_vectors
    scaled = scaling_matrix @ original_vectors
    # The rightmost matrix acts first: rotate, then scale
    combined = scaling_matrix @ rotation_matrix @ original_vectors

    # Visualize transformations: one panel per point set
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    panels = [
        (axes[0, 0], original_vectors, 'blue', 'Original Points', 'Original Vectors'),
        (axes[0, 1], rotated, 'red', 'After Rotation', 'Rotation Transformation'),
        (axes[1, 0], scaled, 'green', 'After Scaling', 'Scaling Transformation'),
        (axes[1, 1], combined, 'purple', 'Rotate then Scale', 'Combined Transformation'),
    ]

    for ax, points, color, label, title in panels:
        ax.scatter(points[0], points[1], c=color, s=100, label=label)
        if points is not original_vectors:
            # Overlay the original points for comparison
            ax.scatter(original_vectors[0], original_vectors[1],
                       c='blue', s=50, alpha=0.5, label='Original')
        ax.set_xlim(-3, 3)
        ax.set_ylim(-3, 3)
        ax.grid(True, alpha=0.3)
        ax.set_title(title)
        ax.legend()
        ax.set_aspect('equal')

    plt.tight_layout()
    plt.show()

    print("\nTransformation results:")
    print(f"After rotation: {rotated}")
    print(f"After scaling: {scaled}")
    print(f"Combined transformation (rotate then scale): {combined}")

geometric_interpretation_matrix_mult()
```

## Properties of Matrix Multiplication

### Important Properties

```python
def matrix_multiplication_properties():
    """Matrix multiplication properties verification"""
    print("\nMatrix Multiplication Properties Verification:")
    print("=" * 30)

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[5, 6], [7, 8]])
    C = np.array([[9, 10], [11, 12]])
    I = np.eye(2)  # Identity matrix

    print("Test Matrices:")
    print(f"A = \n{A}")
    print(f"B = \n{B}")
    print(f"C = \n{C}")
    print(f"I = \n{I}")

    # 1. Associativity: (AB)C = A(BC)
    left = (A @ B) @ C
    right = A @ (B @ C)
    print(f"\n1. Associativity (AB)C = A(BC): {np.allclose(left, right)}")

    # 2. Identity matrix property: AI = IA = A
    print(f"2. Identity matrix AI = A: {np.allclose(A @ I, A)}")
    print(f"   Identity matrix IA = A: {np.allclose(I @ A, A)}")

    # 3. Distributivity: A(B + C) = AB + AC
    left = A @ (B + C)
    right = (A @ B) + (A @ C)
    print(f"3. Left distributivity A(B + C) = AB + AC: {np.allclose(left, right)}")

    # (B + C)A = BA + CA
    left = (B + C) @ A
    right = (B @ A) + (C @ A)
    print(f"   Right distributivity (B + C)A = BA + CA: {np.allclose(left, right)}")

    # 4. Scalar multiplication compatibility: c(AB) = (cA)B = A(cB)
    c = 3
    result1 = c * (A @ B)
    result2 = (c * A) @ B
    result3 = A @ (c * B)
    print(f"4. Scalar compatibility c(AB) = (cA)B = A(cB): {np.allclose(result1, result2) and np.allclose(result2, result3)}")

    # 5. Zero matrix property
    zero = np.zeros_like(A)
    print(f"5. Zero matrix A × 0 = 0: {np.allclose(A @ zero, zero)}")
    print(f"   Zero matrix 0 × A = 0: {np.allclose(zero @ A, zero)}")

matrix_multiplication_properties()
```

### Computational Complexity of Matrix Multiplication

```python
import time

def matrix_multiplication_complexity():
    """Matrix multiplication computational complexity analysis"""
    print("\nMatrix Multiplication Computational Complexity:")
    print("=" * 30)

    # Test multiplication time for different matrix sizes
    sizes = [10, 50, 100, 200]
    times = []

    for n in sizes:
        A = np.random.rand(n, n)
        B = np.random.rand(n, n)

        # perf_counter has better resolution than time.time for short intervals
        start_time = time.perf_counter()
        C = A @ B
        elapsed = time.perf_counter() - start_time
        times.append(elapsed)

        operations = 2 * n**3  # Approximate number of operations (n^3 multiplications and n^3 additions)
        print(f"Matrix size {n}×{n}: Time {elapsed:.6f}s, Operations ≈ {operations:,}")

    # Plot complexity graph
    plt.figure(figsize=(10, 6))
    plt.plot(sizes, times, 'bo-', label='Actual Time')

    # Theoretical O(n^3) curve, normalized to the largest size because
    # timings for tiny matrices are dominated by call overhead
    theoretical = [(n / sizes[-1])**3 * times[-1] for n in sizes]
    plt.plot(sizes, theoretical, 'r--', label='Theoretical O(n³)')

    plt.xlabel('Matrix Size n')
    plt.ylabel('Computation Time (seconds)')
    plt.title('Matrix Multiplication Time Complexity')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()

    print("\nThe standard (row-by-column) algorithm for n×n matrices takes O(n³) time")
    print("Storing the operands and the result takes O(n²) space")

matrix_multiplication_complexity()
```

## Transpose Matrix

### Definition and Properties of Transpose

The transpose $A^T$ is obtained by interchanging the rows and columns of matrix $A$:

$$(A^T)_{ij} = a_{ji}$$

```python
def matrix_transpose():
    """Matrix transpose demonstration"""
    print("\nMatrix Transpose:")
    print("=" * 15)

    A = np.array([[1, 2, 3],
                  [4, 5, 6]])

    print("Original Matrix A:")
    print(A)
    print(f"Shape: {A.shape}")

    A_T = A.T
    print("\nTranspose Matrix A^T:")
    print(A_T)
    print(f"Shape: {A_T.shape}")

    # Verify transpose properties
    print("\nTranspose Properties Verification:")

    # 1. (A^T)^T = A
    print(f"1. (A^T)^T = A: {np.array_equal((A.T).T, A)}")

    # 2. (A + B)^T = A^T + B^T (requires matrices of same size)
    C = np.array([[1, 2, 3], [4, 5, 6]])
    D = np.array([[7, 8, 9], [10, 11, 12]])
    left = (C + D).T
    right = C.T + D.T
    print(f"2. (A + B)^T = A^T + B^T: {np.array_equal(left, right)}")

    # 3. (cA)^T = cA^T
    c = 3
    left = (c * A).T
    right = c * A.T
    print(f"3. (cA)^T = cA^T: {np.array_equal(left, right)}")

    # 4. (AB)^T = B^T A^T
    E = np.array([[1, 2], [3, 4]])
    F = np.array([[5, 6], [7, 8]])
    left = (E @ F).T
    right = F.T @ E.T
    print(f"4. (AB)^T = B^T A^T: {np.array_equal(left, right)}")

    return A, A_T

matrix_transpose()
```

### Symmetric and Antisymmetric Matrices

```python
def symmetric_matrices():
    """Symmetric and antisymmetric matrices"""
    print("\nSymmetric and Antisymmetric Matrices:")
    print("=" * 35)

    # Symmetric matrix: A = A^T
    symmetric = np.array([[1, 2, 3],
                          [2, 4, 5],
                          [3, 5, 6]])
    print("Symmetric Matrix:")
    print(symmetric)
    print(f"Verify symmetry A = A^T: {np.allclose(symmetric, symmetric.T)}")

    # Antisymmetric matrix: A = -A^T
    antisymmetric = np.array([[0, 1, -2],
                              [-1, 0, 3],
                              [2, -3, 0]])
    print("\nAntisymmetric Matrix:")
    print(antisymmetric)
    print(f"Verify antisymmetry A = -A^T: {np.allclose(antisymmetric, -antisymmetric.T)}")

    # Any square matrix can be decomposed into symmetric and antisymmetric parts
    A = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
    symmetric_part = (A + A.T) / 2
    antisymmetric_part = (A - A.T) / 2

    print("\nDecomposition of an Arbitrary Square Matrix:")
    print("Original Matrix A:")
    print(A)
    print("\nSymmetric Part (A + A^T)/2:")
    print(symmetric_part)
    print("\nAntisymmetric Part (A - A^T)/2:")
    print(antisymmetric_part)
    print("\nVerify Decomposition A = Symmetric Part + Antisymmetric Part:")
    print("Reconstructed Matrix:")
    print(symmetric_part + antisymmetric_part)
    print(f"Decomposition Correct: {np.allclose(A, symmetric_part + antisymmetric_part)}")

symmetric_matrices()
```

## Block Matrices

### Concept of Block Matrices

```python
def block_matrices():
    """Block matrix demonstration"""
    print("\nBlock Matrices:")
    print("=" * 15)

    # Create a large matrix, then partition it
    A = np.array([[1, 2, 5, 6],
                  [3, 4, 7, 8],
                  [9, 10, 13, 14],
                  [11, 12, 15, 16]])

    print("Original Matrix A (4×4):")
    print(A)

    # Partition into 2×2 submatrices
    A11 = A[:2, :2]
    A12 = A[:2, 2:]
    A21 = A[2:, :2]
    A22 = A[2:, 2:]

    print("\nBlock Matrices:")
    print(f"A11 = \n{A11}")
    print(f"A12 = \n{A12}")
    print(f"A21 = \n{A21}")
    print(f"A22 = \n{A22}")

    # Verify block representation
    reconstructed = np.block([[A11, A12],
                              [A21, A22]])
    print("\nReconstructed Matrix:")
    print(reconstructed)
    print(f"Reconstruction Correct: {np.array_equal(A, reconstructed)}")

    return A11, A12, A21, A22

block_matrices()
```

### Block Matrix Operations

```python
def block_matrix_operations():
    """Block matrix operations"""
    print("\nBlock Matrix Operations:")
    print("=" * 20)

    # Define two block matrices
    A11 = np.array([[1, 2], [3, 4]])
    A12 = np.array([[5, 6], [7, 8]])
    A21 = np.array([[9, 10], [11, 12]])
    A22 = np.array([[13, 14], [15, 16]])

    B11 = np.array([[1, 0], [0, 1]])
    B12 = np.array([[2, 1], [1, 2]])
    B21 = np.array([[1, 1], [1, 1]])
    B22 = np.array([[3, 2], [2, 3]])

    # Construct full matrices
    A = np.block([[A11, A12], [A21, A22]])
    B = np.block([[B11, B12], [B21, B22]])

    print("Matrix A:")
    print(A)
    print("\nMatrix B:")
    print(B)

    # Block matrix multiplication
    # (A11 A12) × (B11 B12) = (A11B11 + A12B21   A11B12 + A12B22)
    # (A21 A22)   (B21 B22)   (A21B11 + A22B21   A21B12 + A22B22)
    C11_block = A11 @ B11 + A12 @ B21
    C12_block = A11 @ B12 + A12 @ B22
    C21_block = A21 @ B11 + A22 @ B21
    C22_block = A21 @ B12 + A22 @ B22

    C_block = np.block([[C11_block, C12_block],
                        [C21_block, C22_block]])

    # Direct matrix multiplication
    C_direct = A @ B

    print("\nBlock Multiplication Result:")
    print(C_block)
    print("\nDirect Multiplication Result:")
    print(C_direct)
    print(f"\nResults Match: {np.allclose(C_block, C_direct)}")

    # Advantages of block matrices, especially for large sparse matrices
    print("\nAdvantages of Block Matrices:")
    print("- Less memory is needed when a large matrix is processed one block at a time")
    print("- Zero blocks in a sparse structure can be skipped entirely")
    print("- Independent block products can be computed in parallel")

block_matrix_operations()
```

## Geometric Meaning of Matrices

### Matrices as Linear Transformations

```python
def matrix_as_linear_transformation():
    """Geometric meaning of matrices as linear transformations"""
    print("\nMatrices as Linear Transformations:")
    print("=" * 25)

    # Define several common 2D linear transformations
    transformations = {
        "Identity": np.array([[1, 0], [0, 1]]),
        "Horizontal Flip": np.array([[-1, 0], [0, 1]]),
        "Vertical Flip": np.array([[1, 0], [0, -1]]),
        "Rotate 90°": np.array([[0, -1], [1, 0]]),
        "Scaling": np.array([[2, 0], [0, 0.5]]),
        "Shear": np.array([[1, 1], [0, 1]]),
        "Projection to x-axis": np.array([[1, 0], [0, 0]])
    }

    # Original points: vertices of the unit square, first vertex repeated to close the outline
    original_points = np.array([[0, 1, 1, 0, 0],
                                [0, 0, 1, 1, 0]])

    fig, axes = plt.subplots(2, 4, figsize=(16, 8))
    axes = axes.flatten()

    for i, (name, transform) in enumerate(transformations.items()):
        if i >= len(axes):
            break

        # Apply transformation
        transformed_points = transform @ original_points

        axes[i].plot(original_points[0], original_points[1], 'b-o', label='Original', alpha=0.7)
        axes[i].plot(transformed_points[0], transformed_points[1], 'r-s', label='Transformed')
        axes[i].set_xlim(-2.5, 2.5)
        axes[i].set_ylim(-2.5, 2.5)
        axes[i].grid(True, alpha=0.3)
        axes[i].set_aspect('equal')
        axes[i].set_title(name)
        axes[i].legend()

        print(f"{name} Matrix:")
        print(transform)
        print()

    # Hide the unused subplot (7 transformations on a 2×4 grid)
    for ax in axes[len(transformations):]:
        ax.axis('off')

    plt.tight_layout()
    plt.show()

matrix_as_linear_transformation()
```

### Column Space and Row Space

```python
def column_and_row_spaces():
    """Column space and row space of matrices"""
    print("\nColumn Space and Row Space of Matrices:")
    print("=" * 30)

    A = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

    print("Matrix A:")
    print(A)

    # Column space: all linear combinations of the matrix columns
    print("\nColumn Vectors:")
    for i in range(A.shape[1]):
        print(f"Column {i+1}: {A[:, i]}")

    # Row space: all linear combinations of the matrix rows
    print("\nRow Vectors:")
    for i in range(A.shape[0]):
        print(f"Row {i+1}: {A[i, :]}")

    # Calculate matrix rank (row rank always equals column rank)
    rank = np.linalg.matrix_rank(A)
    print(f"\nMatrix Rank: {rank}")
    print(f"Column Space Dimension: {rank}")
    print(f"Row Space Dimension: {rank}")

    # For this particular matrix, row 3 = 2×row 2 - row 1
    row3_check = 2 * A[1, :] - A[0, :]
    print("\nLinear Dependence Check:")
    print(f"2×Row 2 - Row 1 = {row3_check}")
    print(f"Row 3 = {A[2, :]}")
    print(f"Row 3 is Linearly Dependent: {np.allclose(row3_check, A[2, :])}")

column_and_row_spaces()
```

## Chapter Summary

```mermaid
graph TD
    A[Fundamental Matrix Theory] --> B[Matrix Definition]
    A --> C[Basic Operations]
    A --> D[Important Properties]
    A --> E[Geometric Meaning]
    B --> F[Rectangular Array]
    B --> G[Special Matrices]
    B --> H[Matrix Representation]
    C --> I[Matrix Addition]
    C --> J[Scalar Multiplication]
    C --> K[Matrix Multiplication]
    C --> L[Transpose Operation]
    D --> M[Operation Laws]
    D --> N[Multiplication Properties]
    D --> O[Transpose Properties]
    E --> P[Linear Transformation]
    E --> Q[Geometric Transformation]
    E --> R[Space Mapping]
    A --> S[Block Matrices]
    S --> T[Block Operations]
    S --> U[Computational Optimization]
```

This chapter systematically covers fundamental matrix theory and operations:

| Concept | Core Content | Important Properties | Application Scenarios |
|---------|--------------|----------------------|----------------------|
| Matrix Addition | Add corresponding elements | Commutativity, Associativity | Vector space operations |
| Scalar Multiplication | Multiply each element by scalar | Distributivity, Associativity | Linear combinations |
| Matrix Multiplication | Row-column inner product | Associativity, Distributivity | Composition of linear transformations |
| Transpose | Interchange rows and columns | $(AB)^T = B^TA^T$ | Symmetry analysis |
| Block Matrices | Submatrix operations | Block operation rules | Large-scale computation |

::: tip Key Understanding
- Matrices are not just arrangements of numbers, but representations of linear transformations
- Matrix multiplication corresponds to composition of linear transformations
- The transpose swaps the roles of rows and columns: the row space of $A$ is the column space of $A^T$
- Block matrices provide an effective method for handling large matrices
:::

::: warning Important Notes
- Matrix multiplication is not commutative: $AB \neq BA$ (in general)
- Matrix multiplication dimension requirements: $(m \times p) \times (p \times n) = (m \times n)$
- Transpose operation order: $(AB)^T = B^TA^T$ (note the reversal of order)
:::

This chapter covered the basic matrix operations, laying a foundation for the subsequent study of linear systems, determinants, linear transformations, and more. Matrix theory is a core tool in linear algebra, with wide applications in data science, machine learning, engineering computing, and other fields.
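
The notes above on composition and non-commutativity are easiest to check with square matrices, where both $AB$ and $BA$ are defined and have the same shape. The following is a minimal sketch, assuming NumPy is available; the helper name `compose_demo` and the particular rotation and scaling matrices are illustrative choices, not part of the chapter's own code.

```python
import numpy as np

def compose_demo():
    """Compare 'rotate then scale' with 'scale then rotate' (illustrative helper)."""
    theta = np.pi / 2  # 90-degree counterclockwise rotation
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.array([[2.0, 0.0],
                  [0.0, 1.0]])  # stretch the x-axis by a factor of 2
    v = np.array([1.0, 0.0])

    # Applying R first and S second is the same as multiplying by the product S @ R
    step_by_step = S @ (R @ v)
    via_product = (S @ R) @ v

    # Swapping the order of the factors gives a different transformation
    other_order = (R @ S) @ v
    return step_by_step, via_product, other_order

step_by_step, via_product, other_order = compose_demo()
print("S(Rv) =", np.round(step_by_step, 6))
print("(SR)v =", np.round(via_product, 6))
print("(RS)v =", np.round(other_order, 6))
print("Same result in both orders:", np.allclose(via_product, other_order))
```

Rotating the unit vector along the x-axis first moves it onto the y-axis, where the x-stretch has no effect; stretching first doubles its length before the rotation. The two orders therefore end at different points, which is the geometric content of $AB \neq BA$.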