Common Matrix Derivative Formulas and Their Derivations
1. Derivative of a scalar quadratic form
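$$ \frac{\partial x^T A x}{\partial x} = (A + A^T)x $$
Derivation: write the quadratic form in components,
$$ x^T A x = \sum_{i} \sum_{j} x_i A_{ij} x_j $$
The variable \( x_k \) appears once as the left factor (\( i = k \)) and once as the right factor (\( j = k \)), so
$$ \frac{\partial (x^T A x)}{\partial x_k} = \sum_{j} A_{kj} x_j + \sum_{i} A_{ik} x_i = \sum_{i} (A + A^T)_{ki} x_i $$
Stacking the components over \( k \) gives
$$ \frac{\partial x^T A x}{\partial x} = (A + A^T)x $$
2. Derivative of a linear term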
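$$ \frac{\partial a^T x}{\partial x} = a $$
Derivation: in components,
$$ a^T x = \sum_{i} a_i x_i $$
Only the term with \( i = k \) depends on \( x_k \):
$$ \frac{\partial}{\partial x_k} \sum_{i} a_i x_i = a_k $$
$$ \frac{\partial a^T x}{\partial x} = a $$
3. Derivative of a matrix-vector product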
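$$ \frac{\partial Ax}{\partial x} = A $$
Derivation: the \( i \)-th component of \( Ax \) is
$$ (Ax)_i = \sum_{j} A_{ij} x_j $$
$$ \frac{\partial (Ax)_i}{\partial x_k} = \frac{\partial}{\partial x_k} \sum_{j} A_{ij} x_j = A_{ik} $$
Arranging these entries with row index \( i \) and column index \( k \) (the Jacobian convention) gives
$$ \frac{\partial Ax}{\partial x} = A $$
(If the derivative is instead arranged with row index \( k \), matching the column-gradient convention of formulas 1 and 2, the same result is written \( A^T \).)
4. Derivative of a shifted quadratic form (symmetric matrix)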
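$$ \frac{\partial (x - b)^T A (x - b)}{\partial x} = 2A(x - b) $$
(assuming \( A \) is symmetric)
Derivation: expand the product and use \( x^T A b = b^T A x \), which holds because \( A = A^T \):
$$ (x - b)^T A (x - b) = x^T A x - 2 b^T A x + b^T A b $$
Apply formula 1 to the first term and formula 2 (with \( a = A^T b = A b \)) to the second; the last term does not depend on \( x \):
$$ \frac{\partial}{\partial x} \left( x^T A x - 2 b^T A x + b^T A b \right) = 2 A x - 2 A b $$
$$ \frac{\partial (x - b)^T A (x - b)}{\partial x} = 2A(x - b) $$
5. Derivative of the log-determinant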
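$$ \frac{\partial \ln |X|}{\partial X} = (X^{-1})^T $$
(for invertible \( X \) with \( |X| > 0 \); the right-hand side equals \( X^{-1} \) when \( X \) is symmetric)
Derivation: for symmetric positive definite \( X \),
$$ \ln |X| = \text{tr}(\ln X) $$
and, for any invertible \( X \), the differential of the log-determinant is
$$ d \ln |X| = \text{tr}(X^{-1} \, dX) = \sum_{k} \sum_{l} (X^{-1})_{lk} \, dX_{kl} $$
Reading off the coefficient of \( dX_{kl} \):
$$ \frac{\partial \ln |X|}{\partial X_{kl}} = (X^{-1})_{lk} $$
$$ \frac{\partial \ln |X|}{\partial X} = (X^{-1})^T $$
6. Derivative of a trace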
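$$ \frac{\partial \text{tr}(AX)}{\partial X} = A^T $$
Derivation: in components,
$$ \text{tr}(AX) = \sum_{i} \sum_{j} A_{ij} X_{ji} $$
Only the term with \( j = k \), \( i = l \) depends on \( X_{kl} \):
$$ \frac{\partial}{\partial X_{kl}} \sum_{i} \sum_{j} A_{ij} X_{ji} = A_{lk} $$
The \( (k, l) \) entry of the derivative is therefore \( A_{lk} = (A^T)_{kl} \):
$$ \frac{\partial \text{tr}(AX)}{\partial X} = A^T $$
7. Derivative of the matrix inverse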
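$$ \frac{\partial \, \text{vec}(X^{-1})}{\partial \, \text{vec}(X)^T} = -(X^{-1})^T \otimes X^{-1} $$
(with column-stacking \( \text{vec} \); for symmetric \( X \) this reduces to the commonly quoted form \( -X^{-1} \otimes X^{-1} \))
Derivation: start from
$$ X X^{-1} = I $$
Differentiate both sides with the product rule; the right-hand side is constant:
$$ (dX) X^{-1} + X \, d(X^{-1}) = 0 \Rightarrow d(X^{-1}) = -X^{-1} (dX) X^{-1} $$
Applying the identity \( \text{vec}(A Y B) = (B^T \otimes A) \, \text{vec}(Y) \) with \( A = B = X^{-1} \) and \( Y = dX \):
$$ \text{vec}(d(X^{-1})) = -\left( (X^{-1})^T \otimes X^{-1} \right) \text{vec}(dX) $$
These are the common matrix derivative formulas together with brief derivations.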
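
The formulas above can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivations: it assumes NumPy, and the helper names `numerical_gradient` and `max_err`, the random test matrices, and the step sizes are all illustrative choices. Each analytic expression is compared against a central finite difference. The log-determinant check compares against the transpose of \( X^{-1} \) (which coincides with \( X^{-1} \) when \( X \) is symmetric), and the inverse is checked through its differential \( d(X^{-1}) = -X^{-1} (dX) X^{-1} \) and the corresponding Kronecker form.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference derivative of a scalar function f.

    Entry idx of the result approximates d f / d x[idx], so the result
    has the same shape as x (works for vectors and matrices).
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

def max_err(u, v):
    """Largest absolute entrywise difference between two arrays."""
    return float(np.max(np.abs(u - v)))

# Illustrative test data (assumed, not from the text).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))        # general square matrix
S = A + A.T                            # symmetric matrix for formula 4
a, b, x = rng.standard_normal((3, n))  # three random vectors
# Non-symmetric, well-conditioned matrix so the transpose in formula 5 matters.
X = n * np.eye(n) + 0.3 * rng.standard_normal((n, n))
X_inv = np.linalg.inv(X)

errors = {
    "1 quadratic form": max_err(
        numerical_gradient(lambda v: v @ A @ v, x), (A + A.T) @ x),
    "2 linear term": max_err(
        numerical_gradient(lambda v: a @ v, x), a),
    # Row i of the Jacobian is the gradient of the i-th component of A x.
    "3 matrix-vector product": max_err(
        np.stack([numerical_gradient(lambda v, i=i: (A @ v)[i], x)
                  for i in range(n)]), A),
    "4 shifted quadratic form": max_err(
        numerical_gradient(lambda v: (v - b) @ S @ (v - b), x),
        2 * S @ (x - b)),
    # slogdet returns (sign, ln|det|); index 1 is the log of |det X|.
    "5 log-determinant": max_err(
        numerical_gradient(lambda M: np.linalg.slogdet(M)[1], X), X_inv.T),
    "6 trace": max_err(
        numerical_gradient(lambda M: np.trace(A @ M), X), A.T),
}

# Formula 7: directional finite difference of the inverse along dX.
dX = rng.standard_normal((n, n))
h = 1e-6
fd = (np.linalg.inv(X + h * dX) - np.linalg.inv(X - h * dX)) / (2 * h)
errors["7 inverse (differential)"] = max_err(fd, -X_inv @ dX @ X_inv)

def vec(M):
    """Column-stacking vectorization."""
    return M.flatten(order="F")

errors["7 inverse (Kronecker)"] = max_err(
    vec(fd), -np.kron(X_inv.T, X_inv) @ vec(dX))

for name, err in errors.items():
    print(f"{name}: max abs error = {err:.2e}")
```

Every reported error should be small relative to the size of the matrix entries if the formulas hold. Replacing `X_inv.T` with `X_inv` in the log-determinant check is a quick way to see that the transpose matters for a non-symmetric \( X \).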