matrix notation
Created: November 12, 2013
Modified: March 16, 2022


This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Notation for Matrix Multiplication

Let $A = (a_{ij})$ and $B = (b_{jk})$. Then

$$(AB)_{ik} = \sum_j a_{ij} b_{jk}$$

just by the definition of matrix multiplication (the summation over $j$ is performing the dot product of the $i$th row of $A$ with the $k$th column of $B$). Furthermore, if we have $C = (c_{kl})$, then

$$\begin{align} (ABC)_{il} &= \sum_k (AB)_{ik} c_{kl}\\ &= \sum_k \left(\sum_j a_{ij} b_{jk} \right) c_{kl}\\ &= \sum_{j,k} a_{ij} b_{jk} c_{kl} \end{align}$$

and it's easy to see by induction how this pattern generalizes: we can write a product of matrices as a sum over the product of their entries, where the sum is taken over all of the "inner" indices.
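
This contraction pattern is exactly what `numpy.einsum` computes. As a quick sanity check, here is a minimal sketch (shapes are arbitrary choices) comparing ordinary matrix multiplication against the explicit sum over the inner indices $j, k$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))  # entries a_ij
B = rng.standard_normal((3, 4))  # entries b_jk
C = rng.standard_normal((4, 5))  # entries c_kl

# Ordinary matrix multiplication...
prod = A @ B @ C

# ...versus the explicit contraction (ABC)_il = sum_{j,k} a_ij b_jk c_kl,
# where the "inner" indices j and k are summed away.
expanded = np.einsum("ij,jk,kl->il", A, B, C)

assert np.allclose(prod, expanded)
```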

Function Composition

Say we have matrices $A: \mathbb{R}^v \to \mathbb{R}^w$ and $B: \mathbb{R}^w \to \mathbb{R}^s$. We can decompose

$$A = \sum_{ij} w_j v_i^T$$
$$B = \sum_{k\ell} s_\ell \hat{w}_k^T$$

for some sets of vectors $(v), (w), (\hat{w}), (s)$ that exist by the isomorphism between $\text{Hom}$ and tensor products (i.e., the vectors $(v), (w)$ correspond to the pure tensor decomposition of $A$, and similarly for $B$). Then we can write the composite map $BA: \mathbb{R}^v \to \mathbb{R}^s$ as

$$BA = \sum_{i\ell} s_\ell \left(\sum_{jk} \hat{w}_k^T w_j\right) v_i^T$$

where $\sum_{jk} \hat{w}_k^T w_j$ is the trace of the matrix $W = \sum_{jk} w_j \hat{w}_k^T$. That last fact follows from the general relation

$$x^T y = \text{tr}(yx^T)$$

which holds since the $(i,j)$th entry of $yx^T$ is $y_i x_j$, so the sum of the diagonal is the sum over $i$ of $y_i x_i$, which is exactly the inner product. This shows very cleanly a relationship between outer and inner products by way of the trace. We then used this to express the composition of $A, B$ in terms of the trace of an (implicit) operator $W$ on the in-between space $\mathbb{R}^w$.
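
Both facts are easy to check numerically. Here is a NumPy sketch (all dimensions and vector counts are arbitrary choices) verifying first the trace identity and then the composite-map formula, with the double-indexed decompositions used above:

```python
import numpy as np

rng = np.random.default_rng(1)

# The trace identity: x^T y = tr(y x^T).
x, y = rng.standard_normal(5), rng.standard_normal(5)
assert np.isclose(x @ y, np.trace(np.outer(y, x)))

# Double-indexed decompositions, with v = 4, w = 3, s = 2.
vs = rng.standard_normal((2, 4))     # rows are the v_i in R^v
ws = rng.standard_normal((3, 3))     # rows are the w_j in R^w
whats = rng.standard_normal((2, 3))  # rows are the w-hat_k in R^w
ss = rng.standard_normal((3, 2))     # rows are the s_l in R^s

A = np.einsum("ja,ib->ab", ws, vs)     # A = sum_ij w_j v_i^T      (w x v)
B = np.einsum("la,kb->ab", ss, whats)  # B = sum_kl s_l w-hat_k^T  (s x w)

W = np.einsum("ja,kb->ab", ws, whats)   # W = sum_jk w_j w-hat_k^T
inner = np.einsum("ka,ja->", whats, ws) # sum_jk w-hat_k^T w_j ...
assert np.isclose(inner, np.trace(W))   # ... equals tr(W)

# BA = sum_il s_l (sum_jk w-hat_k^T w_j) v_i^T
BA = inner * np.einsum("la,ib->ab", ss, vs)
assert np.allclose(B @ A, BA)
```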

Vector/Matrix Notation

It seems like a good notational convention in general to think of ${}^T$ as equivalent to ${}^*$, i.e., to transpose a vector is to move from thinking of it as a (column) vector to thinking of it as a linear functional, expressed as a row vector. So when we write design matrices $X$, it makes sense to think of the data points as columns, since they are explicitly vectors rather than functionals. Then $X^TX$ has the nice interpretation as a matrix of dot products (generalized from ordinary vectors), and the covariance $XX^T$ similarly has an interpretation in terms of outer products.
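
A small NumPy sketch of this convention (shapes arbitrary): with the $n$ data points stored as columns of a $d \times n$ matrix $X$, the entries of $X^TX$ are pairwise dot products, and $XX^T$ is a sum of outer products (the covariance up to centering and normalization):

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 3, 5
X = rng.standard_normal((d, n))  # column j is the j-th data point

# (X^T X)_{ij} = x_i . x_j: the Gram matrix of dot products.
gram = X.T @ X
assert np.isclose(gram[1, 2], X[:, 1] @ X[:, 2])

# X X^T = sum_j x_j x_j^T: a sum of outer products (centering the
# columns and dividing by n would give the usual covariance matrix).
scatter = X @ X.T
assert np.allclose(scatter, sum(np.outer(X[:, j], X[:, j]) for j in range(n)))
```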