Modified: July 18, 2022
tensor product
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

The tensor product $V \otimes W$ of two vector spaces $V$ and $W$ (defined on the same scalar field, we'll assume $\mathbb{R}$) is the vector space of formal sums of formal pairs of vectors, where the formal sums are defined bilinearly. We call the elements of this space tensors, and will write them interchangeably as $(v, w)$ or $v \otimes w$. That is, if we have $v_1, v_2 \in V$ and $w_1, w_2 \in W$, so that the tensor space contains $v_1 \otimes w_1$, $v_1 \otimes w_2$, $v_2 \otimes w_1$, and $v_2 \otimes w_2$,
then we can add elements with the same left component or same right component:
$$v_1 \otimes w_1 + v_1 \otimes w_2 = v_1 \otimes (w_1 + w_2), \qquad v_1 \otimes w_1 + v_2 \otimes w_1 = (v_1 + v_2) \otimes w_1,$$
but if both components are different, the result is only a formal sum, and doesn't simplify:
$$v_1 \otimes w_1 + v_2 \otimes w_2$$
cannot be combined into a single pair.
We call this limited addition bilinearity. This is in contrast to the direct sum of $V$ and $W$, in which we would allow componentwise addition. Note that a tensor product of sums behaves very similarly to an ordinary product of sums, in that it expands out in a FOIL-type operation:
$$(v_1 + v_2) \otimes (w_1 + w_2) = v_1 \otimes w_1 + v_1 \otimes w_2 + v_2 \otimes w_1 + v_2 \otimes w_2.$$
We can also interchange scalar multipliers between the two sides (switching here to use $\otimes$ to indicate the tensor product):
$$(a v) \otimes w = a (v \otimes w) = v \otimes (a w).$$
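As a quick numerical sanity check (a sketch, not part of the formal development), we can identify $v \otimes w$ with the coordinate vector `np.kron(v, w)` and verify the FOIL expansion and scalar interchange:

```python
import numpy as np

rng = np.random.default_rng(0)
v1, v2 = rng.normal(size=3), rng.normal(size=3)
w1, w2 = rng.normal(size=4), rng.normal(size=4)
a = 2.5

# FOIL: (v1 + v2) (x) (w1 + w2) expands into four terms.
lhs = np.kron(v1 + v2, w1 + w2)
rhs = (np.kron(v1, w1) + np.kron(v1, w2)
       + np.kron(v2, w1) + np.kron(v2, w2))
assert np.allclose(lhs, rhs)

# Scalars slide between the two sides: (a v) (x) w = a (v (x) w) = v (x) (a w).
assert np.allclose(np.kron(a * v1, w1), a * np.kron(v1, w1))
assert np.allclose(np.kron(v1, a * w1), a * np.kron(v1, w1))
```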
We can view the tensor product operator $\otimes$ as a bilinear (multilinear in the general case) map from the direct product space into the formal space of tensor products:
$$\otimes : V \times W \to V \otimes W.$$
Tensors that fall in the range of $\otimes$, i.e., those that can be written as $v \otimes w$, are called pure tensors. Note that the sum of pure tensors is not in general a pure tensor, e.g., we cannot in general simplify a sum $v_1 \otimes w_1 + v_2 \otimes w_2$ to $(v_1 + v_2) \otimes (w_1 + w_2)$ or any other pure tensor.
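One concrete way to see this (a sketch, leaning on the matrix representation developed below): identify $v \otimes w$ with the outer-product matrix `np.outer(v, w)`. Pure tensors then correspond to matrices of rank at most one, and a generic sum of two pure tensors has rank two, so it cannot be pure:

```python
import numpy as np

rng = np.random.default_rng(1)
v1, w1 = rng.normal(size=3), rng.normal(size=3)
v2, w2 = rng.normal(size=3), rng.normal(size=3)

pure = np.outer(v1, w1)                       # a pure tensor: rank 1
mixed = np.outer(v1, w1) + np.outer(v2, w2)   # formal sum of two pure tensors

assert np.linalg.matrix_rank(pure) == 1
assert np.linalg.matrix_rank(mixed) == 2  # rank 2, so not a single outer product
```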
Multilinear maps
Tensors are the "gatekeepers" of multilinear maps. Formally, they satisfy the universal property: any bilinear map $f : V \times W \to Z$ can be written as a composition that "goes through" the tensor product space. That is, for any such $f$, there is a unique linear map $\tilde{f} : V \otimes W \to Z$ such that
$$f(v, w) = \tilde{f}(v \otimes w)$$
for all vectors $v \in V$ and $w \in W$. This makes tensors something analogous to a sufficient statistic for multilinear maps.
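For intuition, here's a hedged numerical sketch of the universal property with $V = \mathbb{R}^n$, $W = \mathbb{R}^m$, $Z = \mathbb{R}$: every bilinear form $f(v, w) = v^T M w$ factors through the tensor product, with $\tilde{f}$ given by the flattened matrix $\mathrm{vec}(M)$ acting on `np.kron(v, w)`:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 4
M = rng.normal(size=(n, m))   # an arbitrary bilinear form f(v, w) = v^T M w
v, w = rng.normal(size=n), rng.normal(size=m)

f_vw = v @ M @ w                 # the bilinear map, applied directly
f_tilde = M.reshape(-1)          # the induced *linear* map on the tensor space
assert np.allclose(f_vw, f_tilde @ np.kron(v, w))
```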
Isomorphism with linear transformations
(see also tensor for an independently written version of this argument)
Consider the tensor product $V^* \otimes V$ of a space $V$ with its dual space $V^*$, i.e., the space of linear functionals on $V$. We will think of elements of $V$ as column vectors, and elements of $V^*$ as row vectors $u^T$, so we can represent the application of a functional to a vector by $u^T v$. Note that under this notation we have $u^T v = u \cdot v$, the ordinary dot product.
Call an element of $V^* \otimes V$ a pure tensor if it can be written as $u^T \otimes v$, i.e., as a single formal pair of functional and vector, as opposed to a sum of several such pairs (recall from above that a formal sum will not necessarily simplify to a pure element). Clearly we can write any element of the tensor space as a sum of pure elements. An obvious thing to do with a pure tensor is to apply $u^T$ to $v$, yielding a real number. Generalized to an arbitrary tensor in the obvious way, by taking the sum of the results from the pure elements, this is called the evaluation map or trace map. We will see later why it has this name. For now, just think of the trace map as having the flavor of an "inner product", since it is literally the dot product of the vectors $u$ and $v$.
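In code (a minimal sketch, representing a tensor in $V^* \otimes V$ as a list of (functional, vector) pairs), the evaluation map is just a sum of dot products:

```python
import numpy as np

def evaluation(pairs):
    """Evaluation / trace map: sum of u . v over the pure tensors u^T (x) v."""
    return sum(np.dot(u, v) for u, v in pairs)

rng = np.random.default_rng(3)
tensor = [(rng.normal(size=3), rng.normal(size=3)) for _ in range(4)]
print(evaluation(tensor))  # a single real number
```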
For the moment, we're going to move in the other direction and think instead about outer products. Recall that the outer product of two $n$-dimensional column vectors $u$, $v$ is defined as the $n \times n$ matrix $u v^T$ given by $(u v^T)_{ij} = u_i v_j$. One way to view the outer product is as giving a bilinear map from pairs of vectors in $\mathbb{R}^n$ to linear transformations on $\mathbb{R}^n$. Let's explore this using the machinery of tensor products. In particular, we'll show that the tensor space $V^* \otimes V$ is isomorphic to the space of linear transformations on $V$, also known as the endomorphisms of $V$, written $\mathrm{End}(V)$.
To see this, we'll start by defining a linear operator corresponding to the pure tensor $u^T \otimes v$. Formally this map from tensors to linear operators will be called the coevaluation map. For any vector $x$, let $(u^T \otimes v)(x) = (u^T x)\, v$. That is, we apply $u^T$ to $x$, yielding a scalar, and then return the vector $v$ scaled by that quantity. Note that this is a rank-one map, since it only returns vectors in the one-dimensional subspace spanned by $v$. It should be obvious that this is a linear operator.
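Concretely (a sketch): in coordinates, the operator $x \mapsto (u^T x)\, v$ is exactly multiplication by the outer-product matrix $v u^T$, which has rank one:

```python
import numpy as np

rng = np.random.default_rng(4)
u, v, x = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)

op = np.outer(v, u)                            # matrix form of the pure tensor u^T (x) v
assert np.allclose(op @ x, np.dot(u, x) * v)   # apply u^T to x, then scale v
assert np.linalg.matrix_rank(op) == 1          # rank one: the image is span{v}
```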
Next, let's generalize to arbitrary tensors. For a tensor represented as a sum of pure tensors, we define the resulting linear operator to be the sum of the linear operators generated by the pure tensors. In particular, let $\{e_i\}$ be a basis for $V$ (with a corresponding dual basis $\{e_i^T\}$ for $V^*$); for intuition, we'll imagine $V = \mathbb{R}^n$ with the coordinate basis $e_1, \ldots, e_n$. Then we can represent a tensor as the sum of pure tensors given by basis elements, i.e.,
$$A = \sum_{i,j} a_{ij}\; e_j^T \otimes e_i.$$
This follows because a tensor is a formal sum of pure tensors, and each pure tensor can be decomposed into basis elements. Now consider the linear operator defined by $A$ under our construction above. For any vector $x$, the $(i,j)$ term of the linear operator picks out the $j$th coordinate of $x$'s representation, and returns $e_i$ scaled by that coordinate, multiplied by $a_{ij}$. This is equivalent to multiplication of $x$ by the matrix containing all zeros except for its $(i,j)$th entry, which contains $a_{ij}$. Thus we see that the linear operator defined by $A$ is just the matrix with entries $a_{ij}$. This means we can generate any matrix just by constructing the tensor with the appropriate coefficients. Thus we have a bijection between tensors and linear operators: for any tensor we can generate a matrix $A$, and for any matrix $A$ we can recover the corresponding tensor. It is not hard to see that this bijection is actually an isomorphism. This establishes the general theorem that $V^* \otimes V \cong \mathrm{End}(V)$.
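A quick numerical version of this bijection (a sketch): read a matrix's entries off as coefficients against the pure basis tensors $e_j^T \otimes e_i$, rebuild the operator as a sum of outer products, and recover the original matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.normal(size=(n, n))
eye = np.eye(n)

# A = sum_ij a_ij e_j^T (x) e_i, where each basis tensor acts as the
# matrix outer(e_i, e_j): all zeros except a 1 in entry (i, j).
rebuilt = sum(A[i, j] * np.outer(eye[i], eye[j])
              for i in range(n) for j in range(n))
assert np.allclose(rebuilt, A)
```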
Now we can motivate the name of the trace map as given above. Consider writing a matrix $A$ in its tensor form, as a linear combination of pure basis tensors $e_j^T \otimes e_i$. Now apply the trace map to this tensor. All elements of the form $e_j^T \otimes e_i$ for $i \neq j$ disappear (they evaluate to $e_j^T e_i = 0$), and we're left with just the "diagonal" elements $e_i^T \otimes e_i$, which evaluate to 1. So the trace map returns the sum of their coefficients, $\sum_i a_{ii}$, which is exactly the trace of $A$.
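And the punchline, numerically (a sketch): if a matrix is written as a sum of pure tensors $u_k^T \otimes v_k$, i.e. $A = \sum_k v_k u_k^T$, then the evaluation map $\sum_k u_k \cdot v_k$ agrees with $\mathrm{tr}(A)$:

```python
import numpy as np

rng = np.random.default_rng(6)
pairs = [(rng.normal(size=3), rng.normal(size=3)) for _ in range(5)]

A = sum(np.outer(v, u) for u, v in pairs)         # operator for sum_k u_k^T (x) v_k
evaluation = sum(np.dot(u, v) for u, v in pairs)  # the evaluation / trace map
assert np.allclose(evaluation, np.trace(A))
```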
The big, overarching point of all of this machinery is to allow us to define operations on vector spaces (e.g. outer products) without needing to choose a basis.
Some other quick facts:
- In general, given two vector spaces $V$ and $W$ of dimensions $n$ and $m$ respectively, the tensor product space $V \otimes W$ has dimension $nm$. This is a consequence of the isomorphism with linear maps ($V \otimes W \cong \mathrm{Hom}(V^*, W)$), and thus with $n \times m$ matrices.
- Weirdly, there are lots of zeros in a tensor product space. In particular, $v \otimes 0 = 0 \otimes w = 0$ for any $v$ and $w$, since by bilinearity we have
$$v \otimes 0 = v \otimes (0 \cdot w) = 0\,(v \otimes w) = 0.$$
The general technique to show an element is not zero is to construct a linear transformation from the tensor space to some nicer space, say $\mathbb{R}$. For example, we can show $e_1^T \otimes e_1 \neq 0$ by applying the trace map, which is a linear map taking $e_1^T \otimes e_1 \mapsto 1 \neq 0$.
- The outer product / coevaluation map has a "lambda calculus" sort of flavor, in that it has a "bare functional" hanging off of its backend waiting to be evaluated. In general we can do a kind of "currying": for spaces $U$, $V$, $W$,
$$\mathrm{Hom}(U \otimes V,\, W) \cong \mathrm{Hom}(U,\, \mathrm{Hom}(V, W))$$
(see the sketch after this list).
- A matrix $A$ with a full orthonormal set of eigenvectors $v_1, \ldots, v_n$ (e.g., a symmetric matrix) can be represented as the sum of outer products of eigenvectors, weighted by its eigenvalues:
$$A = \sum_i \lambda_i\, v_i v_i^T.$$
This doesn't need any of the machinery developed above to prove. It's easy to see that the linear transformation given by the right side agrees with $A$ on any eigenvector $v_j$ (i.e., it returns $\lambda_j v_j$, using orthonormality $v_i^T v_j = \delta_{ij}$). Since we have two linear operators that agree in their actions on a full set of basis elements, they must be the same operator. (Both this fact and the currying fact are checked numerically in the sketch below.)
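Here is a hedged numerical sketch of the last two facts. The "currying" isomorphism corresponds to nothing more than a reshape: a matrix that consumes `np.kron(u, v)` is the same data as a map that consumes $u$ and returns a matrix consuming $v$. The eigendecomposition check uses a symmetric matrix so that the eigenvectors are orthonormal:

```python
import numpy as np

rng = np.random.default_rng(7)

# -- "Currying": Hom(U (x) V, W) ~ Hom(U, Hom(V, W)), via a reshape.
dU, dV, dW = 2, 3, 4
F = rng.normal(size=(dW, dU * dV))     # a linear map U (x) V -> W
u, v = rng.normal(size=dU), rng.normal(size=dV)

uncurried = F @ np.kron(u, v)          # apply F to the pure tensor u (x) v
curried = F.reshape(dW, dU, dV)        # same data, viewed as U -> Hom(V, W)
partial = np.einsum('wuv,u->wv', curried, u)   # feed in u; get a map V -> W
assert np.allclose(uncurried, partial @ v)

# -- Spectral decomposition: A = sum_i lambda_i v_i v_i^T for symmetric A.
B = rng.normal(size=(5, 5))
A = B + B.T                            # symmetric, so eigenvectors are orthonormal
lams, vecs = np.linalg.eigh(A)         # columns of vecs are eigenvectors
rebuilt = sum(lam * np.outer(vec, vec) for lam, vec in zip(lams, vecs.T))
assert np.allclose(rebuilt, A)
```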