tensor: Nonlinear Function
Created: March 10, 2022
Modified: July 18, 2022


This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

Everyone in machine learning talks about tensors, but no one really understands what they are. This page collects several definitions and attempts to shed light on the underlying connections.

Multidimensional array (the 'TensorFlow' definition): A tensor is an array of values indexed by some fixed number of integer coordinates. For example, an array indexed by one coordinate is a vector, and an array indexed by two coordinates is a matrix; both are special cases of tensors.
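
To make this concrete, here is a minimal numpy sketch (the example and variable names are mine, not part of the original definition) of arrays of increasing order:

```python
import numpy as np

scalar = np.float64(3.0)        # order 0: no coordinates needed
vector = np.zeros(4)            # order 1: indexed by one coordinate
matrix = np.zeros((4, 5))       # order 2: indexed by two coordinates
tensor3 = np.zeros((4, 5, 6))   # order 3: indexed by three coordinates

# The 'order' of the tensor is the number of coordinates needed to
# index a single value, i.e., the length of the shape tuple.
for t in (vector, matrix, tensor3):
    print(t.ndim, t.shape)
```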

Multilinear map: A tensor is a map $V \times W \times \ldots \to \mathbb{R}$ --- where $V$, $W$, etc. are vector spaces --- that is linear in each of its arguments.

This definition is closely related to the previous one, because any multidimensional array defines a multilinear map through the process of tensor 'contraction'. For example, a matrix $M$ defines the map

$$f_M(\mathbf{v}, \mathbf{w}) = \mathbf{v}^T M \mathbf{w} = \sum_{i,j} M_{ij} v_i w_j,$$

a three-dimensional array $(A)_{ijk}$ defines the map

$$f_A(\mathbf{v}, \mathbf{w}, \mathbf{x}) = \sum_{i,j,k} A_{ijk} v_i w_j x_k,$$

and so on. Conversely, any multilinear map can be represented as a multidimensional array for a particular choice of basis.
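
As a sanity check on the contraction picture, here is a small numpy sketch (illustrative, not from the original notes; the `einsum` index strings spell out the sums above) that builds $f_M$ and $f_A$ and verifies linearity in one argument:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 4))       # matrix: defines a bilinear map f_M(v, w)
A = rng.normal(size=(3, 4, 5))    # 3-d array: defines a trilinear map f_A(v, w, x)

def f_M(v, w):
    return np.einsum('ij,i,j->', M, v, w)       # sum_{i,j} M_ij v_i w_j

def f_A(v, w, x):
    return np.einsum('ijk,i,j,k->', A, v, w, x) # sum_{i,j,k} A_ijk v_i w_j x_k

v, v2 = rng.normal(size=3), rng.normal(size=3)
w, x = rng.normal(size=4), rng.normal(size=5)

# Linearity in the first argument:
# f_A(2v + 3v', w, x) = 2 f_A(v, w, x) + 3 f_A(v', w, x)
lhs = f_A(2.0 * v + 3.0 * v2, w, x)
rhs = 2.0 * f_A(v, w, x) + 3.0 * f_A(v2, w, x)
assert np.isclose(lhs, rhs)
print(f_M(v, w), lhs, rhs)
```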

Elements of a tensor product space: given two vector spaces $V$ and $W$, their tensor product $V \otimes W$ is the space of formal pairs of vectors $v \otimes w$ and their sums, up to bilinearity (multilinearity in the general case where we have $>2$ vector spaces). A tensor is then just an element of the tensor product space: a formal pair of vectors, or the sum of several such formal pairs.
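
For finite-dimensional spaces, one concrete (illustrative, not canonical) way to realize these formal pairs is the outer product. A short numpy sketch showing that scalars move freely between factors, as bilinearity requires:

```python
import numpy as np

v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0, 5.0])

pure = np.outer(v, w)                             # the pure tensor v ⊗ w
assert np.allclose(np.outer(2 * v, w), 2 * pure)  # (2v) ⊗ w = 2 (v ⊗ w)
assert np.allclose(np.outer(v, 2 * w), 2 * pure)  # v ⊗ (2w) = 2 (v ⊗ w)

# A generic element of the tensor product space is a *sum* of such pairs:
u, x = np.array([0.5, -1.0]), np.array([1.0, 0.0, 2.0])
element = np.outer(v, w) + np.outer(u, x)         # v ⊗ w + u ⊗ x
```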

How does this relate to the previous viewpoints? By the singular value decomposition we can always write a matrix $M$ as a sum of rank-1 matrices, i.e., as a sum of outer products:

$$M = \sum_{i=1}^n \sigma_i \mathbf{u}_i \mathbf{v}_i^T.$$

It can be shown (as an exercise for the reader) that the tensor product space $V \otimes V^*$ --- here $V^*$ denotes the dual space, i.e., the space of linear functionals on $V$ --- is isomorphic to the space of outer products and their sums, since both are defined by taking pairs of vectors $(\mathbf{u}, \mathbf{v}^T)$ and their sums up to multilinearity. Thus, the space of bilinear maps (matrices) on $V$ is isomorphic to the tensor product space $V \otimes V^*$.
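Here is a short numerical check of this correspondence (my sketch, assuming numpy's SVD conventions, where the rows of `Vt` are the $\mathbf{v}_i^T$):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
U, s, Vt = np.linalg.svd(M)

# Rebuild M term by term: M = sum_i sigma_i u_i v_i^T
reconstruction = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
assert np.allclose(M, reconstruction)
```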

A short aside on the notion of "rank": the number of nonzero singular values $\sigma_i$ in the above sum is the matrix rank of $M$. The rank of a tensor in general is the minimum number of pure tensors that must be summed to express it, where a pure tensor is a single formal tuple of vectors: $u \otimes v$ is a pure tensor, but $u \otimes v + w \otimes x$ is (in general) not. In the matrix example, we would equate a 'pure' tensor with a single outer product, or equivalently a rank-1 matrix. This generalizes the notion of matrix rank, with the notable distinction that matrix rank is efficiently computable, while finding the rank of a higher-order tensor is NP-complete.
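
For the matrix case, the efficient computation is just an SVD plus a tolerance threshold; the sketch below (illustrative) builds a sum of two pure tensors and recovers rank 2. Nothing analogous exists for higher-order tensors.

```python
import numpy as np

rng = np.random.default_rng(2)
u, v, w, x = (rng.normal(size=5) for _ in range(4))
M = np.outer(u, v) + np.outer(w, x)    # sum of two pure tensors

singular_values = np.linalg.svd(M, compute_uv=False)
rank = int(np.sum(singular_values > 1e-10))  # tolerance for numerical zero
print(rank)  # 2, generically: random u, v, w, x are almost surely independent
```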

Higher-order tensors:

  • suppose we have triples $V \otimes V^* \otimes V$. what can we say about these?
  • well, the basis set has size $n^3$, so any element (tensor) in this space is a three-dimensional array
  • we can define a multilinear map $f: V^* \times V \times V^* \to \mathbb{R}$ by tensor contraction: given a pure tensor $\mathbf{u} \otimes \mathbf{v}^T \otimes \mathbf{w}$ and an input tuple $(\mathbf{a}^T, \mathbf{b}, \mathbf{c}^T)$, the result is just $f(\mathbf{a}^T, \mathbf{b}, \mathbf{c}^T) = \mathbf{a}^T\mathbf{u} \cdot \mathbf{v}^T \mathbf{b} \cdot \mathbf{c}^T \mathbf{w}$ --- that is, we just apply all of the functionals to their corresponding vectors and multiply the results --- where the extension to general tensors follows by linearity; see the sketch below
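
Below is a small numerical check of this factorization, using numpy's `einsum` for the full contraction (the variable names and the check itself are mine, for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
u, v, w = (rng.normal(size=n) for _ in range(3))
a, b, c = (rng.normal(size=n) for _ in range(3))

T = np.einsum('i,j,k->ijk', u, v, w)   # the pure tensor u ⊗ v^T ⊗ w

# Contracting T against (a^T, b, c^T) factors into three inner products:
full_contraction = np.einsum('ijk,i,j,k->', T, a, b, c)
factored = (a @ u) * (v @ b) * (c @ w)  # a^T u · v^T b · c^T w
assert np.isclose(full_contraction, factored)
```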
