probability space: Nonlinear Function
Created: August 27, 2022
Modified: August 27, 2022

probability space

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

A probability space consists of:

  1. A set $\Omega$ of outcomes, aka possible worlds; these represent all the ways the world might be. This is the 'sample space'.
  2. A set of events $\mathcal{F}$, where each event $A \in \mathcal{F}$ is some subset of possible worlds. This is sometimes called the 'information set' since it encodes any limits on the information available for us to distinguish between possible worlds. This set must satisfy some technical conditions, namely that $\mathcal{F}$ is a sigma-algebra on the sample space $\Omega$:
    1. The information set must contain the sample space: $\Omega \in \mathcal{F}$.
    2. It must be closed under complements: for any event $A \in \mathcal{F}$, $\mathcal{F}$ must also contain the complement $(\Omega \setminus A)$.
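    3. It must be closed under countable unions: for any countable collection of events $A_1, A_2, \ldots \in \mathcal{F}$, we also have $\bigcup_i A_i \in \mathcal{F}$. In particular, $(A \cup B) \in \mathcal{F}$ for any $A, B \in \mathcal{F}$.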
  3. A probability function $\mathbb{P}$ that assigns a value $\mathbb{P}(A) \in [0, 1]$ to each event $A \in \mathcal{F}$. It must be countably additive: the probability of the union of any countable collection of pairwise disjoint events $A_1, A_2, \ldots$ must equal the sum of their individual probabilities, $\mathbb{P}(\bigcup_i A_i) = \sum_i \mathbb{P}(A_i)$. In particular, this holds for two disjoint events. Furthermore, we must have $\mathbb{P}(\Omega) = 1$, i.e., the total probability of all outcomes is 1.
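
The conditions above can be checked mechanically when the sample space is finite. The following is only a sketch under that assumption: on a finite $\Omega$, countable unions reduce to finite ones, so pairwise checks suffice. The helper names `is_sigma_algebra` and `is_probability`, and the die example, are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction
from itertools import combinations


def is_sigma_algebra(omega, events):
    """Check the three sigma-algebra conditions on a FINITE sample space."""
    omega = frozenset(omega)
    events = {frozenset(a) for a in events}
    # 1. The family must contain the sample space itself.
    if omega not in events:
        return False
    # 2. Closed under complements.
    if any(omega - a not in events for a in events):
        return False
    # 3. Closed under unions (pairwise closure is enough when omega is finite).
    return all(a | b in events for a, b in combinations(events, 2))


def is_probability(omega, events, prob):
    """Check values in [0, 1], P(omega) = 1, and additivity on disjoint pairs.

    Assumes `events` is already a sigma-algebra and `prob` maps each event
    (as a frozenset) to its probability.
    """
    omega = frozenset(omega)
    events = {frozenset(a) for a in events}
    if prob[omega] != 1:
        return False
    if any(not 0 <= prob[a] <= 1 for a in events):
        return False
    return all(prob[a | b] == prob[a] + prob[b]
               for a, b in combinations(events, 2) if not (a & b))


# Illustrative example: a die where we can only tell whether the roll is even or odd.
omega = frozenset({1, 2, 3, 4, 5, 6})
evens = frozenset({2, 4, 6})
odds = frozenset({1, 3, 5})
events = [frozenset(), evens, odds, omega]
prob = {frozenset(): Fraction(0), evens: Fraction(1, 2),
        odds: Fraction(1, 2), omega: Fraction(1)}

print(is_sigma_algebra(omega, events))
print(is_probability(omega, events, prob))
```

Here the four-element family is a valid (coarse) information set: it cannot distinguish a 2 from a 4, but it still satisfies every condition in the definition.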

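Intuitively we would usually take $\mathcal{F} = 2^\Omega$, the set of all subsets of $\Omega$, so that every elementary outcome $\omega \in \Omega$ is itself an event $\{\omega\}$, i.e., to assign a probability to each possible world, and thus (by countable additivity, when $\Omega$ is countable) to any collection of possible worlds. These notes call this a complete probability space; in standard measure-theoretic usage, 'complete' instead means that every subset of a probability-zero event is itself an event.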
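
As a small illustration of the $\mathcal{F} = 2^\Omega$ case, the sketch below assumes a finite sample space (two coin flips) with made-up weights on the individual outcomes. The names `power_set`, `weights`, and `prob` are hypothetical; the point is only that additivity extends the probabilities of individual worlds to every subset.

```python
from fractions import Fraction
from itertools import combinations


def power_set(omega):
    """Return every subset of a finite omega as a frozenset."""
    items = list(omega)
    return [frozenset(combo)
            for r in range(len(items) + 1)
            for combo in combinations(items, r)]


# Assumed probabilities for each possible world (two fair coin flips).
weights = {"HH": Fraction(1, 4), "HT": Fraction(1, 4),
           "TH": Fraction(1, 4), "TT": Fraction(1, 4)}


def prob(event):
    """Probability of an event, obtained by summing over its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


events = power_set(weights)
first_flip_heads = frozenset({"HH", "HT"})

print(len(events))             # number of events in the power set
print(prob(first_flip_heads))  # probability that the first flip is heads
```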

The general reason to choose a different (incomplete) $\mathcal{F}$ is to model incomplete information. For example, the outcomes in $\Omega$ might correspond to realizations of a stochastic process. In reality, we only observe the values of the process up to the current time $t$, so at time $t$ it makes sense to work with $\mathcal{F}_t$, in which the events are equivalence classes of realizations ignoring any differences at times $> t$. An increasing sequence of such information sets $\mathcal{F}_t$ is called a filtration.
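
A filtration can be made concrete with a toy process. The sketch below assumes the process is three coin flips, so each outcome is a string such as `"HTH"`; the names `OMEGA`, `atoms`, and `sigma_algebra` are hypothetical. At time $t$, two realizations are treated as indistinguishable if they agree on the first $t$ flips, and the events in $\mathcal{F}_t$ are the unions of those equivalence classes.

```python
from itertools import combinations, product

# Sample space: every realization of three coin flips.
OMEGA = ["".join(p) for p in product("HT", repeat=3)]


def atoms(t):
    """Group realizations into classes that agree on the first t flips."""
    classes = {}
    for w in OMEGA:
        classes.setdefault(w[:t], set()).add(w)
    return [frozenset(c) for c in classes.values()]


def sigma_algebra(t):
    """All unions of atoms: the events decidable after observing t flips."""
    blocks = atoms(t)
    return {frozenset().union(*combo)
            for r in range(len(blocks) + 1)
            for combo in combinations(blocks, r)}


filtration = [sigma_algebra(t) for t in range(4)]

# Each information set is contained in the next one.
print(all(filtration[t] <= filtration[t + 1] for t in range(3)))
print([len(f) for f in filtration])
```

The event "the first flip is heads" belongs to every information set from time 1 onward, while the single realization `{"HHH"}` only becomes an event at time 3, once all flips have been observed.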