Tutorial: Probability Transformations

In this notebook, we will work through an application of the change of variables formula for probability densities:

$p(y) = p(x)\left|\frac{dx}{dy}\right|$.

Let $\theta$ be uniformly distributed on $-0.95\frac{\pi}{2} < \theta < 0.95\frac{\pi}{2}$, and consider the function $b(\theta)=\tan(\theta)$.

1. Solve it

Given the PDF of $\theta$ and the function $b(\theta)$, what is the PDF of $b$, $p(b)$? (As simple as it is, you might want to explicitly write down $p(\theta)$ first.)
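For checking your answer, one way to work the derivation is sketched here. It uses only the problem statement: $\theta$ is uniform on an interval of total width $0.95\pi$, and the inverse of the transformation is $\theta = \arctan(b)$.

$$
p(\theta) = \frac{1}{0.95\pi}, \qquad
\left|\frac{d\theta}{db}\right| = \frac{1}{1+b^2}, \qquad
p(b) = p(\theta)\left|\frac{d\theta}{db}\right| = \frac{1}{0.95\pi}\,\frac{1}{1+b^2}
$$

This holds for $|b| < \tan\left(0.95\frac{\pi}{2}\right) \approx 12.7$, with $p(b)=0$ outside that range. The result has the shape of a Cauchy distribution, truncated and renormalized because $\theta$ stops short of $\pm\frac{\pi}{2}$.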

Next, define $p(b)$ as a function.
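A minimal sketch of such a function is shown here, assuming the truncated-Cauchy form $p(b) = \frac{1}{0.95\pi}\frac{1}{1+b^2}$ that follows from the change of variables. The names `THETA_MAX`, `B_MAX` and `p_b` are illustrative choices, not taken from the original notebook.

```python
import numpy as np

# Half-width of the theta interval from the problem statement.
THETA_MAX = 0.95 * np.pi / 2
# Corresponding edge of the support in b.
B_MAX = np.tan(THETA_MAX)

def p_b(b):
    """Density of b = tan(theta) for theta uniform on (-THETA_MAX, THETA_MAX)."""
    b = np.asarray(b, dtype=float)
    density = 1.0 / (2 * THETA_MAX) / (1.0 + b**2)
    # Zero outside the range of b reachable from the allowed theta values.
    return np.where(np.abs(b) <= B_MAX, density, 0.0)
```

The function accepts scalars or arrays, so it can be evaluated directly on a grid or on histogram bin centers.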

2. Check on a grid

We can also carry out this transformation numerically on a grid. First, define a grid of $\theta$ values spanning the interval where $p(\theta)$ is non-zero.
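One possible grid definition is sketched here. The grid size `N_GRID` and the variable names are assumptions made for illustration; the original notebook may use different values.

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2   # half-width of the support of p(theta)
N_GRID = 1000                  # assumed grid size, chosen for illustration

# Evenly spaced theta values covering the support, endpoints included.
theta_grid = np.linspace(-THETA_MAX, THETA_MAX, N_GRID)
```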

Evaluate $p(\theta)$ at these points.
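A sketch of this evaluation, assuming a uniform density of height $1/(0.95\pi)$ on the interval and an illustrative 1000-point grid. The helper name `p_theta` is hypothetical.

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2
theta_grid = np.linspace(-THETA_MAX, THETA_MAX, 1000)

def p_theta(theta):
    """Uniform density on [-THETA_MAX, THETA_MAX], zero elsewhere."""
    theta = np.asarray(theta, dtype=float)
    return np.where(np.abs(theta) <= THETA_MAX, 1.0 / (2 * THETA_MAX), 0.0)

# One density value per grid point; constant, since theta is uniform.
p_theta_grid = p_theta(theta_grid)
```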

If that was done right, a tabular integration of the grid evaluations should return 1.0, or something very close.
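The normalization check could look like the sketch below. The trapezoid rule is written out by hand here as an assumption, so that the example does not depend on a particular NumPy version. The helper name `trapezoid` and the grid size are illustrative.

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2
theta_grid = np.linspace(-THETA_MAX, THETA_MAX, 1000)
p_theta_grid = np.full_like(theta_grid, 1.0 / (2 * THETA_MAX))

def trapezoid(y, x):
    """Trapezoid-rule integral of tabulated y(x); x need not be evenly spaced."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

norm_theta = trapezoid(p_theta_grid, theta_grid)
print(norm_theta)
```

Because $p(\theta)$ is constant, the trapezoid rule is exact here up to floating-point rounding.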

Next, transform the gridded evaluations of $p(\theta)$ to $p(b)$ by applying the same transformation of variables as before.
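On the grid, the transformation amounts to two steps: map each grid point through $b=\tan(\theta)$, and multiply each density value by $\left|\frac{d\theta}{db}\right| = \frac{1}{1+b^2}$. A minimal sketch, with illustrative variable names and grid size:

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2
theta_grid = np.linspace(-THETA_MAX, THETA_MAX, 1000)
p_theta_grid = np.full_like(theta_grid, 1.0 / (2 * THETA_MAX))

# Map each grid point through b = tan(theta).
b_grid = np.tan(theta_grid)
# |d theta / d b| = 1 / (1 + b^2), evaluated at the transformed points.
jacobian = 1.0 / (1.0 + b_grid**2)
p_b_grid = p_theta_grid * jacobian
```

Note that `b_grid` is no longer evenly spaced, even though `theta_grid` is.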

Let's plot the function and grid evaluations for sanity's sake.
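A plotting sketch is given here, assuming matplotlib is available and using the analytic form $p(b) = \frac{1}{0.95\pi}\frac{1}{1+b^2}$ for the curve. The coarse 100-point grid is an assumption chosen so that the individual grid points are visible.

```python
import numpy as np
import matplotlib.pyplot as plt

THETA_MAX = 0.95 * np.pi / 2
B_MAX = np.tan(THETA_MAX)

# Gridded evaluations: a coarse theta grid so individual points are visible.
theta_grid = np.linspace(-THETA_MAX, THETA_MAX, 100)
b_grid = np.tan(theta_grid)
p_b_grid = 1.0 / (2 * THETA_MAX) / (1.0 + b_grid**2)

# Analytic curve on a fine, evenly spaced grid in b.
b_fine = np.linspace(-B_MAX, B_MAX, 2000)
p_b_fine = 1.0 / (2 * THETA_MAX) / (1.0 + b_fine**2)

fig, ax = plt.subplots()
ax.plot(b_fine, p_b_fine, label="analytic p(b)")
ax.plot(b_grid, p_b_grid, ".", label="grid evaluations")
ax.set_xlabel("b")
ax.set_ylabel("p(b)")
ax.legend()
```

The points should fall on the curve, bunched near $b=0$ and sparse in the tails.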

We can check that the gridded values are still normalized (accounting for the fact that the grid is not evenly spaced in $b$). Note that the Riemann sum here could well be off by a few percent, as the transformed grid spacing does not provide good coverage of the tails of the function.
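A sketch of this check follows, using the trapezoid rule with per-interval widths as one possible choice of tabular integration. How close the result comes to 1 depends on the grid size and the rule used. The 1000-point grid is an assumption.

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2
theta_grid = np.linspace(-THETA_MAX, THETA_MAX, 1000)
b_grid = np.tan(theta_grid)
p_b_grid = 1.0 / (2 * THETA_MAX) / (1.0 + b_grid**2)

# Spacing between neighbouring b values; it grows rapidly toward the tails.
db = np.diff(b_grid)
# Trapezoid rule with the per-interval widths.
norm_b = np.sum(0.5 * (p_b_grid[1:] + p_b_grid[:-1]) * db)
print(norm_b, db.min(), db.max())
```

Printing the smallest and largest spacing makes the uneven coverage explicit: the intervals near the edges of the range are far wider than those near $b=0$.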

As even more of a sanity check, you can compare your solution to one saved below. The difference should be basically zero everywhere.

3. Check with samples

This transformation business is kind of a pain, what with the calculus and possibly ending up with non-uniform grid points. It turns out that life is much more straightforward when dealing with samples from a PDF rather than manipulating the PDF itself.

To demonstrate, generate a large number (say, $10^5$) of samples from $p(\theta)$, and straightforwardly transform them to samples of $b$ using the definition of $b(\theta)$.
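A minimal sampling sketch using NumPy's `default_rng` generator. The seed value and variable names are arbitrary choices made for illustration.

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2
N_SAMPLES = 100_000

# Seeded generator so the run is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(42)
theta_samples = rng.uniform(-THETA_MAX, THETA_MAX, size=N_SAMPLES)
# Transforming samples needs no Jacobian: just apply the function.
b_samples = np.tan(theta_samples)
```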

We can now compute an estimate of the PDF of $b$ from these samples, for example a normalized histogram. In the limit of many samples, the histogram and the analytic $p(b)$ should agree very well. You can play around with the number of samples and histogram bins to see how that changes things.
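The comparison can be sketched as follows, assuming the analytic form $p(b) = \frac{1}{0.95\pi}\frac{1}{1+b^2}$ and evaluating it at the bin centers. The bin count of 50 and the seed are arbitrary choices.

```python
import numpy as np

THETA_MAX = 0.95 * np.pi / 2
B_MAX = np.tan(THETA_MAX)

rng = np.random.default_rng(42)
b_samples = np.tan(rng.uniform(-THETA_MAX, THETA_MAX, size=100_000))

# density=True rescales the counts so the histogram integrates to 1.
hist, edges = np.histogram(b_samples, bins=50, range=(-B_MAX, B_MAX), density=True)
centers = 0.5 * (edges[1:] + edges[:-1])

# Analytic p(b) at the bin centers, for comparison.
p_b_centers = 1.0 / (2 * THETA_MAX) / (1.0 + centers**2)
max_abs_diff = np.max(np.abs(hist - p_b_centers))
print(max_abs_diff)
```

Two effects contribute to the remaining difference: sampling noise, which shrinks with more samples, and the finite bin width, which matters most near $b=0$ where $p(b)$ curves strongly.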