Thursday, December 1, 2022

Understanding Convolutions in Probability

Callout: Before even beginning with the rest of this post I need to give special mention to Steven W. Smith's absolutely phenomenal The Scientist and Engineer's Guide to Digital Signal Processing. This is the book that really made convolutions click for me and was a huge inspiration in writing this post. As an homage I have tried to make all my charts in a similar style to the ones he uses in his book. Now let's continue!

At a high level, in probability, a convolution is the way to determine the distribution of the sum of two random variables. That is, we can see it as a way of combining two probability distributions to create a third distribution, in much the same way we might use multiplication to combine two integers to make a third. This is very useful because, in all but a few special cases, the distribution of the sum of two random variables is not the same as the distribution of either individual random variable.
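To see why this matters, here is a quick simulation of my own (assuming NumPy is available): each of two fair dice is uniformly distributed, but their sum is not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two fair six-sided dice: each one is uniform on {1, ..., 6}
rolls = rng.integers(1, 7, size=(1_000_000, 2))
totals = rolls.sum(axis=1)

# The totals are NOT uniform: 7 is far more likely than 2 or 12
values, counts = np.unique(totals, return_counts=True)
for v, c in zip(values, counts):
    print(v, c / len(totals))   # ~1/36 for 2 and 12, ~6/36 for 7
```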

Summing random variables

A notable and fortunate exception when summing two random variables is the case of normally distributed random variables. For example, suppose we have these two random variables that are normally distributed:

$$X \sim \mathcal{N}(\mu_x, \sigma_x)$$

$$Y \sim \mathcal{N}(\mu_y, \sigma_y)$$

That is, we have two random values sampled from two different normal distributions. It's easy to imagine a scenario in which we want to know what the distribution of the sum of these might be:

$$Z = X + Y$$

$$Z \sim ?$$

It turns out that there is a very pleasant analytic solution to this:

$$Z \sim \mathcal{N}(\mu_x + \mu_y, \sqrt{\sigma_x^2 + \sigma_y^2})$$

If you aren't familiar with this wonderful fact then this post has already been worth your time, as this is a very useful tool to have for many probability problems.
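As a quick sanity check, here is a minimal simulation sketch, assuming NumPy and some made-up parameter values, that compares the empirical distribution of the sum against this analytic result:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameters for the two normal distributions
mu_x, sigma_x = 2.0, 1.0
mu_y, sigma_y = -1.0, 3.0

# Sample X and Y, then sum them
n = 1_000_000
x = rng.normal(mu_x, sigma_x, n)
y = rng.normal(mu_y, sigma_y, n)
z = x + y

# Analytic claim: Z ~ N(mu_x + mu_y, sqrt(sigma_x^2 + sigma_y^2))
print(z.mean())                          # ~ 1.0, i.e. mu_x + mu_y
print(z.std())                           # ~ 3.162
print(np.sqrt(sigma_x**2 + sigma_y**2))  # 3.1622..., the analytic std
```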

Notes on convolution notation

The trouble is that not all real-world probability distributions are well defined, and most well-defined probability distributions don't have this pleasant property. Convolutions are important because they allow us to define this operation (at least mathematically) on any probability distribution. Convolutions are particularly useful when we want to solve this problem for arbitrary discrete distributions.

We're going to adopt some notation that we'll use throughout the post. We'll consider our two probability distributions \(f\) and \(g\), each of which takes an argument \(t\) representing the value we want to know the probability of. Much of the work on convolutions comes out of the world of signal processing, where \(t\) typically means "time", but we'll think of it more as an index.

Because we're going to make heavy use of discrete convolutions in this post, we'll alternate between writing \(f(t)\) when we want to think of our distribution as a function and \(f[t]\) when we want to think of it as an array. In either case this will mean "the probability of \(t\)".
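To make the array view concrete, here is a tiny sketch (my own example, a fair six-sided die) where \(f[t]\) is just an indexed list of probabilities:

```python
# f[t] = P(X = t) for a fair six-sided die, using index 0 for t = 1
# (the offset is just a bookkeeping choice for this sketch)
f = [1/6, 1/6, 1/6, 1/6, 1/6, 1/6]

print(f[2])    # P(X = 3) = 0.1666...
print(sum(f))  # sums to 1 (up to floating-point error)
```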

The symbol used for convolutions is \(\ast\) and we typically denote the convolution of two functions as:

$$f\ast g = h$$

We'll stick to denoting the result of our convolution with \(h\). And since the result of the convolution is itself another probability distribution, we can pass \(t\) into it as well:

$$(f \ast g)(t)$$

With this we can write a generalized version of the sum of normal random variables we performed before:

$$X \sim f$$

$$P(X = t) = f(t)$$

$$Y \sim g$$

$$P(Y = t) = g(t)$$

$$Z = X + Y$$

$$Z \sim f \ast g$$

$$h = (f \ast g)$$

$$P(Z = t) = h(t) = (f \ast g)(t)$$

This latter notation works to describe the sum of random variables from any probability distributions, not just normals. Computing those results, however, is another issue; the sketch below gives a quick preview before we start on our horrific example!
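Here is a minimal sketch of that preview, assuming NumPy and two made-up discrete distributions; `np.convolve` is one standard way to compute a discrete convolution:

```python
import numpy as np

# f[t] and g[t]: two arbitrary discrete distributions (made-up values)
f = np.array([1/6] * 6)        # a fair six-sided die, support t = 1..6
g = np.array([0.5, 0.3, 0.2])  # a skewed three-sided die, support t = 1..3

# h = f * g (the convolution), giving P(Z = t) for Z = X + Y
h = np.convolve(f, g)

# h[k] is P(Z = k + 2), since both supports start at t = 1
for k, p in enumerate(h):
    print(f"P(Z = {k + 2}) = {p:.4f}")

print(h.sum())  # h is a valid distribution: sums to 1
```

Note the index bookkeeping: the support of \(h\) starts at the sum of the two supports' minimum values, which is why the array index and the value of \(t\) are offset.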

Building a Crab-Human Monster!

In this post we're going to explore the very common problem of constructing a horrific monstrosity which is the fusion of a crab and a human wielding a chainsaw.



