Note on Think Stats: Multiplying a Pareto random variable with x_m = 1 by a positive number - what do we get?

04 Aug 2015, by Pang Yan Han

NOTE: Having only a very basic and recently acquired formal education in Probability and Statistics, I don’t know the standard notation for a Pareto random variable, and I couldn’t find it online. So I’m going to write \( X \sim Pareto(x_m = j, \alpha = k) \) to mean that \( X \) is a Pareto random variable with parameters \( x_m = j, \alpha = k \). If you know the actual notation for writing a Pareto random variable, please share it with me, I’ll be very glad to know it!!!

So I was working through Section 4.2 of Think Stats, which features the Pareto distribution. Exercise 4.3 asks for a function paretovariate that generates a Pareto random variable with parameters \( x_m, \alpha \). It turns out that the random module in Python has a function random.paretovariate whose documentation looks like this:

Help on method paretovariate in module random:

paretovariate(self, alpha) method of random.Random instance

    Pareto distribution. alpha is the shape parameter.

And indeed, random.paretovariate does not take an \( x_m \) parameter. The author mentions that \( x_m = 1 \) for random.paretovariate, and that multiplying the value it returns by a positive number \( j \) gives a new Pareto random variable with \( x_m = j \) instead of \( x_m = 1 \). Writing a function called paretovariate which allows a custom \( x_m \) is one objective of Exercise 4.3. My code for it is as follows:

import random

def paretovariate(alpha, x_m):
    # random.paretovariate draws with x_m = 1; scaling by x_m moves the minimum to x_m.
    return x_m * random.paretovariate(alpha)
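A quick sanity check, as a minimal sketch: every value drawn this way should be at least \( x_m \), since that is where the support of the distribution starts. The seed, parameter values and sample size below are arbitrary choices for illustration, and the check says nothing yet about the shape of the distribution.

```python
import random

def paretovariate(alpha, x_m):
    # Same scaling as above: random.paretovariate draws with x_m = 1.
    return x_m * random.paretovariate(alpha)

random.seed(0)           # arbitrary seed, only so that reruns give the same draws
x_m, alpha = 3.0, 2.0    # illustrative parameter values
samples = [paretovariate(alpha, x_m) for _ in range(10000)]

# No sample should fall below x_m.
print(min(samples) >= x_m)
```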

But… how do we know that the author is telling us the truth? I mean, the author is usually correct, but we are given this transformation without proof. So I set out to prove the following for myself:

Given \( X \sim Pareto(x_m = 1, \alpha = k) \), multiplying \( X \) by \( j > 0 \) and denoting the resulting random variable as \( Y \), we have \( Y \sim Pareto(x_m = j, \alpha = k) \).

And… I must be really rusty. I forgot how to prove that two random variables have the same distribution. Eventually, I remembered that this can be done by showing that they share the same moment generating function. So first we gotta find out what the mgf of a Pareto random variable is. On the Wikipedia entry for the Pareto distribution, we see this intimidating equation:


$$\alpha(-x_m t)^{\alpha} \Gamma(-\alpha, -x_m t) \ \ \text{for} \ \ t < 0\ $$

which I must say again, looks really intimidating. And while I do understand most of the individual symbols, I didn’t recognise the two-argument \( \Gamma \) (it is the upper incomplete gamma function, not the ordinary gamma function, which takes a single value), so proving things via the mgf route didn’t seem very feasible to me.

I’d probably have given up if not for recalling that we can prove the same thing by showing that the two random variables share the same CDF. In the book, the CDF of the Pareto distribution is given by:


$$CDF(x) = 1 - \left(\frac{x}{x_m}\right)^{-\alpha} \ \ \text{for} \ \ x \geq x_m $$

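which looks slightly different from the form given on Wikipedia, \( 1 - (\frac{x_m}{x})^{\alpha} \), but the two are the same expression, since \( (\frac{x}{x_m})^{-\alpha} = (\frac{x_m}{x})^{\alpha} \). So \( X \sim Pareto(x_m = 1, \alpha = k) \) has CDF \( F_X(x) = 1 - (\frac{x}{1})^{-k} = 1 - x^{-k} \) for \( x \geq 1 \), and \( F_X(x) = 0 \) for \( x < 1 \).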
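To make the formula concrete, here is a small sketch of the CDF as a Python function. The name pareto_cdf is my own choice for illustration and does not come from the book or from the random module; it returns 0 below \( x_m \), where the distribution has no mass.

```python
def pareto_cdf(x, x_m, alpha):
    """CDF of a Pareto distribution: 0 below x_m, 1 - (x / x_m) ** -alpha from x_m on."""
    if x < x_m:
        return 0.0
    return 1.0 - (x / x_m) ** (-alpha)

# With x_m = 1 and alpha = 2, the CDF at x = 2 is 1 - 2 ** -2.
print(pareto_cdf(2.0, 1.0, 2.0))  # 0.75
```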

Now, let \( j > 0 \) be a constant and \( X \sim Pareto(x_m = 1, \alpha = k) \) for some \( k > 0 \). Let \( Y = jX \). We want to show that \( Y \sim Pareto(x_m = j, \alpha = k) \) by showing that \( F_Y(x) = 1 - (\frac{x}{j})^{-k} \) for \( x \geq j \).

Proof:


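For \( x \geq j \), so that \( \frac{x}{j} \geq 1 \):

$$ \begin{align*} F_Y(x) =& \ \ \mathbb{P}(Y \leq x)\\ =& \ \ \mathbb{P}(jX \leq x)\\ =& \ \ \mathbb{P}\left(X \leq \frac{x}{j}\right)\\ =& \ \ F_X\left(\frac{x}{j}\right)\\ =& \ \ 1 - \left(\frac{x}{j}\right)^{-k} \end{align*} $$

as desired. The third line divides both sides by \( j \), which keeps the direction of the inequality because \( j > 0 \). For \( x < j \) we have \( \frac{x}{j} < 1 \), so \( F_Y(x) = F_X(\frac{x}{j}) = 0 \), which also matches a Pareto distribution with \( x_m = j \). Hence \( Y \sim Pareto(x_m = j, \alpha = k) \), where \( X \sim Pareto(x_m = 1, \alpha = k), j > 0, Y = jX \).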
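The result can also be checked numerically. The following is a rough sketch, not a proof: it draws many values of \( jX \) using random.paretovariate, then compares the fraction of samples at or below a few points with \( 1 - (\frac{x}{j})^{-k} \). The values of j, k, the seed, the sample size and the helper names empirical_cdf and theoretical_cdf are all assumptions made for illustration.

```python
import random

random.seed(42)       # arbitrary seed, only for reproducibility
j, k = 3.0, 2.0       # illustrative values: x_m = j, alpha = k
n = 100000

# Y = j * X, where X is drawn with x_m = 1.
ys = [j * random.paretovariate(k) for _ in range(n)]

def empirical_cdf(samples, x):
    # Fraction of samples that are <= x.
    return sum(1 for s in samples if s <= x) / len(samples)

def theoretical_cdf(x, x_m, alpha):
    # Pareto CDF: 0 below x_m, 1 - (x / x_m) ** -alpha otherwise.
    return 1.0 - (x / x_m) ** (-alpha) if x >= x_m else 0.0

# The two columns should agree closely, up to sampling noise.
for x in (3.5, 5.0, 10.0, 30.0):
    print(x, round(empirical_cdf(ys, x), 4), round(theoretical_cdf(x, j, k), 4))
```

With a sample this large the two columns should differ only in the later decimal places; a visible gap would suggest a mistake in either the sampling code or the formula.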

Afterthoughts

While the actual proof was pretty short, I thought that this was a rather meaningful exercise, and it certainly helped me brush up on my probability chops.
