Bounded Power Law (BPL)¶
Overview¶
Python module to work with bounded power-law (BPL) distributed random variables.
This module contains a few basic functions that help analyse power-law distributed random variables that can be either bounded only from below or from both above and below. It provides the following functions:
- Generate random samples from power-law distributed random variables.
- Estimate a logarithimically binned histogram for a power-law distributed random variable.
- Estimate the probability density function and the cumulative distribution function for specified power-law exponent.
Introduction to BPL¶
The probability density function (PDF) of a power-law variable is at best bounded from below to avoid divergence of the density near zero. In this case, the power-law distribution is defined on the support \((x_{min}, {\infty})\).Moreover, some real-world systems may have a power-law regime in which case the distribution is bounded from both above and below, and will have the support \((x_{min}, x_{max})\).
Assuming a continuous random variable distributed according to a power law
with exponent \(\alpha > 1\), the PDFs for the two cases above are
estimated by following the steps in [1], using which the cumulative
distribution function (CDF) for the two cases are also easily estimated. The
functional forms of the PDFs and the CDFs are given below, and are also
implemented in this module as pdf()
and cdf()
.
In order to generate random samples for a power-law distributed random
variable, this module uses the inverse transform method [2] which relies on the
inverse function of the CDF of the random variable under consideration. The
inverse funcational form of the CDfs of the two power-law distribution types
considered here are also listed below. The random sampling routine is
implemented in sample()
.
Formulae¶
1. Power-law distributed on \((x_{min}, \infty)\):¶
PDF:
\[p(x) = - \frac{\beta}{x_{min}^{\beta}} x^{-\alpha}\]CDF:
\[P(x) = - \frac{x_{min}^{\beta} - x^{\beta}}{x_{min}^{\beta}}\]Inverse of CDF:
\[P^{-1}(x) = - x_{min} (1 - x)^{1 / \beta}\]
2. Power-law distributed on \((x_{min}, x_{max})\)¶
PDF:
\[p(x) = \frac{\beta}{x_{max}^{\beta} - x_{min}^{\beta}}x^{-\alpha}\]CDF:
\[P(x) = \frac{x^{\beta} - x_{min}^{\beta}} {x_{max}^{\beta} - x_{min}^{\beta}}.\]Inverse of CDF:
\[ \begin{align}\begin{aligned}P^{-1}(x) &= (a + bx) ^ {1 / \beta}\\a &= x_{min}^{\beta}\\b &= x_{max}^{\beta} - a\end{aligned}\end{align} \]
where we have defined \(\beta = 1 - \alpha\) for a more concise notation. Note that the inverse transform method used here recommended by Clauset et al. in Sec. II B of [1] in order to generate random samples according the given power-law exponent. Also, note that they provide a separate set of formulae for discrete power-law variables which are not considered here. Thus, care should be taken before extending the results of code implemented in this module to discrete power-law distributed random variables.
References
[1] | (1, 2) Clauset, A., Shalizi, C. R., Newman, M. E. J. “Power-law Distributions in Empirical Data”. SFI Working Paper: 2007-12-049 (2007). http://www.santafe.edu/media/workingpapers/07-12-049.pdf |
[2] | “Inverse transform sampling”. Wikipedia. Accessed on: 23 May, 2016. https://en.wikipedia.org/wiki/Inverse_transform_sampling |
Functions¶
-
bpl.
pdf
(x, alpha, xmin, xmax=None)[source]¶ Returns the PDF of a (bounded) power-law variable.
Parameters: - x (numpy.ndarray of 1-dimension) – array of values on which the power-law PDF is to be evaluated
- alpha (scalar, float, greater than 1) – power-law exponent
- xmin (scalar, float) – lower bound of the power-law random variable (greater than zero)
- xmax (scalar, float, optional) – upper bound of the power-law random variable
Returns: pdf – array of the probability densities evaluated for each point in
x
.Return type: numpy.ndarray with
shape = (len(x), )
-
bpl.
cdf
(x, alpha, xmin, xmax=None)[source]¶ Returns the CDF of a (bounded) power-law variable.
Parameters: - x (numpy.ndarray of 1-dimension) – array of values on which the power-law CDF is to be evaluated
- alpha (scalar, float, greater than 1) – power-law exponent
- xmin (scalar, float) – lower bound of the power-law random variable (greater than zero)
- xmax (scalar, float, optional) – upper bound of the power-law random variable
Returns: cdf – array of the cumulative probabilities evaluated for each point in
x
.Return type: numpy.ndarray with
shape = (len(x), )
-
bpl.
sample
(alpha=2.5, size=1000, xmin=1, xmax=None)[source]¶ Generates a random sample from a (bounded) power-law distribution.
Parameters: - alpha (scalar, float, greater than 1) – power-law exponent
- size (scalar, integer) – size (i.e. length) of the random sample to be generated
- xmin (scalar, float) – lower bound of the power-law random variable (greater than zero)
- xmax (scalar, float, optional) – upper bound of the power-law random variable
Returns: s – random deviates from a power-law distribution with exponent
alpha
and with upper and lower bounds as specifiedReturn type: numpy.ndarray with
shape = (size, )
See also
-
bpl.
histogram
(arr, bins=None, plot=False, ax=None, **kwargs)[source]¶ Plots a power law histogram using logarithmic binning.
Parameters: - arr (numpy.ndarray of 1-dimension) – sample of a power-law distributed random deviates
- bins (numpy.ndarray of 1-dimension, optional) – bins for which the histogram counts are obtained. This should be of
shape = (len(arr) + 1, )
. If not provided, an array of logarithmically spaced bins is estimated using Sturges’ formula [3]. - plot (boolean) – If
True
, the histogram results are plotted on given axes or on a newly createdmatplotlib.axes
object. Default= False
. - ax (matplotlib.axes, optional) – axes on which the estimated histogram is to be plotted. If not
provided, a standard
matplotlib.axes
object is created for plotting.
Returns: - hist (numpy.ndarray of 1 dimension) – array containing the histogram counts/fractions for the given random
sample. This is of length
n
, wheren
is the number of bins given by Sturges formula. - bins (numpy.ndarray of 1 dimension) – array containing the bin edges for the given random sample obtained by
constructing
n
number of logarithmically spaced bins. - ax (matplotlib.axes) – axes on which the histogram is plotted
Notes
The kwargs arguments are partly passed on to
numpy.histogram
(such as thedensity
keyword argument), and tomatplotlib.pyplot.plot
(such as themec
,mfc
, andms
keywords), and finally to also specify axes labels and their fontsizes usingxlab
andfs
keyword arguments. These could also, in principle, be done separately outside of this function by using the necessary methods of theax
object.References
[3] Sturges, H. A. (1926). “The choice of a class interval”. Journal of the American Statistical Association, 21(153), 65-66. http://www.tandfonline.com/doi/abs/10.1080/01621459.1926.10502161