Brighton Webs Ltd.
Data & Analysis Services for Industry & Education


Pareto Distribution

The Pareto distribution is a highly right-skewed distribution, with most of its mass concentrated near the mode and a long tail to the right. It is defined in terms of the mode and a shape factor.  It is a heavy-tailed distribution, meaning that a random variable following a Pareto distribution can take extreme values.

The Pareto distribution was originally developed to describe the distribution of income, the basis being that a high proportion of a population have low income, whilst only a few people have very high incomes.

The Pareto distribution is often referred to in the context of the "80/20 Rule", which describes a range of situations: in customer support it can mean that 80% of problems come from 20% of customers, and in economics that 80% of the wealth is in the hands of 20% of the population.
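
The link between the 80/20 Rule and the shape parameter can be made concrete. For a Pareto distribution with shape greater than 1, the share of the total held by the top fraction p of the population is p^(1 - 1/shape), a standard result from the Lorenz curve. The sketch below uses that result; the function name `top_share` is illustrative only.

```python
import math

def top_share(fraction, shape):
    """Share of the total held by the top `fraction` of a Pareto population.

    Uses share = fraction ** (1 - 1/shape), which requires shape > 1
    so that the mean (and hence the total) is finite.
    """
    if shape <= 1:
        raise ValueError("shape must exceed 1 for the mean to be finite")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    return fraction ** (1.0 - 1.0 / shape)

# Shape at which the top 20% hold 80% of the total: log(5) / log(4)
shape_80_20 = math.log(5) / math.log(4)
print(round(shape_80_20, 3), round(top_share(0.2, shape_80_20), 3))
```

An exact 80/20 split therefore corresponds to a shape of about 1.16; larger shapes give a more even split.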

Applications of the Pareto distribution include insurance, where it is used to model claims where the minimum claim is also the modal value, but where there is no set maximum.  In climatology it is used to describe the occurrence of extreme weather.  The Pareto distribution has been proposed as a model for oil and gas discoveries in mature provinces, where the minimum size is set by the economics of production.  The example shown below is for the duration of session times on a web site.

Profile

Parameters

Parameter   Description                                                                 Characteristics
mode        Modal value, which is also the minimum value.                               A float > 0
shape       A parameter which determines the concentration of data towards the mode.    A float > 0

Range

The range of random numbers generated from the Pareto distribution is from the mode to infinity.

Functions
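
For reference, the standard probability density and cumulative distribution functions of the Pareto distribution are given here in LaTeX. The symbols x_m (for mode) and \alpha (for shape) are notation assumed for this sketch; the inverse function in the last line matches the one used under Random Number Generation.

```latex
f(x) = \frac{\alpha \, x_m^{\alpha}}{x^{\alpha + 1}}, \qquad x \ge x_m

F(x) = 1 - \left( \frac{x_m}{x} \right)^{\alpha}, \qquad x \ge x_m

F^{-1}(u) = x_m \, (1 - u)^{-1/\alpha}, \qquad 0 \le u < 1
```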

Properties

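The moment-based properties are not defined for all values of the shape parameter.  The mean is finite only when shape > 1, the variance (and hence the standard deviation) only when shape > 2, the skewness only when shape > 3 and the kurtosis only when shape > 4.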
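
A minimal sketch of how the standard closed-form properties might be computed from the two parameters. The function name `pareto_properties` and the use of `None` to mark a property that is not finite are choices made for this illustration.

```python
def pareto_properties(mode, shape):
    """Return selected properties of a Pareto distribution as a dict.

    A value of None marks a property that is not finite for this shape.
    """
    if mode <= 0 or shape <= 0:
        raise ValueError("mode and shape must both be > 0")
    props = {
        "mode": mode,
        # Median: the value x at which F(x) = 0.5
        "median": mode * 2.0 ** (1.0 / shape),
        "mean": None,
        "variance": None,
    }
    if shape > 1:
        props["mean"] = shape * mode / (shape - 1.0)
    if shape > 2:
        props["variance"] = (mode ** 2 * shape
                             / ((shape - 1.0) ** 2 * (shape - 2.0)))
    return props

print(pareto_properties(1.0, 3.0))
print(pareto_properties(1.0, 1.5))
```

The median is always available because it depends only on the cumulative distribution function, which makes it a more robust summary than the mean when the shape is small.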

Example

The example shows the distribution of session times on a web site.  The minimum session is approximately 1 second, the time taken to establish the connection and deliver the first page; the maximum length of a session depends on the degree of interest the visitor has in the material.

Random Number Generation

A random number (referred to as r) from a Pareto distribution can be generated by transforming a continuous uniform variable in the range 0 to 1 (referred to as u) with the distribution's inverse cumulative distribution function:

r=g(u)

Using Basic style code, the function would be similar to:

r=mode*(1-u)^(-1/shape)

In C style code it would be:

r = mode * pow(1.0 - u, -1.0 / shape);  /* needs <math.h>; -1.0 avoids integer division if shape is an int */
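
The same transformation can be sketched in Python. The function name `pareto_random`, the seed and the shape value of 1.5 are assumptions made for illustration; the mode of 1.0 echoes the 1 second minimum session in the example above.

```python
import random

def pareto_random(mode, shape, rng=random):
    """Draw one Pareto random number by inverse transform sampling."""
    if mode <= 0 or shape <= 0:
        raise ValueError("mode and shape must both be > 0")
    u = rng.random()  # uniform in [0, 1), so 1 - u is never zero
    return mode * (1.0 - u) ** (-1.0 / shape)

# Five simulated session times in seconds (hypothetical shape of 1.5)
rng = random.Random(42)  # fixed seed so the run is repeatable
sessions = [pareto_random(1.0, 1.5, rng) for _ in range(5)]
print(sessions)
```

Python's standard library also provides `random.paretovariate(alpha)`, which samples with a minimum of 1; multiplying its result by the mode gives the same distribution.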

Due to the highly skewed nature of the Pareto distribution, the values of the moment based descriptors calculated from a sample of random numbers (e.g. variance, skewness and kurtosis) may be unstable.
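
This instability can be seen by drawing several samples of the same size and comparing their sample variances. The sketch below assumes a shape of 1.5, for which the theoretical variance is not finite; the function name `sample_variances`, the sample size and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def sample_variances(mode, shape, n, repeats, seed=0):
    """Sample variance of `repeats` independent Pareto samples of size n."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        sample = [mode * (1.0 - rng.random()) ** (-1.0 / shape)
                  for _ in range(n)]
        results.append(statistics.variance(sample))
    return results

# With shape = 1.5 the values typically differ widely between repeats
variances = sample_variances(1.0, 1.5, 1000, 5)
print(variances)
```

A single extreme value in one sample is usually enough to dominate its variance, so the spread between repeats does not settle down as it would for a light-tailed distribution.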

Parameter Estimation

Because the mean and standard deviation of the Pareto distribution are not defined for all values of the shape factor, moment-based estimation presents problems; the maximum likelihood equations are shown below:
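
The standard maximum likelihood estimates for a sample x_1 ... x_n take the mode as the sample minimum and the shape as n divided by the sum of ln(x_i / mode). A minimal Python sketch of these estimates follows; the function name `fit_pareto` is illustrative only.

```python
import math

def fit_pareto(data):
    """Maximum likelihood estimates (mode, shape) for a Pareto sample.

    mode_hat  = min(x_i)
    shape_hat = n / sum(ln(x_i / mode_hat))
    """
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    mode_hat = min(data)
    if mode_hat <= 0:
        raise ValueError("all observations must be > 0")
    log_sum = sum(math.log(x / mode_hat) for x in data)
    if log_sum == 0:
        raise ValueError("shape cannot be estimated from identical values")
    return mode_hat, n / log_sum

mode_hat, shape_hat = fit_pareto([1.2, 1.0, 3.5, 1.1, 7.9, 1.6])
print(mode_hat, shape_hat)
```

Because the mode estimate is the sample minimum, it can never fall below the true mode and tends to sit slightly above it in small samples.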

Page Updated: 05-Nov-2004

 

For more information: info@brighton-webs.co.uk