Brighton Webs Ltd.
statistical and data services for industry

Mean or Average Value

The average value is the most familiar way of summarising a set of numbers.  It is the sum of all the numbers divided by the count.  If the set of numbers is 2 and 4, then:

Average = (2+4)/2 = 6/2 = 3

Example

The example is loosely based on a survey of the hourly rate offered to van drivers.  The data is presented as a frequency table.

Lower   Upper   Count   Values
5.50    5.74    1       5.60
5.75    5.99    2       5.75, 5.80
6.00    6.24    4       6.00, 6.10, 6.10, 6.20
6.25    6.49    6       6.25, 6.25, 6.30, 6.40, 6.45, 6.45
6.50    6.74    4       6.50, 6.60, 6.60, 6.70
6.75    6.99    3       6.75, 6.75, 6.80
7.00    7.24    2       7.00, 7.10
7.25    7.49    1       7.25
7.50    7.74    0
7.75    7.99    0
8.00    8.24    1       8.10
8.25    8.49    2       8.25, 8.30
8.50    8.74    1       8.60
8.75    8.99    0
9.00    9.24    1       9.00
Totals          28      189.95

The frequency table can be used to create a bar chart (sometimes called a histogram).

These numbers are typical of real-world data, which contains information but requires some interpretation before a valid summary can be presented.

The mean is calculated by dividing the sum of the values by the number of values, i.e.

Mean = 189.95/28 = 6.78
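
The same calculation can be sketched in a few lines of Python.  This is only an illustration: the list name rates is an assumption, and the values are transcribed from the Values column of the frequency table.

```python
# Hourly rates transcribed from the Values column of the frequency table.
rates = [
    5.60,
    5.75, 5.80,
    6.00, 6.10, 6.10, 6.20,
    6.25, 6.25, 6.30, 6.40, 6.45, 6.45,
    6.50, 6.60, 6.60, 6.70,
    6.75, 6.75, 6.80,
    7.00, 7.10,
    7.25,
    8.10,
    8.25, 8.30,
    8.60,
    9.00,
]

total = sum(rates)   # sum of the values
count = len(rates)   # number of values
mean = total / count

print(f"Sum = {total:.2f}, Count = {count}, Mean = {mean:.2f}")
# prints: Sum = 189.95, Count = 28, Mean = 6.78
```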

Often data is not presented as a set of values, but as a frequency table.  Follow the link for a sample calculation of the mean based on a frequency table.
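
When only the class limits and counts are available, one common approach is to let the midpoint of each class stand in for the values inside it.  The sketch below assumes each class is 0.25 wide (lower limit up to, but not including, the next lower limit); the names classes and CLASS_WIDTH are illustrative, not part of the original survey.

```python
# (lower limit, count) for each class in the frequency table.
classes = [
    (5.50, 1), (5.75, 2), (6.00, 4), (6.25, 6), (6.50, 4),
    (6.75, 3), (7.00, 2), (7.25, 1), (7.50, 0), (7.75, 0),
    (8.00, 1), (8.25, 2), (8.50, 1), (8.75, 0), (9.00, 1),
]
CLASS_WIDTH = 0.25  # assumption: each class spans lower .. lower + 0.25

total_count = sum(count for _, count in classes)

# Weight each class midpoint by the number of values in that class.
weighted_sum = sum((lower + CLASS_WIDTH / 2) * count for lower, count in classes)
grouped_mean = weighted_sum / total_count

print(f"Count = {total_count}, estimated mean = {grouped_mean:.2f}")
# prints: Count = 28, estimated mean = 6.84
```

The estimate (6.84) differs slightly from the mean of the individual values (6.78) because the midpoints only approximate the actual rates in each class.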

Formula

The average is often referred to as the mean, and the formula used in maths and stats books looks like this:

Mean = (∑x) / n

The ∑ sign is the maths symbol for summation, so ∑x is the sum of all the values and n is the number of values.

What does the average value tell us?

The sample data was selected to demonstrate that the average value can be misleading.  The average hourly rate is 6.80 (rounded value).  If you look at the frequency diagram, you can see that there are two clusters of data.  The larger cluster consists of 23 values centred on 6.40.  The smaller cluster consists of five values, all greater than 7.50.  Combine this observation with the fact that nearly half of the drivers (12 of 28) earn between 6.25 and 6.75, and the average value of 6.80 does not provide a useful summary of van drivers' wages, because only 3 out of 28 drivers are getting a comparable wage.
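
The two clusters can be checked numerically.  The sketch below splits the data at 7.50 (the gap in the frequency table) and counts the rates that fall in the same class as the mean, 6.75 to 6.99; the split point and the variable names are choices made for this illustration.

```python
# Hourly rates transcribed from the frequency table.
rates = [
    5.60, 5.75, 5.80, 6.00, 6.10, 6.10, 6.20,
    6.25, 6.25, 6.30, 6.40, 6.45, 6.45,
    6.50, 6.60, 6.60, 6.70, 6.75, 6.75, 6.80,
    7.00, 7.10, 7.25, 8.10, 8.25, 8.30, 8.60, 9.00,
]

mean = sum(rates) / len(rates)

# Split at 7.50, where the frequency table has two empty classes.
lower_cluster = [r for r in rates if r < 7.50]
upper_cluster = [r for r in rates if r >= 7.50]
lower_mean = sum(lower_cluster) / len(lower_cluster)
upper_mean = sum(upper_cluster) / len(upper_cluster)

# Rates in the same class as the overall mean (6.75 to 6.99).
near_mean = [r for r in rates if 6.75 <= r < 7.00]

print(f"Overall mean: {mean:.2f}")
print(f"Lower cluster: {len(lower_cluster)} values, mean {lower_mean:.2f}")
print(f"Upper cluster: {len(upper_cluster)} values, mean {upper_mean:.2f}")
print(f"In the same class as the mean: {len(near_mean)} of {len(rates)}")
```

The lower cluster has 23 values with a mean of about 6.42, the upper cluster has 5 values with a mean of about 8.45, and only 3 rates sit in the class that contains the overall mean.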

This argument can be developed in two ways.  The first is that there are additional ways of summarising data which can give a better understanding of a set of numbers.  These include the mode, median and quartiles and, where appropriate, measures of dispersion and shape such as standard deviation and skewness.
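
Several of these summaries are available in Python's standard statistics module.  The sketch below is one way to compute them for the sample data; it uses the sample standard deviation and the module's default quartile method, both of which are choices made for this example, and it needs Python 3.8 or later.

```python
import statistics

# Hourly rates transcribed from the frequency table.
rates = [
    5.60, 5.75, 5.80, 6.00, 6.10, 6.10, 6.20,
    6.25, 6.25, 6.30, 6.40, 6.45, 6.45,
    6.50, 6.60, 6.60, 6.70, 6.75, 6.75, 6.80,
    7.00, 7.10, 7.25, 8.10, 8.25, 8.30, 8.60, 9.00,
]

mean = statistics.mean(rates)
median = statistics.median(rates)          # middle of the sorted values
modes = statistics.multimode(rates)        # every value tied for most frequent
q1, q2, q3 = statistics.quantiles(rates, n=4)  # quartiles, default method
std_dev = statistics.stdev(rates)          # sample standard deviation

print(f"Mean: {mean:.2f}")
print(f"Median: {median:.2f}")
print(f"Modes: {modes}")
print(f"Quartiles: {q1:.4f}, {q2:.4f}, {q3:.4f}")
print(f"Standard deviation: {std_dev:.4f}")
```

For this data the median (6.55) is noticeably lower than the mean (6.78), which reflects the pull of the five highest rates, and there is no single mode because five different rates each occur twice.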

Secondly, it might be that the dataset is not appropriate to the analysis.  Whilst all the jobs have the title van driver, it could be that the five highest paid jobs are really something else, such as order taking (e.g. charming up the next order from the customer), technical support (e.g. delivering an item and showing the customer how to use it) or installation (e.g. installing the item on the customer's site).  Depending on the purpose of the analysis, this group of five should be removed from the sample, or treated separately.

Politicians and journalists are usually the greatest abusers of the average, simply because it's a single number.  They often have a limited knowledge of the underlying data and undeveloped numerical skills.

Spreadsheets

Spreadsheets are the tool used by most people to work with data.  In both Microsoft Excel and Google Docs, the average function looks like this:

=average(arg1,arg2,arg3,.....)

The arguments can be either ranges or valid numbers, e.g.

=average(A1:A10)

=average(A1:A10,B1:B10)

=average(A1:A10,7.5)

Empty cells are ignored, but cells containing zero are counted.
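
For readers working outside a spreadsheet, the behaviour can be imitated in Python.  The function below is a hypothetical stand-in written for this page, not part of any spreadsheet product: it accepts single numbers and lists (playing the role of ranges) and uses None to represent an empty cell.

```python
def average(*args):
    """Mean of numbers and ranges (lists or tuples), skipping empty cells (None)."""
    values = []
    for arg in args:
        # Treat a single number as a one-cell range.
        cells = arg if isinstance(arg, (list, tuple)) else [arg]
        values.extend(cell for cell in cells if cell is not None)
    if not values:
        raise ValueError("no numeric values to average")
    return sum(values) / len(values)


# Hypothetical range A1:A4 in which A2 is empty.
column_a = [6.00, None, 6.50, 7.00]

print(average(column_a))        # 6.5
print(average(column_a, 7.5))   # 6.75
print(average([0, None, 4]))    # 2.0 (the zero is counted, the empty cell is not)
```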

Page updated: 07-Nov-2007

 

For more information: info@brighton-webs.co.uk