Brighton Webs Ltd. statistical and data services for industry
Analysis of Variance - One Way

A common problem in analysis is determining whether blocks of data represent distinct groups or just a clumping together of items from a single population. Analysis of Variance (often reduced to ANOVA for those with a love of acronyms) helps by providing an answer to the question "is the variance between the blocks significantly greater than the variance within them?". Significantly here means too big to be attributed to chance.

Means and Sums of Squares

The technique is often presented in the form of a table whose columns are the sum of squares, the degrees of freedom and the estimated variance. There are M blocks of data containing a grand total of N values.
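For reference, the standard one-way ANOVA quantities can be written out as follows. This is a sketch in conventional textbook notation, not a copy of the page's own table: the symbols n_j (number of values in block j), the SS names and s² for the estimated variances are assumptions introduced here, while M, N and the column-then-row indexing x_{j,i} follow the text.

```latex
\begin{aligned}
\bar{x}_j &= \frac{1}{n_j}\sum_{i=1}^{n_j} x_{j,i},
\qquad
\bar{x} = \frac{1}{N}\sum_{j=1}^{M}\sum_{i=1}^{n_j} x_{j,i} \\
SS_{\text{between}} &= \sum_{j=1}^{M} n_j\,(\bar{x}_j - \bar{x})^2,
&\text{d.f.} &= M - 1 \\
SS_{\text{within}} &= \sum_{j=1}^{M}\sum_{i=1}^{n_j} (x_{j,i} - \bar{x}_j)^2,
&\text{d.f.} &= N - M \\
SS_{\text{total}} &= \sum_{j=1}^{M}\sum_{i=1}^{n_j} (x_{j,i} - \bar{x})^2
 = SS_{\text{between}} + SS_{\text{within}},
&\text{d.f.} &= N - 1 \\
s^2_{\text{between}} &= \frac{SS_{\text{between}}}{M - 1},
\qquad
s^2_{\text{within}} = \frac{SS_{\text{within}}}{N - M},
\qquad
F = \frac{s^2_{\text{between}}}{s^2_{\text{within}}}
\end{aligned}
```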
Once the table has been compiled, the variance ratio can be computed as the estimated variance between the blocks divided by the estimated variance within them.

A trivial data set consisting of three blocks, each containing three values, has been used to ease the description of the symbols and formulae in the table.
This is a spreadsheet-like structure where each column contains a block of data. In the formulae each item is referenced by its column and row number, thus x_{2,3} = 6, i.e. the value in column 2, row 3. Next, an explanation of terms, also in table form:
The description of the calculations is based on the type of array processing that programmers are familiar with. The same results can be obtained using the table structure of spreadsheets and their built-in functions such as count() and average(). These computations can now be used to populate the ANOVA table with numbers:
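The array-style calculation described above might be sketched in Python as follows. The function name `one_way_anova` and the dictionary keys are hypothetical, and the data values are illustrative only: they were chosen to agree with the single value quoted in the text (column 2, row 3 is 6) and are not the page's original table.

```python
from statistics import fmean


def one_way_anova(blocks):
    """Compute the entries of a one-way ANOVA table.

    `blocks` is a list of columns, each a list of numbers.
    Blocks may have different lengths.
    """
    m = len(blocks)                       # M: number of blocks
    n = sum(len(b) for b in blocks)       # N: grand total of values
    grand_mean = fmean([v for b in blocks for v in b])
    block_means = [fmean(b) for b in blocks]

    # Variation of the block means about the grand mean.
    ss_between = sum(
        len(b) * (mean - grand_mean) ** 2
        for b, mean in zip(blocks, block_means)
    )
    # Variation of each value about its own block mean.
    ss_within = sum(
        (v - mean) ** 2
        for b, mean in zip(blocks, block_means)
        for v in b
    )

    df_between, df_within = m - 1, n - m
    var_between = ss_between / df_between
    var_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between,
        "ss_within": ss_within, "df_within": df_within,
        "ss_total": ss_between + ss_within, "df_total": n - 1,
        "var_between": var_between, "var_within": var_within,
        "variance_ratio": var_between / var_within,
    }


# Illustrative data only (assumed, not taken from the page):
# three blocks of three values, with column 2, row 3 equal to 6.
example_blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
table = one_way_anova(example_blocks)
print(table["ss_between"], table["ss_within"], table["variance_ratio"])
# prints: 54.0 6.0 27.0
```

For these assumed values the degrees of freedom are 2 and 6, matching M - 1 and N - M for three blocks of three values.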
To determine whether this value is greater than would be expected if the variance between the blocks were due to chance alone, the variance ratio is compared with critical values derived from the F distribution. These can be obtained from tables or spreadsheet functions. For example, the MS Excel function FINV takes the significance level followed by the two degrees of freedom, which for the trivial data set are M - 1 = 2 and N - M = 6.
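Where no spreadsheet or table is to hand, a critical value can also be approximated directly. The sketch below uses only the Python standard library: it evaluates the F cumulative distribution through the regularized incomplete beta function (a textbook continued-fraction method) and inverts it by bisection. The function names `f_cdf` and `f_critical` are hypothetical, and this is one possible approach rather than the method used by any particular spreadsheet.

```python
import math


def _beta_cf(a, b, x, max_iter=200, eps=1e-12):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(x, df1, df2):
    """P(F <= x) for an F distribution with (df1, df2) degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return _reg_inc_beta(df1 / 2.0, df2 / 2.0, df1 * x / (df1 * x + df2))


def f_critical(alpha, df1, df2):
    """Value exceeded with probability alpha, found by bisection on f_cdf."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while f_cdf(hi, df1, df2) < target:   # widen the bracket
        hi *= 2.0
    for _ in range(100):                  # halve the bracket repeatedly
        mid = (lo + hi) / 2.0
        if f_cdf(mid, df1, df2) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# 5% level with 2 and 6 degrees of freedom (three blocks of three values).
print(round(f_critical(0.05, 2, 6), 2))   # prints: 5.14
```

A variance ratio larger than the value returned by `f_critical` is significant at the chosen level.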
In this case the critical value (5%) is 5.14, so we can say that the variance between the blocks is unlikely to be due to chance alone.

Real World Example

In an experiment designed to study the behavior of a small photovoltaic panel, the daily energy yield was recorded along with the sky conditions at midday. The sky conditions were obtained from METAR data harvested from NOAA. The table below is a fragment of the data (with some dodgy data points removed):
Pumping these numbers through a spreadsheet yields:
The critical value for F is 2.87. As the variance ratio is greater than this value, we can confidently say that the descriptions of cloud cover found in METAR reports are consistent with solar panel performance.

Spreadsheets

MS Excel has an ANOVA module which is quick and simple to use. It also has the FINV function, which allows critical values of F to be obtained. The Google Docs spreadsheet does not appear to have an explicit ANOVA function, although spreadsheets in general are a convenient way of dealing with ANOVA, and it does have a SumOfSquares function which simplifies some of the number crunching.
Page updated: 18-Feb-2008