Wilcoxon Two Independent Samples Test
{Adapted from the Institute of Phonetic Sciences (IFA): http://www.fon.hum.uva.nl/}

    Note: this test is identical to the Mann-Whitney U test for two independent samples.

(EDL 7150 Class NOTE: Re-do the exercise with the Mann-Whitney U test.)

Characteristics: A most useful test to see whether the values in two samples differ in size. It resembles the Median-Test in scope, but it is much more sensitive. In fact, for large samples it is almost as sensitive as the Two Sample Student t-test. For small samples with unknown distributions this test is even more sensitive than the Student t-test.

Since we rarely know that values are Normally distributed, this test may be preferred over the Student t-test.

H0: The populations from which the two samples are taken have identical median values. To be complete, the two populations have identical distributions.

Assumptions: None, really.

Scale: Ordinal.

Procedure: Rank order all N = m + n values from both samples (of sizes m and n) combined. Sum the ranks of the values that belong to the smaller sample (Wsmallest). This value is used to determine the level of significance.
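
The procedure can be sketched in Python as follows. This is only an illustration, not the code behind the original page: the function name rank_sum_smaller is made up here, and giving tied values the average of the ranks they occupy is an assumption, since the text above does not say how ties are handled.

```python
def rank_sum_smaller(sample_a, sample_b):
    """Return (W, m, n): rank sum of the smaller sample and both sample sizes.

    Hypothetical helper. Tied values receive the average of the ranks they
    span (an assumption; the source text does not discuss ties).
    """
    # Pick the sample with fewer values; if sizes are equal, keep sample_a.
    if len(sample_a) <= len(sample_b):
        small, large = list(sample_a), list(sample_b)
    else:
        small, large = list(sample_b), list(sample_a)

    pooled = sorted(small + large)

    # Map each distinct value to its (average) rank in the pooled data.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; their mean is (i+1+j)/2.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j

    w = sum(rank_of[x] for x in small)
    return w, len(small), len(large)
```

For example, rank_sum_smaller([1, 2, 3], [4, 5, 6, 7]) returns (6.0, 3, 4): the three smallest values all sit in the smaller sample, so their ranks 1, 2 and 3 add up to 6.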

Level of Significance: Look up the level of significance in a table using Wsmallest, m and n.
The exact level of significance is found by enumerating all possible ways of assigning the ranks to the two samples. This is computationally demanding if n and m are larger than 7.
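
A minimal sketch of the exact calculation, assuming there are no ties so that the ranks are simply 1..N. The function name exact_tail_probabilities is hypothetical, and returning both tail probabilities is a choice made here; the text does not say whether a one-sided or two-sided level is tabulated.

```python
from itertools import combinations
from math import comb


def exact_tail_probabilities(w, m, n):
    """Return (P(W <= w), P(W >= w)) under H0, assuming ranks 1..m+n, no ties.

    Hypothetical helper: every choice of m ranks out of m+n is equally
    likely under H0, so the tails are found by counting rank sums.
    """
    total = comb(m + n, m)  # number of equally likely rank assignments
    lower = 0
    upper = 0
    for subset in combinations(range(1, m + n + 1), m):
        s = sum(subset)
        if s <= w:
            lower += 1
        if s >= w:
            upper += 1
    return lower / total, upper / total
```

With m = 3, n = 4 and W = 6, only the ranks {1, 2, 3} give a sum that small, so the lower tail is 1 out of 35 possible assignments.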

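Approximation: If m>10 and n>10,

Z = ( Wsmallest ± 0.5 - m * ( m + n + 1 ) / 2 ) / sqrt( m * n * ( m + n + 1 ) / 12 )
is approximately Normal distributed, where m is the size of the smallest sample, so that m * ( m + n + 1 ) / 2 is the expected value of Wsmallest under H0.

(Use Wsmallest - 0.5 if Wsmallest > m*(m+n+1)/2, else use Wsmallest + 0.5)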
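
The approximation might be coded as below. Two things here are assumptions rather than statements from the original page: the half-unit correction is applied so that it moves Wsmallest toward its expected value m*(m+n+1)/2, and the probability returned is two-sided. The function name normal_approximation is hypothetical.

```python
from math import erfc, sqrt


def normal_approximation(w, m, n):
    """Return (Z, two-sided p) for rank sum w of the sample of size m.

    Hypothetical helper. Assumes the 0.5 correction moves w toward its
    expected value and that a two-sided probability is wanted.
    """
    mean = m * (m + n + 1) / 2            # expected rank sum under H0
    sd = sqrt(m * n * (m + n + 1) / 12)   # standard deviation under H0
    corrected = w - 0.5 if w > mean else w + 0.5
    z = (corrected - mean) / sd
    # Two-sided tail area of the standard Normal: 2 * (1 - Phi(|z|)).
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided
```

For instance, with m = 11, n = 12 and Wsmallest = 100, the expected value is 132, so the corrected numerator is 100.5 - 132 = -31.5 and Z is about -1.94.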

Remarks: In this example, exact probabilities are calculated when m <= 10 or n <= 10. If both are larger than 7, this can take more time than is available within this system (the number of calculations grows as N!/(m!*n!), with N! = N*(N-1)*(N-2)*...*1). Therefore, if the calculations are expected to take too much time, the Normal approximation is used instead. The resulting values are then unreliable, which is indicated with a *. In that case you are advised to check the level of significance in a table.
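
To get a feel for how quickly N!/(m!*n!) grows, the count can be evaluated directly. The function name arrangements is hypothetical and only serves this illustration.

```python
from math import factorial


def arrangements(m, n):
    """Number of ways to assign m of the m+n ranks to one sample: N!/(m!*n!)."""
    return factorial(m + n) // (factorial(m) * factorial(n))


# Print the count for a few equal sample sizes.
for size in (5, 7, 10):
    print(size, arrangements(size, size))
```

For m = n = 7 there are 3432 rank assignments to examine; for m = n = 10 there are already 184756.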

    Note: the symbol ! denotes factorial. Hence, N! is read, "N factorial."
    If N is 5, say, then N! is

5 × 4 × 3 × 2 × 1 = 120.

You could also try programming the test in Excel.


For m > 10 and n > 10 the Normal approximation is used.