So far all of our tests have been based on an assumed underlying distribution, for which we have been testing the parameter, e.g. the mean or the variance.
If we don’t have an assumed underlying distribution, we can use non-parametric tests. Typically with such tests the measure of location we are interested in is the median.
Below are summarised the types of non-parametric tests we will learn. Which we choose depends on whether we have a single sample or two samples and various conditions regarding the symmetry of the datasets and relationship between datasets. Note that all of the tests assume independent underlying data.
| Type of Test | Test | Assumptions |
| Single sample | SS1. Sign test | Underlying data are continuous |
| Single sample | SS2. Wilcoxon signed-rank test | Underlying data are continuous; underlying data are symmetric |
| Two samples | TS1. Paired sign test | Data are in matched pairs; differences between matched pairs are continuous |
| Two samples | TS2. Wilcoxon matched-pairs signed-rank test | Data are in matched pairs; differences between matched pairs are continuous; differences between matched pairs are symmetric |
| Two samples | TS3. Wilcoxon rank-sum test | Two samples are independent; underlying data are symmetric |
Single Sample Test 1: Sign Test
The single-sample sign test lets us identify whether the median of a set of data differs from a stated median. The median is not treated as a parameter, as we do not know the underlying distribution.
To perform the test, all values greater than the stated median are replaced with a “+” and all values less than the stated median are replaced with a “–”. If the data are evenly distributed about the median, we would expect a similar number of + and – signs.
Effectively this is a special case of the binomial test. For n data points, we are using X ~ Bi(n, 0.5), where the test statistic is the number of + signs. We calculate the probability that X is at or above the test statistic, at or below the test statistic, or, for a 2-tailed test, twice the smaller of these probabilities.
Worked Example. Single-sample sign test
The following dataset is believed to come from a population with median 135:
| 150 | 130 | 125 | 140 | 170 |
| 140 | 190 | 180 | 175 | 165 |
| 160 | 130 | 140 | 140 | 145 |
Perform a single-sample sign test at the 5% significance level to test this claim.
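The sign-counting and binomial steps described above can be sketched in Python. This is a minimal illustration rather than a model solution: the function name sign_test is hypothetical, and dropping any values equal to the stated median is a common convention assumed here, not something stated in these notes.

```python
from math import comb

def sign_test(data, median0):
    """Exact two-tailed sign test of H0: population median == median0."""
    plus = sum(1 for x in data if x > median0)
    minus = sum(1 for x in data if x < median0)
    n = plus + minus          # assumption: values equal to median0 are dropped
    s = min(plus, minus)
    # P(X <= s) for X ~ Bi(n, 0.5)
    lower_tail = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    p_value = min(1.0, 2 * lower_tail)   # double the tail for a 2-tailed test
    return plus, minus, p_value

data = [150, 130, 125, 140, 170,
        140, 190, 180, 175, 165,
        160, 130, 140, 140, 145]
plus, minus, p_value = sign_test(data, 135)
print(plus, minus, round(p_value, 4))
```

For the dataset above this gives 12 + signs, 3 – signs and a two-tailed probability of about 0.035, which would then be compared with the 5% significance level.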
If n is large, we can also use the normal distribution with the sign test, as follows:
Let S = min(number of + signs, number of – signs). Then E(S) = n/2 and Var(S) = n/4.
For large n (n must be > 10), we can use the normal approximation to the binomial with p = 0.5, so that S ~ N(n/2 , n/4) approximately. We must also use a continuity correction (as we are using a continuous distribution to approximate a discrete distribution), so our x-value is S + 0.5 and the corresponding z-value is z = (S + 0.5 – n/2) / √(n/4).
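As a rough sketch of the normal approximation to the sign test, the Python below standardises S = min(number of + signs, number of – signs) using mean n/2, variance n/4 and a continuity correction of 0.5. The function name sign_test_normal is hypothetical, and the counts passed in (12 and 3) are taken from the worked example dataset above purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_normal(plus, minus):
    """Normal approximation to the sign test with a continuity correction."""
    n = plus + minus
    s = min(plus, minus)
    # S is the smaller count, so we look at the lower tail and add 0.5
    z = (s + 0.5 - n / 2) / sqrt(n / 4)
    p_two_tailed = min(1.0, 2 * NormalDist().cdf(z))
    return z, p_two_tailed

z, p = sign_test_normal(plus=12, minus=3)
print(round(z, 3), round(p, 3))
```

The approximate probability can be compared with the exact binomial value to see how close the approximation is for a given n.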
Exercise 1


Answers to Exercise 1


Worked Solutions to Exercise 1
Single-sample Wilcoxon signed-rank Test
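If the underlying data is known to be symmetric, the Wilcoxon signed-rank test is the better test to use, because it takes account of how far each data point is from the stated median (through its rank) as well as which side of the median it lies on.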
We rank each data point based on how far it is from the stated population median. The test statistic, T, is the smaller value of the sum of the negative ranks, N, and the sum of the positive ranks, P, i.e. T = min(P,N).
Although the data is continuous, the sum of ranks is discrete, and so the distribution of T is discrete. We should also note that P will be between 0 and n(n+1)/2 (because the ranks go from 1 up to n, and these sum to n(n+1)/2). The closer the test statistic is to zero, the more one-sided the data is about the stated median, and so the stronger the evidence against the null hypothesis.
Worked Example. Single-sample Wilcoxon signed-rank Test
The weights, in kilograms, of ten randomly selected mackerel are as follows: 1.6, 1.1, 2.1, 2.4, 2.2, 2.9, 2.6, 2.3, 2.7 and 1.9.
Test at the 5% significance level whether the median weight is greater than 1.8kg.
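The ranking procedure can be sketched in Python as follows. This is an illustrative sketch only: the helper names signed_rank_sums and lower_tail_prob are hypothetical, zero differences are dropped and tied distances share an average rank (conventions assumed here), and the exact probability routine assumes there are no ties so that the ranks are the integers 1 to n.

```python
def signed_rank_sums(data, median0):
    """Return (P, N): rank sums of the positive and negative differences."""
    # Round to suppress floating-point noise; zero differences are dropped.
    diffs = [round(x - median0, 9) for x in data]
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    pos = neg = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2      # mean of ranks i+1 .. j (tied group)
        for d in ordered[i:j]:
            if d > 0:
                pos += avg_rank
            else:
                neg += avg_rank
        i = j
    return pos, neg

def lower_tail_prob(t, n):
    """P(rank sum <= t) under H0, each rank 1..n being + or - with prob 1/2."""
    counts = [1] + [0] * (n * (n + 1) // 2)   # counts[s] = subsets summing to s
    for r in range(1, n + 1):
        for s in range(len(counts) - 1, r - 1, -1):
            counts[s] += counts[s - r]
    return sum(counts[: int(t) + 1]) / 2 ** n

weights = [1.6, 1.1, 2.1, 2.4, 2.2, 2.9, 2.6, 2.3, 2.7, 1.9]
P, N = signed_rank_sums(weights, 1.8)
# One-tailed test of "median > 1.8": a small N is evidence for H1
p_value = lower_tail_prob(N, len(weights))
print(P, N, round(p_value, 4))
```

For the mackerel data this gives P = 46 and N = 9, with a one-tailed probability of 33/1024 (about 0.032) to compare with the 5% level. In an exam setting the same comparison would normally be made against a table of critical values.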
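For large n, we can also approximate the distribution of the Wilcoxon signed-rank statistic with a normal distribution. Given the statistic T = min(P, N), E(T) = n(n+1)/4 and Var(T) = n(n+1)(2n+1)/24,
and for large n,
T ~ N(n(n+1)/4 , n(n+1)(2n+1)/24).
We use a continuity correction, as we are approximating a discrete distribution with a continuous distribution. Our z-value is:
z = (T + 0.5 – n(n+1)/4) / √(n(n+1)(2n+1)/24)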
Worked Example. Single-sample Wilcoxon signed-rank Test (Normal Approximation)
In a clinical trial, the survival times, in weeks, of 19 patients with non-Hodgkin’s lymphoma are as detailed below:
| 37 | 54 | 73 | 89 | 94 | 110 | 112 | 123 | 129 | 132 |
| 148 | 151 | 173 | 189 | 201 | 204 | 213 | 276 | 281 |
Test at the 5% significance level whether the median differs from 150.
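A possible way to carry out the normal approximation for this example in Python is sketched below, using the standard results E(T) = n(n+1)/4 and Var(T) = n(n+1)(2n+1)/24 with a continuity correction of 0.5. The function name signed_rank_normal is hypothetical, and the sketch assumes no data point equals the stated median and no two distances from it are tied, which holds for this dataset.

```python
from math import sqrt
from statistics import NormalDist

def signed_rank_normal(data, median0):
    """Wilcoxon signed-rank statistic with a two-tailed normal approximation."""
    # Assumption: no zero differences and no tied absolute differences
    diffs = sorted((x - median0 for x in data), key=abs)
    n = len(diffs)
    pos = sum(rank for rank, d in enumerate(diffs, start=1) if d > 0)
    neg = n * (n + 1) // 2 - pos
    t = min(pos, neg)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    z = (t + 0.5 - mean) / sqrt(var)       # continuity correction of 0.5
    return t, z, min(1.0, 2 * NormalDist().cdf(z))

survival = [37, 54, 73, 89, 94, 110, 112, 123, 129, 132,
            148, 151, 173, 189, 201, 204, 213, 276, 281]
T, z, p = signed_rank_normal(survival, 150)
print(T, round(z, 3), round(p, 3))
```

Here T = 86 and z is roughly –0.34, so the two-tailed probability is far above 0.05.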
Exercise 2



Exercise 2. Answers


Two Sample Tests
The paired-sample sign test extends the idea of the sign test by looking for a positive or negative difference. In other respects it works the same as the sign test.
Worked Example – Paired-sample sign test
Data is collected on the time, in seconds, that it takes nine children to tie up their left shoelace and their right shoelace:
| Child | Left | Right |
| A | 42 | 45 |
| B | 38 | 36 |
| C | 51 | 52 |
| D | 42 | 39 |
| E | 31 | 35 |
| F | 48 | 49 |
| G | 61 | 62 |
| H | 38 | 39 |
| I | 44 | 45 |
Test at the 10% significance level whether there is a difference in the time it takes for the children to tie each shoelace.
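The paired version can be sketched in the same way as the single-sample sign test, working on the differences within each pair. The function name paired_sign_test is hypothetical, and discarding pairs with a zero difference is an assumed convention (there are none in this dataset).

```python
from math import comb

def paired_sign_test(first, second):
    """Exact two-tailed sign test on the paired differences first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    plus = sum(1 for d in diffs if d > 0)
    minus = sum(1 for d in diffs if d < 0)
    n = plus + minus                      # assumption: zero differences dropped
    s = min(plus, minus)
    lower_tail = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * lower_tail)

left = [42, 38, 51, 42, 31, 48, 61, 38, 44]
right = [45, 36, 52, 39, 35, 49, 62, 39, 45]
plus, minus, p_value = paired_sign_test(left, right)
print(plus, minus, round(p_value, 4))
```

With left minus right as the difference, there are 2 + signs and 7 – signs, and the two-tailed probability is 92/512 (about 0.18), to be compared with the 10% level.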
Exercise 3



Answers to Exercise 3


Worked Solutions to Exercise 3
Wilcoxon Matched-pairs signed-rank test
If we can assume that the differences in pairs of data are symmetric, then we can use the Wilcoxon matched-pairs signed-rank test, following the same procedure as for the single-sample Wilcoxon signed-rank test, testing to see whether the paired-difference median is zero.
Worked Example: Wilcoxon matched-pairs signed-rank test
An investigation is carried out into the effectiveness of two types of post-operative pain relief drug: Drug 1 and Drug 2. Seven adults agree to take Drug 1 on one day, and Drug 2 on the second. The time, in hours, of pain relief is recorded.
| Adult | Drug 1 | Drug 2 |
| A | 4.1 | 3.9 |
| B | 3.2 | 3.3 |
| C | 5.3 | 5.0 |
| D | 5.1 | 4.6 |
| E | 4.2 | 4.6 |
| F | 3.8 | 3.2 |
| G | 3.6 | 4.3 |
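The rank sums for the table above can be computed with a short Python sketch. The function name matched_pairs_signed_rank is hypothetical, and the sketch assumes no zero differences and no tied absolute differences, which is the case for this data. It only computes the statistic, since the hypotheses and significance level depend on the question being asked.

```python
def matched_pairs_signed_rank(first, second):
    """Return (P, N, T) for the paired differences first - second."""
    # Round to suppress floating-point noise in the decimal differences
    diffs = [round(a - b, 9) for a, b in zip(first, second)]
    # Assumption: no tied absolute differences, so ranks are simply 1..n
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    pos = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    neg = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
    return pos, neg, min(pos, neg)

drug1 = [4.1, 3.2, 5.3, 5.1, 4.2, 3.8, 3.6]
drug2 = [3.9, 3.3, 5.0, 4.6, 4.6, 3.2, 4.3]
P, N, T = matched_pairs_signed_rank(drug1, drug2)
print(P, N, T)
```

Taking Drug 1 minus Drug 2 as the difference, this gives P = 16, N = 12 and so T = 12, which would then be compared with the critical value for n = 7.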
Exercise 4


Answers – Exercise 4


Wilcoxon rank-sum test
We can only use matched-pairs testing if the data is in groups of equal size and can be paired. If we have two independent datasets of different sizes and want to test for a difference between their medians we can use the Wilcoxon rank-sum test, which has a similar design to the independent t-test.
First we rank all of the data as if it were from a single population. We then take separate sums of ranks for each group. The sum of the sample with m items of data is Rm and the sum of the sample with n items of data is Rn (where m ≤ n). The test statistic (a little tricky to memorise) that we use is W = min (Rm , m(n+m+1) – Rm)
Worked Example: Wilcoxon rank-sum test
Researchers are investigating the effect of vitamin B12 on the size of the brain. A sample of males aged between 25 and 40 years is selected. Nine of them are known to have low B12 levels and seven are known to have high B12 levels. After a brain scan, the ratio of brain volume to skull capacity is recorded.
| Low B12 levels | 0.795 | 0.798 | 0.802 | 0.805 | 0.806 | 0.807 | 0.808 | 0.81 | 0.812 |
| High B12 levels | 0.786 | 0.789 | 0.792 | 0.796 | 0.799 | 0.8 | 0.803 |
Carry out a Wilcoxon rank-sum test, at the 5% significance level, to see whether the level of vitamin B12 affects the size of the brain.
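The pooled ranking and the statistic W = min(Rm , m(n+m+1) – Rm) can be sketched in Python as below. The names average_ranks and rank_sum_test are hypothetical. As an assumption beyond these notes, the sketch also computes an exact two-tailed probability by enumerating every way of choosing m of the pooled ranks, which is feasible only for small samples.

```python
from itertools import combinations

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in order[i:j]:
            ranks[k] = (i + 1 + j) / 2     # mean of ranks i+1 .. j
        i = j
    return ranks

def rank_sum_test(small, large):
    """Return (Rm, W, two-tailed exact p) where small has m <= n items."""
    m, n = len(small), len(large)
    ranks = average_ranks(list(small) + list(large))
    r_m = sum(ranks[:m])
    w = min(r_m, m * (n + m + 1) - r_m)
    # Under H0 every choice of m ranks for the smaller sample is equally likely
    sums = [sum(c) for c in combinations(ranks, m)]
    lower = sum(1 for s in sums if s <= w) / len(sums)
    return r_m, w, min(1.0, 2 * lower)

low_b12 = [0.795, 0.798, 0.802, 0.805, 0.806, 0.807, 0.808, 0.81, 0.812]
high_b12 = [0.786, 0.789, 0.792, 0.796, 0.799, 0.8, 0.803]
R_m, W, p_value = rank_sum_test(high_b12, low_b12)
print(R_m, W, p_value)
```

The high B12 group is the smaller sample (m = 7), and its rank sum is Rm = 36, so W = min(36, 119 – 36) = 36.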
Normal approximation
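If m and n are large (both ≥ 10), W can be approximated by a normal distribution, with E(W) = m(m+n+1)/2 and Var(W) = mn(m+n+1)/12, i.e. W ~ N(m(m+n+1)/2 , mn(m+n+1)/12). We also need a continuity correction, so our z-value is
z = (W + 0.5 – m(m+n+1)/2) / √(mn(m+n+1)/12)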
Worked Example: Wilcoxon rank-sum test – Normal Approximation
A company is investigating a new production technique to improve the quality of camera lenses for a phone. Samples of the lenses are given to a camera expert who is asked to rank the lenses, with rank 1 being the highest quality. The expert does not know which production technique has been used.
| Lens | A | B | C | D | E | F | G | H | I | J | K | L |
| Method | old | new | new | old | old | new | old | new | old | old | old | new |
| Rank | 12 | 1 | 2 | 9 | 10 | 5 | 21 | 6 | 20 | 22 | 23 | 17 |
| Lens | M | N | O | P | Q | R | S | T | U | V | W | X |
| Method | new | new | old | old | old | new | old | new | old | new | new | old |
| Rank | 14 | 13 | 3 | 4 | 19 | 11 | 24 | 16 | 18 | 8 | 7 | 15 |
Using a suitable approximation, test at the 5% significance level whether there is a difference in the quality of the production techniques.
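One way to set up the normal approximation for this example in Python is sketched below, using the standard results E(W) = m(m+n+1)/2 and Var(W) = mn(m+n+1)/12 with a continuity correction of 0.5. The function name rank_sum_normal is hypothetical. The ranks are taken directly from the table, so no pooled ranking step is needed.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_normal(ranks_small, ranks_large):
    """Normal approximation to the rank-sum test; ranks_small has m <= n items."""
    m, n = len(ranks_small), len(ranks_large)
    r_m = sum(ranks_small)
    w = min(r_m, m * (n + m + 1) - r_m)
    mean = m * (m + n + 1) / 2
    var = m * n * (m + n + 1) / 12
    z = (w + 0.5 - mean) / sqrt(var)       # continuity correction of 0.5
    return w, z, min(1.0, 2 * NormalDist().cdf(z))

methods = ("old new new old old new old new old old old new "
           "new new old old old new old new old new new old").split()
ranks = [12, 1, 2, 9, 10, 5, 21, 6, 20, 22, 23, 17,
         14, 13, 3, 4, 19, 11, 24, 16, 18, 8, 7, 15]
new = [r for r, meth in zip(ranks, methods) if meth == "new"]
old = [r for r, meth in zip(ranks, methods) if meth == "old"]
W, z, p = rank_sum_normal(new, old)        # new method is the smaller sample
print(W, round(z, 3), round(p, 3))
```

There are m = 11 lenses from the new method and n = 13 from the old, giving Rm = 100, W = 100 and z of roughly –2.14, so the two-tailed probability is about 0.03.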
Exercise 5



Answers to Exercise 5


End of “Non-Parametric Tests” Chapter Mixed Exercises

Worked Solutions to Exercise 5
Answers to End of “Non-Parametric Tests” Chapter Mixed Exercises

