6.4.1. Unpaired Samples
Data in one of the three types supported for Two Sample Tests can be used for these tests. Missing values are omitted by case.
Moses Extreme Reaction Test and Two Sample Median Test each have a further dialogue, which can be accessed by clicking on the [Opt] button situated to the left of the corresponding check box. If [Finish] is clicked before [Opt], the program uses its default values without displaying these further dialogues.
6.4.1.1. Mann-Whitney U Test
This test is used to determine whether two independent random samples have been drawn from the same population. The null hypothesis tested is that “the population relative frequency distributions are identical” against the alternative hypothesis that “they are different” (two-tailed test).
The output includes the number of cases, rank sums, mean ranks, and U scores for the two samples as well as the test statistic, correction for ties and the asymptotic (normal and t) and exact two-tailed probability values, with and without continuity correction.
The test statistic U for sample 1 is obtained by summing the number of times cases in sample 1 are smaller than cases in sample 2. U for sample 2 is found similarly. The smaller U value is chosen as the test statistic. A small or large U value indicates that the two samples are not similarly distributed. U values can also be calculated as:
U1 = n1n2 + n1(n1+1)/2 – R1
U2 = n1n2 + n2(n2+1)/2 – R2
where R1 and R2 are the sum of ranks for groups 1 and 2 respectively.
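The two formulas above can be checked with a short sketch. The function name u_from_rank_sums is only illustrative; the sample sizes and rank sums are taken from the Example 1 and Example 2 outputs of this section.

```python
def u_from_rank_sums(n1, n2, r1, r2):
    """Return (U1, U2) computed from the sample sizes and the rank sums."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


# Example 2: Males (n1 = 7, R1 = 60) and Females (n2 = 5, R2 = 18).
u1, u2 = u_from_rank_sums(7, 5, 60.0, 18.0)
print(u1, u2)

# The two U values always add up to n1 * n2.
print(u1 + u2 == 7 * 5)
```

The smaller of the two values returned is the one reported as the test statistic.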
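The program will compute and display a Z statistic which is corrected for ties and with no continuity correction as:
Z = |U – μ_U| / σ_U
where the mean of the Mann-Whitney distribution is given as:
μ_U = n1n2 / 2
and its standard deviation as:
σ_U = sqrt{ [n1n2 / (n(n – 1))] [(n^{3} – n)/12 – T] }
where:
n = n1 + n2
U = n1n2 + n1(n1+1)/2 – Σ w_{i1}
T is the correction for ties reported in the output and w_{i1} is the rank of the i^{th} case belonging to group 1, supposing that group 1 has the smaller U.
The Z statistic with continuity correction is:
Z = (|U – μ_U| – 0.5) / σ_U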
One- and two-tailed probabilities from normal and t-distributions (with n – 1 degrees of freedom) are displayed for the Z-statistic without and with continuity correction.
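As a rough check of the normal approximation, the sketch below assumes the usual tie-corrected variance, n1n2 / (n(n – 1)) × ((n^{3} – n)/12 – T), and treats the “Correction for Ties” value in the output as T. Both are assumptions made for illustration; the function name mann_whitney_z is hypothetical and the absolute value of Z is returned.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_z(u, n1, n2, ties=0.0, continuity=False):
    """Absolute Z for U under the normal approximation.

    Assumes variance = n1*n2 / (n*(n-1)) * ((n**3 - n)/12 - ties),
    where `ties` is taken to be the 'Correction for Ties' output value.
    """
    n = n1 + n2
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 / (n * (n - 1)) * ((n**3 - n) / 12 - ties))
    diff = abs(u - mean)
    if continuity:
        diff -= 0.5  # continuity correction
    return diff / sd


# Values from the Example 1 output: U = 330, n1 = n2 = 32, ties = 410.5.
z = mann_whitney_z(330, 32, 32, ties=410.5)
z_cc = mann_whitney_z(330, 32, 32, ties=410.5, continuity=True)
p_one = 1 - NormalDist().cdf(z)  # one-tailed normal probability
print(round(z, 4), round(z_cc, 4), round(p_one, 4), round(2 * p_one, 4))
```

Under these assumptions the printed values agree with the Asymptotic Normal rows of Example 1 to four decimal places, and the same function with ties=0 gives the Example 2 statistics.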
The following alternative definition of the standard deviation (given by Armitage & Berry (2002) p. 276 and Gardner & Altman (2000) p. 40) is not used here as it does not take ties into consideration:
σ_U = sqrt[ n1n2(n + 1) / 12 ]
An exact p-value is also computed which is accurate for data sets with or without ties. By default, it is reported for n ≤ 150, though this limit can be changed by the user. To do this, the following line should be entered and edited in the [Options] section of the Documents\Unistat10\Unistat10.ini file:
WMWMaxExactSize=150
This limit can be extended if there are no ties in the data. However, if ties exist, the exact p-value for n > 150 may take a long time to compute.
It is also possible to save the complete exact one-tailed cumulative probability distribution of the test statistic in its rank sum form by including the following line in the [Options] section of Unistat10.ini:
WMWSaveDist=1
By default, the distribution will be saved to the following file:
Documents\Unistat10\WMWExactDist.txt
This file name can be changed by entering and editing the following line in the [Options] section of Unistat10.ini:
WMWSaveDistFile=..\Documents\Unistat10\WMWExactDist.txt
Example 1
Example 10.3 on p. 279 from Armitage & Berry (2002). An estimate of the median difference is required. A comparison of 32 inpatients and 32 outpatients is made.
Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Inpatients (C16) and Outpatients (C17) as [Variable]s and check only the Mann-Whitney U Test output option to obtain the following results:
Unpaired Samples
Mann-Whitney U Test

                     Cases    Rank Sum   Mean Rank           U
Inpatients              32    858.0000     26.8125    694.0000
Outpatients             32   1222.0000     38.1875    330.0000
Total                   64   2080.0000     32.5000

Correction for Ties = 410.5000

                                    U   Test Statistic   1-Tail Probability   2-Tail Probability
Asymptotic Normal            330.0000           2.4670               0.0068               0.0136
Asymptotic Normal with CC                       2.4603               0.0069               0.0139
Asymptotic t                                    2.4670               0.0082               0.0164
Asymptotic t with CC                            2.4603               0.0083               0.0166
Exact                                                                0.0065               0.0131
It is concluded that the medians of the two samples are significantly different. A t-test cannot detect a significant difference between the two sample means. This example shows the power of the Mann-Whitney U Test when the assumption of normality fails.
Example 2
Example 8.11 on p. 164 from Zar, J. H. (2010). The null hypothesis “there is no difference between the heights of male and female students” is tested.
Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Males (C18) and Females (C19) as [Variable]s and check only the Mann-Whitney U Test output option to obtain the following results:
Unpaired Samples
Mann-Whitney U Test

                     Cases    Rank Sum   Mean Rank           U
Males                    7     60.0000      8.5714      3.0000
Females                  5     18.0000      3.6000     32.0000
Total                   12     78.0000      6.5000

Correction for Ties = 0.0000

                                    U   Test Statistic   1-Tail Probability   2-Tail Probability
Asymptotic Normal              3.0000           2.3548               0.0093               0.0185
Asymptotic Normal with CC                       2.2736               0.0115               0.0230
Asymptotic t                                    2.3548               0.0191               0.0382
Asymptotic t with CC                            2.2736               0.0220               0.0440
Exact                                                                0.0088               0.0177
Zar reports the exact two-tailed probability as 0.018 and since this is less than 0.05, we reject the null hypothesis.
If the WMWSaveDist=1 line is included in the [Options] section of the Documents\Unistat10\Unistat10.ini file, the exact one-tailed cumulative distribution of the rank sum is saved to the WMWExactDist.txt file as follows:
Rank Sum   One Tail Probability    Rank Sum   One Tail Probability
      28   1.26262626262626E-03          46   0.561868686868687
      29   2.52525252525253E-03          47   0.622474747474747
      30   5.05050505050505E-03          48   0.680555555555555
      31   8.83838383838384E-03          49   0.734848484848485
      32   1.51515151515152E-02          50   0.784090909090909
      33   0.023989898989899             51   0.828282828282828
      34   3.66161616161616E-02          52   0.866161616161616
      35   0.053030303030303             53   0.898989898989899
      36   7.44949494949495E-02          54   0.92550505050505
      37   0.101010101010101             55   0.946969696969697
      38   0.133838383838384             56   0.963383838383838
      39   0.171717171717172             57   0.976010101010101
      40   0.215909090909091             58   0.984848484848485
      41   0.265151515151515             59   0.991161616161616
      42   0.319444444444444             60   0.994949494949495
      43   0.377525252525252             61   0.997474747474747
      44   0.438131313131313             62   0.998737373737374
      45   0.5                           63   1
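For small samples without ties, a cumulative distribution of this kind can be reproduced by brute-force enumeration. The sketch below is not the program's own algorithm: it simply lists every way of assigning n1 of the n ranks to sample 1, which is only practical for small n. The function name rank_sum_cdf is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_cdf(n1, n2):
    """Exact P(rank sum of sample 1 <= r) under H0, assuming no ties."""
    n = n1 + n2
    # Count how many rank assignments produce each possible rank sum.
    counts = Counter(sum(c) for c in combinations(range(1, n + 1), n1))
    total = sum(counts.values())
    cdf, running = {}, 0
    for r in sorted(counts):
        running += counts[r]
        cdf[r] = Fraction(running, total)
    return cdf


cdf = rank_sum_cdf(7, 5)  # sample sizes of Example 2
for r in (28, 31, 45, 63):
    print(r, float(cdf[r]))

# Upper-tail probability of the observed rank sum 60 (Males) and its double.
p_upper = 1 - cdf[59]
print(float(p_upper), float(2 * p_upper))
```

The upper-tail value is 7/792, which rounds to the exact 1-tail probability of 0.0088 shown in the Example 2 output.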
6.4.1.2. Hodges-Lehmann Estimator (Unpaired)
If the product of the two sample sizes does not exceed 2 x 10^{9}, then an estimate of the difference between the two sample medians and its confidence interval are computed.
First, all n1 x n2 differences between each pair of numbers from the two samples are sorted in increasing order. Then, the median (the Hodges-Lehmann estimator or the shift parameter) is found.
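The procedure can be sketched in a few lines. The data below are hypothetical and serve only to show the mechanics; hodges_lehmann_shift is an illustrative name.

```python
from statistics import median


def hodges_lehmann_shift(sample1, sample2):
    """Median of all n1 x n2 pairwise differences (sample1 minus sample2)."""
    diffs = sorted(x - y for x in sample1 for y in sample2)
    return median(diffs), diffs


# Hypothetical data, for illustration only.
high = [10, 12, 15]
low = [7, 9]
estimate, diffs = hodges_lehmann_shift(high, low)
print(diffs)     # [1, 3, 3, 5, 6, 8]
print(estimate)  # 4.0
```

With an even number of differences the median is the average of the two middle values, here (3 + 5) / 2.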
The output includes a table where the minimum, maximum, mean and standard deviation of the rank sum are displayed. The mean of the rank sum is different from the mean of the Mann-Whitney statistic, whereas their standard deviations are the same.
The limits of the asymptotic confidence interval are the K^{th} smallest and the K^{th} largest difference:
K = μ_U – z_{α/2} σ_U
where z_{α/2} is the upper α/2 critical value of the standard normal distribution, K is rounded up to the nearest integer and the mean and standard deviation of the Mann-Whitney statistic are as given in the previous section.
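The rank sum summary and K can be checked against the Example 1 output of this section. The sketch assumes that K is the mean of the Mann-Whitney statistic minus the normal critical value times the standard deviation, rounded up. The sample sizes 12 and 7 are inferred from the reported minimum and maximum rank sums, and the standard deviation 11.8270 is copied from the output. All function names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def rank_sum_summary(n1, n2):
    """Minimum, maximum and mean of the rank sum of sample 1."""
    n = n1 + n2
    minimum = n1 * (n1 + 1) / 2      # sample 1 holds the n1 lowest ranks
    maximum = n1 * (n + n2 + 1) / 2  # sample 1 holds the n1 highest ranks
    mean = n1 * (n + 1) / 2
    return minimum, maximum, mean


def asymptotic_k(n1, n2, sd, level=0.95):
    """Assumed form: K = n1*n2/2 - z*sd, rounded up to an integer."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil(n1 * n2 / 2 - z * sd)


summary = rank_sum_summary(12, 7)
k = asymptotic_k(12, 7, sd=11.8270)
print(summary, k)
```

Under these assumptions the sketch returns 78, 162 and 120 for the rank sum and K = 19, as in the output table.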
The exact confidence interval is also displayed, which is based on the exact distribution of the Mann-Whitney statistic. To determine the lower bound of the exact interval (the K_{l}^{th} smallest difference), find k_{l} such that:
round k_{l} up to the nearest integer and calculate:
The upper limit is determined likewise, for:
.
For the paired case of this test see 6.4.2.2. Hodges-Lehmann Estimator (Paired).
Example 1
Example 10.4 on p. 283 from Armitage & Berry (2002). Gains in weight of rats receiving diets with high and low protein content are measured. The null hypothesis “there is no difference in median weights” is tested at the 95% level.
Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select High (C1) and Low (C2) as [Variable]s and check the Hodges-Lehmann Estimator (Unpaired) output option to obtain the following results:
Unpaired Samples
Hodges-Lehmann Estimator (Unpaired)
For High and Low

              Minimum    Maximum       Mean   Standard Deviation
Rank Sum      78.0000   162.0000   120.0000              11.8270

                K   Difference Between Medians   Lower 95%   Upper 95%
Asymptotic     19                      18.5000      3.0000     40.0000
Exact                                               3.0000     40.0000
6.4.1.3. Wald-Wolfowitz Runs Test
The null hypothesis “two independent samples have been drawn from the same population” is tested against the alternative hypothesis “they differ in respect of their medians, variability or skewness”. It is assumed that the variable under consideration has a continuous distribution.
All cases from the two samples are sorted together. If the two distributions are similar, then cases belonging to two samples must be scattered randomly. Then the program counts the number of runs (i.e. the number of groups of cases which belong to the same sample). If there are ties between cases belonging to two samples then the minimum and the maximum possible number of runs are reported separately. Two sets of results using the normal approximation are reported.
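Asymptotic without Continuity Correction: In this case the Z-statistic is defined as:
Z = |R – μ_R| / σ_R
where:
R is the number of runs
μ_R = 2n1n2 / n + 1
σ_R = sqrt{ 2n1n2(2n1n2 – n) / [n^{2}(n – 1)] }
n = n1 + n2
Asymptotic with Continuity Correction: The Z-statistic with continuity correction is defined as:
Z = (|R – μ_R| – 0.5) / σ_R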
In some applications, the test statistic with continuity correction is reported for and without continuity correction otherwise. The same normal approximation is also used for the Runs Test.
Exact: The exact one- and two-tailed probabilities are reported. Their use is recommended for n ≤ 30.
Data in one of the three types supported for Two Sample Tests can be used for this test. Missing values are omitted by case.
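The three sets of results can be illustrated with the following sketch. It assumes the standard mean and variance of the number of runs and the standard exact run-count distribution for data without ties; none of the function names belong to the program. The check values (8 runs, two samples of 10 cases) are taken from the example output of this section.

```python
from math import comb, sqrt


def count_runs(labels):
    """Number of runs in a sequence of group labels sorted by data value."""
    return sum(1 for i, g in enumerate(labels) if i == 0 or g != labels[i - 1])


def runs_z(r, n1, n2, continuity=False):
    """Absolute Z for r runs under the normal approximation."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    sd = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1)))
    diff = abs(r - mean) - (0.5 if continuity else 0.0)
    return diff / sd


def runs_exact_lower(r, n1, n2):
    """Exact P(number of runs <= r) under H0, assuming no ties."""
    ways = 0
    for k in range(2, r + 1):
        h = k // 2
        if k % 2 == 0:  # even number of runs: h runs from each sample
            ways += 2 * comb(n1 - 1, h - 1) * comb(n2 - 1, h - 1)
        else:           # odd number of runs: one sample has h + 1 runs
            ways += (comb(n1 - 1, h) * comb(n2 - 1, h - 1)
                     + comb(n1 - 1, h - 1) * comb(n2 - 1, h))
    return ways / comb(n1 + n2, n1)


print(count_runs("AABBBAB"))  # 4
print(round(runs_z(8, 10, 10), 4),
      round(runs_z(8, 10, 10, continuity=True), 4),
      round(runs_exact_lower(8, 10, 10), 4))
```

A small number of runs is evidence against the null hypothesis, so the one-tailed exact probability is the lower tail of the run-count distribution.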
Example
Table 100 on p. 251 from Cohen, L. & M. Holliday (1983). Aggression scores in 20 nursery school children following violent (Condition 1) and neutral (Condition 0) cartoons are given.
Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Score (C13) as [Variable] and Condition (C14) as [Factor]. From the next dialogue uncheck the Run a separate analysis for each option selected box and select only the Wald-Wolfowitz Runs Test option:
Unpaired Samples
Wald-Wolfowitz Runs Test
Data variable: Score
Subsample selected by: Condition

Condition   Cases      Mean   Standard Deviation   Standard Error
0              10   24.2000              19.5209           6.1731
1              10   46.2000              14.1327           4.4692
Total          20   35.2000              17.0411           3.8105

                     Number of Runs   Z-Statistic   1-Tail Probability   2-Tail Probability
Asymptotic                        8        1.3784               0.0840               0.1681
Asymptotic with CC                         1.1487               0.1253               0.2507
Exact                                                           0.1276
This result is not significant at the 10% level. Hence do not reject the null hypothesis “watching violent cartoons does not cause a significant change in the aggression of nursery school children”.
6.4.1.4. Moses Extreme Reaction Test
This test is used to determine the difference in range between two samples. Cases from the two samples are ranked together. Ranks corresponding to the smallest and largest group 1 cases are determined. The span is the difference between these two ranks plus one.
The program will prompt for the number of cases to be trimmed from either side of the span. The suggested number is either 1 or the integer closest to 5% of the number of cases in group 1, whichever is larger. However, this number can be changed by the user. The output includes the number of cases in two groups as well as the span and the one-tailed probability.
The exact one-tailed probability is computed for n ≤ 150. This limit can be changed by entering the following line with the appropriate number in the [Options] section of the Documents\Unistat10\Unistat10.ini file:
WMWMaxExactSize=150
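The span and the suggested trimming number can be sketched as follows. The data are hypothetical, ties are assumed absent, and the trimmed span is assumed to use the ranks from the full combined ranking. The function names are illustrative only.

```python
def default_trim(n1):
    """1 or the integer closest to 5% of n1, whichever is larger."""
    return max(1, int(0.05 * n1 + 0.5))


def moses_span(group1, group2, trim=0):
    """Span of group 1 in the combined ranking (assumes no ties)."""
    combined = sorted(group1 + group2)
    # Drop `trim` cases from each end of group 1 before finding the extremes.
    kept = sorted(group1)[trim:len(group1) - trim]
    lowest = combined.index(kept[0]) + 1    # rank of the smallest kept case
    highest = combined.index(kept[-1]) + 1  # rank of the largest kept case
    return highest - lowest + 1


# Hypothetical data, for illustration only.
group1 = [5, 6, 7, 8, 20]
group2 = [1, 2, 3, 9, 10, 11]
print(moses_span(group1, group2))          # 8
print(moses_span(group1, group2, trim=1))  # 3
print(default_trim(57))                    # 3
```

For a group of 57 cases the suggested number is 3, since 5% of 57 is 2.85.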
Example
Open DEMODATA and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Wages (C2) and Energy (C3) as [Variable]s and click on the [Opt] button next to the Moses Extreme Reaction Test option. Accept the default value of 3 from the next dialogue.
Unpaired Samples
Moses Extreme Reaction Test

          Cases
Wages        57
Energy       57
Total       114

                              span   1-Tail Probability
whole of group 1               108               0.0567
3 case(s) removed from ends     92               0.0302
6.4.1.5. Two Sample Median Test
This test is used to determine whether two samples are drawn from populations with similar medians. The median for the two combined samples is calculated, the two samples are dichotomised and a 2 x 2 table is formed. It is possible to edit the computed median and to enter any value. The output includes the generated 2 x 2 table, chi-square test statistics without and with a continuity correction and the exact probabilities.
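Asymptotic without Continuity Correction: The following chi-square statistic with one degree of freedom is compared with the chi-square distribution:
χ^{2} = n(g1n2 – g2n1)^{2} / [n1n2(g1 + g2)(n – g1 – g2)]
Asymptotic with Continuity Correction: In this case the numerator is slightly different:
χ^{2} = n(|g1n2 – g2n1| – n/2)^{2} / [n1n2(g1 + g2)(n – g1 – g2)]
where g1 and g2 are the number of cases greater than the median in samples 1 and 2 respectively, n1 and n2 are the sample sizes and n = n1 + n2.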
Exact: Two-tailed and table probabilities are reported using Fisher’s exact probability formula (see 6.4.5.2. Fisher’s Exact Test).
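The following sketch computes the three sets of results from the counts in the 2 x 2 table. It assumes the standard 2 x 2 chi-square statistic, a continuity correction of n/2 that is not allowed to go negative, and a two-tailed Fisher probability defined as the sum over all tables that are no more likely than the observed one. The function names are illustrative; the check values come from the example output of this section.

```python
from math import comb, erfc, sqrt


def median_test(g1, n1, g2, n2):
    """Chi-square statistics without and with continuity correction."""
    n, g = n1 + n2, g1 + g2
    denom = n1 * n2 * g * (n - g)
    cross = abs(g1 * n2 - g2 * n1)
    chi2 = n * cross**2 / denom
    # Assumption: the corrected difference is clamped at zero.
    chi2_cc = n * max(cross - n / 2, 0) ** 2 / denom
    return chi2, chi2_cc


def chi2_right_tail_1df(x):
    """Right-tail probability of chi-square with one degree of freedom."""
    return erfc(sqrt(x / 2))


def fisher_exact(g1, n1, g2, n2):
    """Two-tailed and table probabilities for the 2 x 2 table."""
    n, g = n1 + n2, g1 + g2

    def prob(a):
        return comb(n1, a) * comb(n2, g - a) / comb(n, g)

    table = prob(g1)
    lo, hi = max(0, g - n2), min(n1, g)
    two_tail = sum(prob(a) for a in range(lo, hi + 1)
                   if prob(a) <= table * (1 + 1e-9))
    return two_tail, table


# Assistant A: 6 of 11 above the median; Assistant B: 6 of 14.
chi2, chi2_cc = median_test(6, 11, 6, 14)
two_tail, table = fisher_exact(6, 11, 6, 14)
print(round(chi2, 4), round(chi2_right_tail_1df(chi2), 4))
print(round(chi2_cc, 4), round(chi2_right_tail_1df(chi2_cc), 4))
print(round(two_tail, 4), round(table, 4))
```

The small tolerance in the Fisher sum guards against floating-point noise when two tables have equal probability.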
Example
Example 8.18 on p. 156 from Zar, J. H. (1999). The null hypothesis “the medians of the two sampled populations are equal” is tested.
Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples and select Assistant A (C20) and Assistant B (C21) as [Variable]s. Note that these are the rank data in descending order. Select the Two Sample Median Test output option to obtain the following results:
Unpaired Samples
Two Sample Median Test

              > Median   <=Median   Total
Assistant A          6          5      11
Assistant B          6          8      14
Total               12         13      25

                      Median   Chi-Square Statistic   Degrees of Freedom   Right-Tail Probability
Asymptotic           12.5000                 0.3372                    1                   0.5615
Asymptotic with CC                           0.0315                    1                   0.8592

                 2-Tail Probability   Table Probability
Fisher’s Exact               0.6951              0.2668
Since P > 0.05, do not reject the null hypothesis. In the 5^{th} edition of Biostatistical Analysis (2010) Example 8.15 on p. 173, Zar employs a different method where observations at the median are omitted. With this approach the total number of valid cases is 23 and the chi-squared statistic with continuity correction is 0.473.