UNISTAT - the ultimate Excel statistics add-in

6.4.1. Unpaired Samples

Data in one of the three types supported for Two Sample Tests can be used for these tests. Missing values are omitted by case.


Note that the Moses Extreme Reaction Test and the Two Sample Median Test each have a further dialogue, which can be accessed by clicking on the [Opt] button situated to the left of the corresponding check box. If [Finish] is clicked before [Opt], the program will use its suggested default values without displaying these further dialogues.

6.4.1.1. Mann-Whitney U Test

This test is used to determine whether two independent random samples have been drawn from the same population. The null hypothesis tested is that “the population relative frequency distributions are identical” against the alternative hypothesis that “they are different” (two-tailed test).

The output includes the number of cases, rank sums, mean ranks, and U scores for the two samples as well as the test statistic, correction for ties and the asymptotic (normal and t-) and exact two-tailed probability values, with and without continuity correction.

The test statistic U for sample 1 is obtained by summing the number of times cases in sample 1 are smaller than cases in sample 2. U for sample 2 is found similarly. The smaller U value is chosen as the test statistic. A small or large U value indicates that the two samples are not similarly distributed. U values can also be calculated as:

      U1 = n1n2 + n1(n1+1)/2 - R1

      U2 = n1n2 + n2(n2+1)/2 - R2

where R1 and R2 are the sum of ranks for groups 1 and 2 respectively.
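The two ways of obtaining U described above can be checked against each other with a minimal Python sketch. The function names and the small data set are hypothetical, and counting a tied pair as one half is an assumption (it matches the rank-sum formula when tied cases receive average ranks).

```python
def u_by_counting(sample1, sample2):
    """U for sample1: number of pairs (a, b) with a < b.

    Assumption: a tied pair contributes 0.5.
    """
    u = 0.0
    for a in sample1:
        for b in sample2:
            if a < b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def u_from_rank_sum(n_own, n_other, rank_sum_own):
    """U = n1*n2 + n_own*(n_own + 1)/2 - R_own, as in the formulas above."""
    return n_own * n_other + n_own * (n_own + 1) / 2 - rank_sum_own


# Hypothetical data without ties.
x = [1, 4, 6]
y = [2, 3, 7]
# Joint ranks: 1->1, 2->2, 3->3, 4->4, 6->5, 7->6
r1 = 1 + 4 + 5
r2 = 2 + 3 + 6

u1 = u_from_rank_sum(len(x), len(y), r1)
u2 = u_from_rank_sum(len(y), len(x), r2)
print(u_by_counting(x, y), u1)   # both routes give U1
print(u_by_counting(y, x), u2)   # both routes give U2
print(min(u1, u2))               # the smaller U is the test statistic
```

The two U values always add up to n1n2, which is a quick consistency check on any reported output.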

The program will compute and display a Z statistic which is corrected for ties and with no continuity correction as:

      Z = (U - μ) / σ

where the mean of the Mann-Whitney distribution is given as:

      μ = n1n2 / 2

and its standard deviation as:

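      σ = sqrt{ [n1n2 / (n(n - 1))] × [(n³ - n)/12 - T] }

where:

      n = n1 + n2

T is the correction for ties (the sum of (t³ - t)/12 over each group of t tied cases) and wi1 is the rank of the ith case belonging to group 1, supposing that group 1 has the smaller U.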

The Z statistic with continuity correction is:

      Z = (|U - μ| - 0.5) / σ, taking the sign of U - μ

One- and two-tailed probabilities from normal and t-distributions (with n – 1 degrees of freedom) are displayed for Z-statistic without and with continuity correction.
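As a rough check on the normal approximation, the Python sketch below recomputes the Z statistics from the U values, sample sizes and correction for ties reported in the two examples of this section. It assumes the usual tie-corrected standard deviation, sqrt(n1n2/(n(n-1)) × ((n³-n)/12 - T)), and a continuity correction of 0.5 applied towards zero; the function names are illustrative only and are not part of UNISTAT.

```python
import math


def mann_whitney_z(u, n1, n2, tie_correction=0.0, continuity=False):
    """Normal approximation for U (assumed tie-corrected form)."""
    n = n1 + n2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 / (n * (n - 1)) * ((n**3 - n) / 12 - tie_correction))
    diff = u - mean
    if continuity:
        # Move the difference 0.5 closer to zero, keeping its sign.
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    return diff / sd


def normal_one_tail(z):
    """One-tailed probability from the standard normal distribution."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))


# Figures taken from the output of Example 1 (32 inpatients, 32 outpatients).
z1 = mann_whitney_z(330, 32, 32, tie_correction=410.5)
z1_cc = mann_whitney_z(330, 32, 32, tie_correction=410.5, continuity=True)

# Figures taken from the output of Example 2 (7 males, 5 females, no ties).
z2 = mann_whitney_z(3, 7, 5)
z2_cc = mann_whitney_z(3, 7, 5, continuity=True)

print(round(z1, 4), round(z1_cc, 4), round(normal_one_tail(z1), 4))
print(round(z2, 4), round(z2_cc, 4), round(normal_one_tail(z2), 4))
```

Under these assumptions the sketch reproduces the Asymptotic Normal rows of both example tables to four decimal places.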

Note that the following alternative definition of the standard deviation (given by Armitage & G. Berry (1994) p. 458 and Gardner & Altman (1989) Chapter 8) is not used here as it does not take ties into consideration:

      σ = sqrt{ n1n2(n1 + n2 + 1) / 12 }

An exact p‑value is also computed which is accurate for data sets with or without ties. By default, it is reported for n ≤ 150, though this limit can be changed by the user. To do this, the following line should be entered and edited in the [Options] section of the Documents\Unistat60\Unistat60.ini file:

WMWMaxExactSize=150

This limit can be extended if there are no ties in data. However, if ties exist, the exact p‑value for n > 150 may take a long time to compute.

It is also possible to save the complete exact one-tailed cumulative probability distribution of the test statistic in its rank sum form by including the following line in the [Options] section of Unistat60.ini:

WMWSaveDist=1

By default, the distribution will be saved to the following file:

Documents\Unistat60\WMWExactDist.txt

This file name can be changed by entering and editing the following line in the [Options] section of Unistat60.ini:

WMWSaveDistFile=..\Documents\Unistat60\WMWExactDist.txt
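Taken together, the three entries above would sit in the [Options] section as shown in the sketch below. The Python wrapper is only a convenient way to show the fragment and read the values back; any other entries the real Unistat60.ini may contain are omitted, and the variable names are hypothetical.

```python
import configparser

# Only the three entries discussed above; a real Unistat60.ini may hold more.
INI_TEXT = r"""
[Options]
WMWMaxExactSize=150
WMWSaveDist=1
WMWSaveDistFile=..\Documents\Unistat60\WMWExactDist.txt
"""

parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep the option names exactly as written
parser.read_string(INI_TEXT)

options = parser["Options"]
max_exact = options.getint("WMWMaxExactSize")
save_dist = options.getboolean("WMWSaveDist")
dist_file = options["WMWSaveDistFile"]

print(max_exact, save_dist, dist_file)
```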

Example 1

Example 13.3 on p. 455 from Armitage, P. & G. Berry (1994). An estimate of the median difference is required. A comparison of 32 inpatients and 32 outpatients is made.

Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Inpatients (C16) and Outpatients (C17) as [Variable]s and check only the Mann-Whitney U Test output option to obtain the following results:

Unpaired Samples

Mann-Whitney U Test

              Cases   Rank Sum    Mean Rank   U
Inpatients    32      858.0000    26.8125     694.0000
Outpatients   32      1222.0000   38.1875     330.0000
Total         64      2080.0000   32.5000

Correction for Ties = 410.5000

                            U          Test Statistic   1-Tail Probability   2-Tail Probability
Asymptotic Normal           330.0000   -2.4670          0.0068               0.0136
Asymptotic Normal with CC              -2.4603          0.0069               0.0139
Asymptotic t                           -2.4670          0.0082               0.0164
Asymptotic t with CC                   -2.4603          0.0083               0.0166
Exact                                                   0.0065               0.0131

 

It is concluded that the medians of the two samples are significantly different. Note that a t-test cannot detect a significant difference between the two sample means. This example shows the power of Mann-Whitney U Test when the assumption of normality fails.

Example 2

Example 8.11 on p. 164 from Zar, J. H. (2010). The null hypothesis “there is no difference between the heights of male and female students” is tested.

Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Males (C18) and Females (C19) as [Variable]s and check only the Mann-Whitney U Test output option to obtain the following results:

Unpaired Samples

Mann-Whitney U Test

          Cases   Rank Sum   Mean Rank   U
Males     7       60.0000    8.5714      3.0000
Females   5       18.0000    3.6000      32.0000
Total     12      78.0000    6.5000

Correction for Ties = 0.0000

                            U        Test Statistic   1-Tail Probability   2-Tail Probability
Asymptotic Normal           3.0000   -2.3548          0.0093               0.0185
Asymptotic Normal with CC            -2.2736          0.0115               0.0230
Asymptotic t                         -2.3548          0.0191               0.0382
Asymptotic t with CC                 -2.2736          0.0220               0.0440
Exact                                                 0.0088               0.0177

 

Zar reports the exact two-tailed probability as 0.018 and since this is less than 0.05, we reject the null hypothesis.

If the line WMWSaveDist=1 is included in the [Options] section of the Documents\Unistat60\Unistat60.ini file, the exact one-tailed cumulative distribution of the rank sum is saved to the WMWExactDist.txt file as follows:

 


Rank Sum   One Tail Probability
28         1.26262626262626E-03
29         2.52525252525253E-03
30         5.05050505050505E-03
31         8.83838383838384E-03
32         1.51515151515152E-02
33         0.023989898989899
34         3.66161616161616E-02
35         0.053030303030303
36         7.44949494949495E-02
37         0.101010101010101
38         0.133838383838384
39         0.171717171717172
40         0.215909090909091
41         0.265151515151515
42         0.319444444444444
43         0.377525252525252
44         0.438131313131313
45         0.5
46         0.561868686868687
47         0.622474747474747
48         0.680555555555555
49         0.734848484848485
50         0.784090909090909
51         0.828282828282828
52         0.866161616161616
53         0.898989898989899
54         0.92550505050505
55         0.946969696969697
56         0.963383838383838
57         0.976010101010101
58         0.984848484848485
59         0.991161616161616
60         0.994949494949495
61         0.997474747474747
62         0.998737373737374
63         1
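For a data set this small and without ties, the saved distribution can be reproduced by brute force. The Python sketch below enumerates every way of assigning 7 of the 12 ranks to the first group and accumulates the rank sums. It is only an illustration of what the table contains, not a description of the algorithm UNISTAT uses, and the function name is hypothetical.

```python
from itertools import combinations
from math import comb


def rank_sum_cdf(n1, n2):
    """Exact P(W <= w) for the rank sum W of a group of n1 cases (no ties)."""
    n = n1 + n2
    counts = {}
    for ranks in combinations(range(1, n + 1), n1):
        w = sum(ranks)
        counts[w] = counts.get(w, 0) + 1
    total = comb(n, n1)
    cdf = {}
    running = 0
    for w in sorted(counts):
        running += counts[w]
        cdf[w] = running / total
    return cdf


cdf = rank_sum_cdf(7, 5)          # 7 males, 5 females
upper_tail = 1 - cdf[59]          # P(W >= 60); the observed rank sum is 60

print(cdf[28], cdf[45], cdf[63])
print(round(upper_tail, 4), round(2 * upper_tail, 4))
```

The upper tail at the observed rank sum of 60 gives the exact one-tailed probability, and doubling it gives the two-tailed value reported in the output table.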


 

6.4.1.2. Hodges-Lehmann Estimator (Unpaired)

If the product of the two sample sizes does not exceed 2 × 10⁹ then an estimate of the difference between the two sample medians and its confidence interval are computed.

First, all n1 × n2 differences between each pair of numbers from the two samples are sorted in increasing order. Then, the median (the Hodges-Lehmann estimator or the shift parameter) is found.
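The estimator can be sketched in a few lines of Python. The data below are hypothetical and the function name is illustrative; the sketch simply forms every pairwise difference (sample 1 minus sample 2) and takes the median.

```python
from statistics import median


def hodges_lehmann(sample1, sample2):
    """Return the median of all pairwise differences and the sorted differences."""
    diffs = sorted(a - b for a in sample1 for b in sample2)
    return median(diffs), diffs


# Hypothetical data: 3 x 2 = 6 differences.
estimate, diffs = hodges_lehmann([1, 4, 6], [2, 3])
print(diffs)      # [-2, -1, 1, 2, 3, 4]
print(estimate)   # 1.5
```

Holding all n1 × n2 differences in memory is what makes a limit on the product of the sample sizes necessary.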

The minimum, maximum, mean and standard deviation of the rank sum are displayed. Note that the mean of the rank sum is different from the mean of the Mann-Whitney statistic, whereas their standard deviations are the same.

The limits of the asymptotic confidence interval are the Kth smallest and the Kth largest difference:

      K = μ - zσ

where K is rounded up to the nearest integer, z is the two-sided critical value of the standard normal distribution for the chosen confidence level (1.96 for 95%), and the mean μ and standard deviation σ of the Mann-Whitney statistic are as given in the previous section.
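The Python sketch below illustrates the asymptotic limits under the assumption that K is the mean of the Mann-Whitney statistic minus the normal critical value times its standard deviation, rounded up. The inputs are taken from the rat diet example in this section: the mean of the Mann-Whitney statistic is the mean rank sum minus the minimum rank sum (120 - 78 = 42), and the standard deviation is the reported 11.8270. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def asymptotic_k(mean_u, sd_u, confidence=0.95):
    """K = mean - z * sd, rounded up (assumed form of the formula)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(mean_u - z * sd_u)


def kth_limits(sorted_diffs, k):
    """Kth smallest and Kth largest of the sorted pairwise differences."""
    return sorted_diffs[k - 1], sorted_diffs[-k]


k = asymptotic_k(42.0, 11.8270)
print(k)   # 19, as in the Normal Approximation row of the example

# Hypothetical sorted differences, only to show the indexing.
print(kth_limits(list(range(1, 11)), 2))   # (2, 9)
```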

The exact confidence interval is also displayed, which is based on the exact distribution of the Mann-Whitney statistic. To determine the lower bound of the exact interval (the Kl-th smallest difference), find kl such that:

      Nonparametric Tests-Unpaired Samples

round kl up to the nearest integer and calculate:

      Nonparametric Tests-Unpaired Samples

The upper limit is determined likewise, for:

      Nonparametric Tests-Unpaired Samples.

For the paired case of this test see 6.4.2.2. Hodges-Lehmann Estimator (Paired).

Example 1

Example 13.4 on p. 458 from Armitage, P. & G. Berry (1994). Gains in weight of rats receiving diets with high and low protein content are measured. The null hypothesis “there is no difference in median weights” is tested at the 95% confidence level.

Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select High (C1) and Low (C2) as [Variable]s and check the Hodges-Lehmann Estimator (Unpaired) output option to obtain the following results:

Unpaired Samples

Hodges-Lehmann Estimator (Unpaired)

For High and Low

           Minimum   Maximum    Mean       Standard Deviation
Rank Sum   78.0000   162.0000   120.0000   11.8270

                       K    Difference Between Medians   Lower 95%   Upper 95%
Normal Approximation   19   18.5000                      -3.0000     40.0000
Exact                                                    -3.0000     40.0000

 

6.4.1.3. Wald-Wolfowitz Runs Test

The null hypothesis “two independent samples have been drawn from the same population” is tested against the alternative hypothesis “they differ in respect of their medians, variability or skewness”. It is assumed that the variable under consideration has a continuous distribution.

All cases from the two samples are sorted together. If the two distributions are similar, then cases belonging to two samples must be scattered randomly. Then the program counts the number of runs (i.e. the number of groups of cases which belong to the same sample). If there are ties between cases belonging to two samples then the minimum and the maximum possible number of runs are reported separately. Two sets of results using the normal approximation are reported.
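Counting runs can be sketched as follows in Python. The data are hypothetical and the sketch assumes there are no ties between cases of the two samples; with such ties the order of the tied cases is arbitrary, which is why a minimum and a maximum number of runs would have to be reported.

```python
def count_runs(sample1, sample2):
    """Number of runs in the jointly sorted samples (no cross-sample ties)."""
    labelled = sorted([(v, 1) for v in sample1] + [(v, 2) for v in sample2])
    labels = [group for _, group in labelled]
    # A new run starts wherever the group label changes.
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


# Hypothetical data: sorted labels are 1 1 2 2 2 1 1 2.
runs = count_runs([1, 2, 7, 8], [3, 4, 5, 9])
print(runs)   # 4
```

Few runs mean the two samples occupy separate stretches of the combined ordering, which is evidence against the null hypothesis.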

Asymptotic without Continuity Correction: In this case the Z-statistic is defined as:

      Z = (R - μ) / σ

      where R is the number of runs, n = n1 + n2 and:

      μ = 2n1n2 / n + 1

      σ = sqrt{ 2n1n2(2n1n2 - n) / (n²(n - 1)) }

Asymptotic with Continuity Correction: The Z-statistic with continuity correction is defined as:

      Z = (|R - μ| - 0.5) / σ, taking the sign of R - μ

      In some applications, the test statistic with continuity correction is reported for Nonparametric Tests-Unpaired Samples and without continuity correction otherwise. The same normal approximation is also used for the Runs Test.

Exact: The exact one- and two-tailed probabilities are reported. Their use is recommended for n ≤ 30.
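The three sets of results can be cross-checked with the Python sketch below, using the figures from the example in this section (8 runs, two groups of 10 cases). It assumes the standard mean and standard deviation of the number of runs, a continuity correction of 0.5 applied towards zero, and the standard combinatorial formula for the exact distribution of the number of runs; the function names are illustrative only.

```python
import math


def runs_z(r, n1, n2, continuity=False):
    """Normal approximation for the number of runs r (assumed standard form)."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    sd = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1)))
    diff = r - mean
    if continuity:
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    return diff / sd


def runs_pmf(r, n1, n2):
    """Exact probability of exactly r runs (r >= 2) under the null hypothesis."""
    total = math.comb(n1 + n2, n1)
    k, odd = divmod(r, 2)
    if not odd:
        ways = 2 * math.comb(n1 - 1, k - 1) * math.comb(n2 - 1, k - 1)
    else:
        ways = (math.comb(n1 - 1, k) * math.comb(n2 - 1, k - 1)
                + math.comb(n1 - 1, k - 1) * math.comb(n2 - 1, k))
    return ways / total


def runs_lower_tail(r, n1, n2):
    """Exact one-tailed probability P(R <= r)."""
    return sum(runs_pmf(j, n1, n2) for j in range(2, r + 1))


z = runs_z(8, 10, 10)
z_cc = runs_z(8, 10, 10, continuity=True)
p_exact = runs_lower_tail(8, 10, 10)
print(round(z, 4), round(z_cc, 4), round(p_exact, 4))
```

With these assumptions the sketch agrees with the Z statistics and the exact one-tailed probability shown in the example output.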

Data in one of the three types supported for Two Sample Tests can be used for this test. Missing values are omitted by case.

Example

Table 100 on p. 251 from Cohen, L. & M. Holliday (1983). Aggression scores in 20 nursery school children following violent (Condition 1) and neutral (Condition 0) cartoons are given.

Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Score (C13) as [Variable] and Condition (C14) as [Factor]. From the next dialogue uncheck the Run a separate analysis for each option selected box and select only the Wald-Wolfowitz Runs Test option:

Unpaired Samples

Wald-Wolfowitz Runs Test

Data variable: Score

Subsample selected by: Condition

Condition   Cases   Mean      Standard Deviation   Standard Error
0           10      24.2000   19.5209              6.1731
1           10      46.2000   14.1327              4.4692
Total       20      35.2000   17.0411              3.8105

                     Number of Runs   Z-Statistic   1-Tail Probability   2-Tail Probability
Asymptotic           8                -1.3784       0.0840               0.1681
Asymptotic with CC                    -1.1487       0.1253               0.2507
Exact                                               0.1276

 

 

This result is not significant at the 10% level. Hence do not reject the null hypothesis “watching violent cartoons does not cause a significant change in the aggression of nursery school children”.

6.4.1.4. Moses Extreme Reaction Test

This test is used to determine the difference in range between two samples. Cases from the two samples are ranked together. Ranks corresponding to the smallest and largest group 1 cases are determined. The span is the difference between these two ranks plus one.


The program will prompt for the number of cases to be trimmed from either side of the span. The suggested number is either 1 or the integer closest to 5% of the number of cases in group 1, whichever is larger. The output includes the number of cases in the two groups as well as the span and the one-tailed probability.
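The span and the suggested trimming number can be sketched in Python as below. The data are hypothetical, ties are assumed absent, and rounding an exact half upwards is an assumption, since the text only says "the integer closest to". Function names are illustrative.

```python
import math


def default_trim(n1):
    """Suggested number of cases to trim: max(1, 5% of n1 rounded)."""
    return max(1, math.floor(0.05 * n1 + 0.5))


def moses_span(group1, group2, trim=0):
    """Span of group 1 in the joint ranking after trimming each end (no ties)."""
    combined = sorted(group1 + group2)
    g1 = sorted(group1)
    low = g1[trim]
    high = g1[len(g1) - 1 - trim]
    rank_low = combined.index(low) + 1
    rank_high = combined.index(high) + 1
    return rank_high - rank_low + 1


# Hypothetical data; joint order is 1 5 6 7 8 9 12 20 25.
g1 = [5, 7, 9, 12]
g2 = [1, 6, 8, 20, 25]
print(moses_span(g1, g2))           # ranks 2..7 give a span of 6
print(moses_span(g1, g2, trim=1))   # ranks 4..6 give a span of 3
print(default_trim(57))             # 3, the default offered for 57 cases
```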

The exact one-tailed probability is computed for n ≤ 150. This limit can be changed by entering the following line with the appropriate number in the [Options] section of the Documents\Unistat60\Unistat60.ini file:

WMWMaxExactSize=150

Example

Open DEMODATA and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples. Select Wages (C2) and Energy (C3) as [Variable]s and click on the [Opt] button next to the Moses Extreme Reaction Test option. Accept the default value of 3 from the next dialogue.

Unpaired Samples

Moses Extreme Reaction Test

         Cases
Wages    57
Energy   57
Total    114

                              span   1-Tail Probability
whole of group 1              108    0.0567
3 case(s) removed from ends   92     0.0302

 

6.4.1.5. Two Sample Median Test

This test is used to determine whether two samples are drawn from populations with similar medians. The median for the two combined samples is calculated, the two samples are dichotomised and a 2 x 2 table is formed. It is possible to edit the computed median and to enter any value. The output includes the asymptotic probabilities without and with a continuity correction and the exact probabilities.


Asymptotic without Continuity Correction: The following chi-square statistic with one degree of freedom is compared with the chi-square distribution:

     χ² = n(g1n2 - g2n1)² / [n1n2(g1 + g2)(n - g1 - g2)]

Asymptotic with Continuity Correction: In this case the numerator is slightly different:

     χ² = n(|g1n2 - g2n1| - n/2)² / [n1n2(g1 + g2)(n - g1 - g2)]

where g1 and g2 are the number of cases greater than the median in samples 1 and 2 respectively, n1 and n2 are the sample sizes and n = n1 + n2.
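The Python sketch below evaluates the chi-square statistic for a 2 x 2 median table, assuming the usual Pearson form written in terms of g1, g2 and the sample sizes, with n/2 subtracted from the absolute cross-product difference for the continuity correction. The counts are those of the example in this section (6 of 11 and 6 of 14 cases above the median); the function names are hypothetical.

```python
import math


def median_test_chi2(g1, g2, n1, n2, continuity=False):
    """Chi-square for the 2 x 2 median table (assumed Pearson form)."""
    n = n1 + n2
    above = g1 + g2
    diff = abs(g1 * n2 - g2 * n1)
    if continuity:
        diff = max(diff - n / 2, 0.0)
    return n * diff**2 / (n1 * n2 * above * (n - above))


def chi2_right_tail_1df(x):
    """Right-tail probability of chi-square with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


chi2 = median_test_chi2(6, 6, 11, 14)
chi2_cc = median_test_chi2(6, 6, 11, 14, continuity=True)
print(round(chi2, 4), round(chi2_right_tail_1df(chi2), 4))
print(round(chi2_cc, 4), round(chi2_right_tail_1df(chi2_cc), 4))
```

Under these assumptions the sketch matches the Asymptotic and Asymptotic with CC rows of the example output.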

Exact: Two-tailed and table probabilities are reported using Fisher’s exact probability formula (see 6.4.5.2. Fisher’s Exact Test).
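The exact probabilities for the median table can be sketched with the hypergeometric distribution. The Python code below assumes that the two-tailed probability is the sum over all tables with the same margins that are no more probable than the observed one; whether UNISTAT uses exactly this convention is not stated here, although it agrees with the example output. The function name and argument layout are hypothetical.

```python
from math import comb


def fisher_2x2(a, row1, row2, col1):
    """Table and two-tailed probabilities for a 2 x 2 table with fixed margins.

    a is the count in row 1, column 1; row1 and row2 are the row totals;
    col1 is the total of column 1.
    """
    total = comb(row1 + row2, col1)

    def ways(k):
        return comb(row1, k) * comb(row2, col1 - k)

    low = max(0, col1 - row2)
    high = min(row1, col1)
    observed = ways(a)
    # Integer counts are compared, so there is no floating point ambiguity.
    two_tail = sum(ways(k) for k in range(low, high + 1) if ways(k) <= observed)
    return observed / total, two_tail / total


# Example table: Assistant A has 6 of 11 above the median, 12 above overall.
table_p, two_tail_p = fisher_2x2(6, 11, 14, 12)
print(round(table_p, 4), round(two_tail_p, 4))
```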

Example

Example 8.18 on p. 156 from Zar, J. H. (1999). The null hypothesis “the medians of the two sampled populations are equal” is tested.

Open NONPAR12 and select Statistics 1 → Nonparametric Tests (1-2 Samples) → Unpaired Samples and select Assistant A (C20) and Assistant B (C21) as [Variable]s. Note that these are the rank data in descending order. Select the Two Sample Median Test output option to obtain the following results:

Unpaired Samples

Two Sample Median Test

              > Median   <= Median   Total
Assistant A   6          5           11
Assistant B   6          8           14
Total         12         13          25

                     Median    Chi-Square Statistic   Degrees of Freedom   Right-Tail Probability
Asymptotic           12.5000   0.3372                 1                    0.5615
Asymptotic with CC             0.0315                 1                    0.8592

                 2-Tail Probability   Table Probability
Fisher's Exact   0.6951               0.2668

 

Since P > 0.05, do not reject the null hypothesis. Note that in the 5th edition of Biostatistical Analysis (2010) Example 8.15 on p. 173, Zar employs a different method where observations at the median are omitted. With this approach the total number of valid cases is 23 and the chi-squared statistic with continuity correction is 0.473.