6.2.1. Correlation Coefficients
Four correlation coefficients (Pearson product moment, Spearman rank, Kendall rank and point biserial) can be accessed under this menu item, and the results are presented on a single page of output.
Two or more columns can be selected by clicking on [Variable]. Correlations will be computed between all possible pairs, as long as the two columns have the same size. For each test, any pair of cases with one or more missing values is omitted and the degrees of freedom are adjusted. The Output Options Dialogue allows you to choose which tests appear in the output.
If a factor column is selected, then it is assumed that the data is not paired and only the point biserial correlation is computed.
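The pairwise treatment of missing values described above can be sketched in Python as follows. The column names, the data and the `valid_pairs` helper are hypothetical, and `None` stands in for a missing value; this illustrates the idea only and is not the program's own code.

```python
import math
from itertools import combinations

def valid_pairs(x, y):
    """Keep only the cases where both values are present (not None, not NaN)."""
    def present(v):
        return v is not None and not (isinstance(v, float) and math.isnan(v))
    return [(a, b) for a, b in zip(x, y) if present(a) and present(b)]

# Hypothetical columns of equal size; None marks a missing value.
columns = {
    "A": [1.0, 2.0, None, 4.0, 5.0],
    "B": [2.1, None, 3.3, 4.8, 5.9],
    "C": [0.5, 0.7, 0.9, 1.1, None],
}

# Every possible pair of columns is examined, and each pair keeps its own
# set of valid cases, so the degrees of freedom can differ between pairs.
for name1, name2 in combinations(columns, 2):
    pairs = valid_pairs(columns[name1], columns[name2])
    print(name1, name2, "valid pairs:", len(pairs))
```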
6.2.1.1. Pearson Product Moment Correlation
The aim of this correlation coefficient is to establish the degree of linear relationship between two variables. The coefficient is defined as the covariance of the two samples divided by the product of their standard deviations.
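As a minimal sketch of this definition, the Python function below divides the sample covariance by the product of the two sample standard deviations. The function name and the data are illustrative assumptions, not values from the CORRCOEF file.

```python
import math

def pearson_r(x, y):
    """Covariance of the two samples divided by the product of their SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))
```

The divisor n – 1 cancels between numerator and denominator, so using population rather than sample moments throughout would give the same coefficient.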
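The probability value is based on Student’s t-distribution with n – 2 degrees of freedom, where n is the number of valid pairs and the t-statistic is calculated from the correlation coefficient r as:

$$t = r \sqrt{\frac{n - 2}{1 - r^{2}}}$$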
This correlation coefficient is a relatively poor measure of association since it does not take into consideration the individual distributions of the two variables. The effect of outliers may be considerable. This makes it difficult to conclude that one linear correlation is significantly better than another. The nonparametric correlation coefficients, Spearman’s rho and Kendall’s tau, are more robust measures.
Pairs with one or more missing values are omitted and the degrees of freedom are adjusted. The output includes the correlation coefficient, its confidence interval, t-statistic, degrees of freedom and one- and two-tailed probabilities.
Example
Table 8.5 on p. 89, Gardner & Altman (2000). The null hypothesis “basal metabolic rate and total energy expenditure are not correlated” is tested at the 95% confidence level.
Open CORRCOEF, select Statistics 1 → Correlation Coefficients, select Basal and Energy (C1 and C2) as [Variable]s, select all output options (including the Report summary statistics box) and click [Next] to obtain the following results:
Correlation Coefficients
For Basal and Energy

                            Valid Cases      Missing         Mean  Standard Deviation
Basal                                13            0       5.6515              0.4650
Energy                               13            0       8.0662              1.2381
Paired                               13            0

                            Correlation   Degrees of       * Test       1-Tail       2-Tail
                            Coefficient      Freedom    Statistic  Probability  Probability
Pearson                          0.7283           11       3.5249       0.0024       0.0048
Spearman Rank                    0.6190           11       2.6139       0.0120       0.0241
Kendall Rank                     0.4258                    2.0171       0.0218       0.0437
Kendall Rank with CC             0.4387                    2.0782       0.0188       0.0377
Point Biserial (sample SD)      -0.7866           24      -6.2419       0.0000       0.0000
Point Biserial (pop SD)         -0.8022           24      -6.5828       0.0000       0.0000

                              Lower 95%    Upper 95%
Pearson                          0.2961       0.9129
Spearman Rank                    0.1032       0.8724
Kendall Rank                    -0.1635       0.7912
Kendall Rank with CC            -0.1481       0.7970
Point Biserial (sample SD)      -0.8998      -0.5743
Point Biserial (pop SD)         -0.9076      -0.6019

* Z-statistic for Kendall rank, t-statistic otherwise
This result shows that there is a significant correlation between the two variables.
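The Pearson row of this output can be checked by hand. The Python sketch below assumes the t-statistic r·sqrt((n – 2)/(1 – r²)) with n – 2 degrees of freedom, and obtains the tail probability by numerically integrating the Student's t density with Simpson's rule. The function names are illustrative, and the integration is only a stand-in for whatever routine the program itself uses.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), referred to n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tail_probability(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the Simpson's-rule integral of the density on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values taken from the Pearson row: r = 0.7283 with 13 valid pairs.
r, n = 0.7283, 13
t = t_statistic(r, n)
p1 = one_tail_probability(t, n - 2)
print(f"t = {t:.4f}, 1-tail = {p1:.4f}, 2-tail = {2 * p1:.4f}")
```

Because r is entered to four decimals only, the results agree with the printed statistic and probabilities to about that precision.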
6.2.1.2. Spearman’s Rank Correlation
Correlation between relative rankings of the two variables is measured rather than their nominal values. In this way each variable is transformed into a uniformly distributed variable and the effect of outliers is minimised. Spearman’s correlation coefficient (also called rho) is calculated as follows:
where R is the sum of squared differences between the ranks of corresponding cases of the two variables and:
where K_{x} and K_{y} are the sums of k^{3} – k, where k is the number of ties at a given rank within each variable. The tail probability of rho is determined by comparing the following t-statistic with Student’s t-distribution with n – 2 degrees of freedom:

$$t = \rho \sqrt{\frac{n - 2}{1 - \rho^{2}}}$$
Pairs with at least one missing value are omitted and the degrees of freedom are adjusted. The output includes the correlation coefficient, its confidence interval, t-statistic, degrees of freedom and one- and two-tailed probabilities.
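A common way to obtain a tie-corrected rho, sketched below in Python, is to replace each variable by its ranks (tied values share the average of the ranks they occupy) and then compute the ordinary product moment correlation of the ranks. This is offered as an equivalent textbook route, on the assumption that it agrees with the tie-corrected formula used here; the function names and data are hypothetical.

```python
import math

def midranks(values):
    """Rank from 1 upward; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Product moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the midranks."""
    return pearson_r(midranks(x), midranks(y))

# Hypothetical data with one pair of tied values in x.
x = [10, 20, 20, 30, 40]
y = [1, 3, 2, 5, 4]
print(midranks(x))
print(round(spearman_rho(x, y), 4))
```

Without ties this reduces to the familiar 1 – 6R/(n³ – n), where R is the sum of squared rank differences.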
Example
Example 19.13 on p. 401 from Zar, J. H. (2010). The null hypothesis “there is no correlation between the ranks of values in the two variables” is tested.
Open CORRCOEF, select Statistics 1 → Correlation Coefficients. Select X and Y (C3 and C4) as [Variable]s and select only the Spearman Rank output option to obtain the following results:
Correlation Coefficients
For X and Y

                            Correlation   Degrees of       * Test       1-Tail       2-Tail
                            Coefficient      Freedom    Statistic  Probability  Probability
Spearman Rank                    0.8511           10       5.1261       0.0002       0.0004

                              Lower 95%    Upper 95%
Spearman Rank                    0.5418       0.9574

* Z-statistic for Kendall rank, t-statistic otherwise
This result shows that there is a significant rank correlation and the null hypothesis should be rejected. Note that the denominator evaluates to 240, not 242 as in the book.
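The confidence limits printed above are consistent with an interval built on Fisher's z transformation with standard error 1/sqrt(n – 3). That is an observation about the reported numbers rather than a documented statement of the program's method, so the Python sketch below should be read under that assumption. The sample size n = 12 is inferred from the 10 degrees of freedom, and the function name is illustrative.

```python
import math

def fisher_interval(r, n, z_crit=1.959964):
    """Approximate 95% limits: transform r with atanh, step +/- z_crit/sqrt(n - 3),
    then transform back with tanh."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Spearman example: rho = 0.8511, n = 12 (inferred from 10 degrees of freedom).
lower, upper = fisher_interval(0.8511, 12)
print(f"{lower:.4f} {upper:.4f}")
```

Since the coefficient is entered to four decimals only, the limits agree with the printed interval to about three decimals.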
6.2.1.3. Kendall’s Rank Correlation
Like Spearman’s rho this is also a rank correlation coefficient (also called tau) and as such it has the same advantage over Pearson Product Moment Correlation. Additionally, it provides a more robust nonparametric measure by comparing the relative ordering of ranks rather than their numeric difference as in the case of Spearman’s rho. Kendall’s tau is calculated as:
where R is the number of times a case is greater than other cases in both variables summed over all cases, and K_{x} and K_{y} are the sums of k^{2} – k, where k is the number of ties at a given rank within each variable. For tau with continuity correction R is augmented by one.
The tail probability of tau is determined from the normal distribution with a standard deviation:
where:
· P_{x} = sum of (k^{2} – k)(k – 2) for X
· P_{y} = sum of (k^{2} – k)(k – 2) for Y
· Q_{x} = sum of (k^{2} – k)(2k + 5) for X
· Q_{y} = sum of (k^{2} – k)(2k + 5) for Y
· J = n^{2} – n.
Pairs with at least one missing value are omitted. The output includes the correlation coefficient, its confidence interval, Z-statistic and one- and two-tailed probabilities.
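The tie sums defined above fit the standard tie-corrected form of Kendall's tau (often called tau-b) and the usual variance of its numerator. The Python sketch below assumes those textbook forms; the program's own equations may be arranged differently. Here `s` is the number of concordant pairs minus the number of discordant pairs, which need not match the definition of R used in the text, and the continuity correction is not included. The function names and data are hypothetical.

```python
import math
from collections import Counter

def tie_sums(values):
    """Return (K, P, Q): sums of k^2 - k, (k^2 - k)(k - 2) and (k^2 - k)(2k + 5)
    over groups of k equal values."""
    K = P = Q = 0
    for k in Counter(values).values():
        base = k * k - k
        K += base
        P += base * (k - 2)
        Q += base * (2 * k + 5)
    return K, P, Q

def kendall_tau_b(x, y):
    """Tie-corrected tau and its normal Z-statistic (requires n > 2)."""
    n = len(x)
    s = 0  # concordant minus discordant pairs; tied pairs contribute 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            s += (product > 0) - (product < 0)
    Kx, Px, Qx = tie_sums(x)
    Ky, Py, Qy = tie_sums(y)
    J = n * n - n
    tau = 2 * s / math.sqrt((J - Kx) * (J - Ky))
    variance = ((J * (2 * n + 5) - Qx - Qy) / 18
                + Px * Py / (9 * J * (n - 2))
                + Kx * Ky / (2 * J))
    return tau, s / math.sqrt(variance)

# Hypothetical rankings without ties.
tau, z = kendall_tau_b([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
print(round(tau, 4), round(z, 4))
```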
Example
Table 56 on p. 160 from Cohen, L. & M. Holliday (1983). Ten trainees on a management course have been rated on a personality measure Introversion and on an Attitude to Change scale. The null hypothesis “there is no correlation between these two rankings” is tested.
Open CORRCOEF and select Statistics 1 → Correlation Coefficients. Select Introversion and Attitude (C5 and C6) as variables and select only the Kendall Rank output option to obtain the following results:
Correlation Coefficients
For Introversion and Attitude

                            Correlation   Degrees of       * Test       1-Tail       2-Tail
                            Coefficient      Freedom    Statistic  Probability  Probability
Kendall Rank                     0.6286                    2.4545       0.0071       0.0141
Kendall Rank with CC             0.6519                    2.5455       0.0055       0.0109

                              Lower 95%    Upper 95%
Kendall Rank                    -0.0017       0.9014
Kendall Rank with CC             0.0377       0.9086

* Z-statistic for Kendall rank, t-statistic otherwise
This result shows that there is a significant rank correlation at the 1% level (one-tailed) between the Introversion / Extraversion rating and the Attitude to Change rating.
6.2.1.4. Point Biserial Correlation
This is an alternative to the linear (Pearson’s) correlation coefficient when the first variable is continuous and the second variable is binary. Let n_{p} and n_{q} be the respective numbers of Ps and Qs and n the total number of valid cases. There are two alternative ways of calculating the coefficient:
Using sample standard deviation:
where SD_{samp} is the sample standard deviation of the two samples combined:
Using population standard deviation:
where SD_{pop} is the population standard deviation of the two samples combined:
Before this release of UNISTAT, the version with sample standard deviation was used.
In both cases, the following t-value is compared with the t-distribution with n – 2 degrees of freedom, where r is the point biserial correlation coefficient:

$$t = r \sqrt{\frac{n - 2}{1 - r^{2}}}$$

The data for this test can be in one of the three types supported for Two Sample Tests. If the last data option, Test Statistics are Given, is selected, the program will prompt for sizes, means and standard deviations of the two samples. Missing values are omitted by case and the degrees of freedom are adjusted accordingly.
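When only the sizes, means and standard deviations of the two samples are available, the coefficient can still be computed, as the Python sketch below illustrates. It assumes the form r = (M_p – M_q)/SD · sqrt(n_p n_q)/n, where M_p and M_q denote the two group means (symbols introduced here for illustration) and SD is the sample or population standard deviation of the combined data. It also assumes that the two group standard deviations are sample values with divisor n – 1. The function names are hypothetical.

```python
import math

def point_biserial_from_summary(n_p, mean_p, sd_p, n_q, mean_q, sd_q):
    """Return (r with sample SD, r with population SD) from group summaries."""
    n = n_p + n_q
    grand_mean = (n_p * mean_p + n_q * mean_q) / n
    # Total sum of squares = within-group part + between-group part.
    ss_total = ((n_p - 1) * sd_p ** 2 + (n_q - 1) * sd_q ** 2
                + n_p * (mean_p - grand_mean) ** 2
                + n_q * (mean_q - grand_mean) ** 2)
    sd_samp = math.sqrt(ss_total / (n - 1))  # combined sample SD
    sd_pop = math.sqrt(ss_total / n)         # combined population SD
    scale = math.sqrt(n_p * n_q) / n
    diff = mean_p - mean_q
    return diff / sd_samp * scale, diff / sd_pop * scale

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Summary statistics as printed in Example 1 of this section.
r_samp, r_pop = point_biserial_from_summary(6, 82.3333, 5.1251, 4, 65.0, 4.0825)
print(f"sample SD: r = {r_samp:.4f}, t = {t_statistic(r_samp, 10):.4f}")
print(f"population SD: r = {r_pop:.4f}")
```

With these inputs the sample-SD version comes out close to the coefficient and t-statistic reported in Example 1. The two versions always differ by the factor sqrt(n/(n – 1)).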
Example 1: Point biserial correlation using sample standard deviation
Table 57 on p. 164 from Cohen, L. & M. Holliday (1983). Examination scores of on- and off-campus social work students are given in one column of the table and their residence pattern in a second column.
Open CORRCOEF and select Statistics 1 → Correlation Coefficients. Select Score (C7) as [Variable] and Off Campus (C8) as [Factor], and select the Point Biserial and Report Summary Statistics output options to obtain the following results:
Correlation Coefficients
Data variable: Score
Subsample selected by: Off Campus = 0,1

                            Valid Cases      Missing         Mean  Standard Deviation
0                                     6            0      82.3333              5.1251
1                                     4            0      65.0000              4.0825

                            Correlation   Degrees of       * Test       1-Tail       2-Tail
                            Coefficient      Freedom    Statistic  Probability  Probability
Point Biserial                   0.8480            8       4.5260       0.0010       0.0019

                              Lower 95%    Upper 95%
Point Biserial                   0.4686       0.9633

* Z-statistic for Kendall rank, t-statistic otherwise
This result shows that there is a significant correlation at the 0.1% level between examination scores and residence.
Example 2: Point biserial correlation using population standard deviation
Example 19.16 on p. 410 from Zar, J. H. (2010). The null hypothesis that there is no correlation between blood-clotting time and drug is tested.
Open CORRCOEF and select Statistics 1 → Correlation Coefficients. Select X1 (C12) as [Factor] and Y1 (C13) as [Variable], and select only the Point Biserial output option to obtain the following results:
Correlation Coefficients
Data variable: Y1
Subsample selected by: X1 = 0,1

                            Correlation   Degrees of       * Test       1-Tail       2-Tail
                            Coefficient      Freedom    Statistic  Probability  Probability
Point Biserial                  -0.5983           11      -2.4765       0.0154       0.0308

                              Lower 95%    Upper 95%
Point Biserial                  -0.8643      -0.0706

* Z-statistic for Kendall rank, t-statistic otherwise
Since P < 0.05, we reject the null hypothesis.