6.5.9. Kappa Test for InterCategory Variation
The Kappa statistic, introduced by Cohen for two raters and later generalised to many, measures agreement among raters classifying a sample of items into one of k mutually exclusive and exhaustive unordered categories. Three versions of the Kappa test are supported. The first two can be accessed under the menu option for multisample nonparametric tests, and the third appears among the statistics for square crosstabulations (see 6.6.2.2. R x C Table Statistics). You can also use the Intraclass Correlation Coefficients procedure to compare raters under up to six different sets of assumptions.
The first version is a generalised implementation of the Kappa test for measuring agreement among many raters (or the intercategory agreement), introduced by Fleiss, J. L. (1971). The algorithm incorporates the corrected large-sample variances given by Fleiss, J. L., Nee, J. C. M. and Landis, J. R. (1979).
The data for this test is assumed to be in the form of a table where the rows represent subjects and the columns represent categories of classification. A fixed number of observers, say n, are assumed to rate each case. Each rating adds 1 to the count in the chosen category for that subject, so each cell holds the number of observers who assigned the subject to that category. Therefore, each subject (row) should have n ratings and every row total should be equal to n. If this condition is not met, the program will display a message and abort the procedure. Any rows containing one or more missing values are omitted.
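The data layout and row-total check described above can be illustrated with a minimal Python sketch of the Fleiss (1971) point estimates. This is not the program's own code: the function name `fleiss_kappa`, the use of `None` for missing values and the returned tuple are all assumptions made for illustration, and the large-sample variances of Fleiss, Nee and Landis (1979) are not computed here.

```python
import math


def fleiss_kappa(table):
    """Illustrative sketch: Fleiss' kappa for a subjects x categories count table.

    Returns (per_category_kappas, overall_kappa). The name and return layout
    are hypothetical, not the program's interface.
    """
    # Omit rows containing missing values (represented here as None).
    rows = [r for r in table if all(v is not None for v in r)]
    if not rows:
        raise ValueError("no complete rows")

    n = sum(rows[0])  # number of ratings per subject
    if n < 2 or any(sum(r) != n for r in rows):
        raise ValueError("every row must sum to the same number of raters (>= 2)")

    N = len(rows)      # number of subjects
    k = len(rows[0])   # number of categories
    totals = [sum(r[j] for r in rows) for j in range(k)]
    p = [t / (N * n) for t in totals]  # share of all ratings in each category

    # Overall observed agreement: mean proportion of agreeing rater pairs.
    p_obs = sum((sum(v * v for v in r) - n) / (n * (n - 1)) for r in rows) / N
    p_exp = sum(pj * pj for pj in p)   # agreement expected by chance
    if max(totals) == N * n:
        overall = math.nan             # all ratings in one category
    else:
        overall = (p_obs - p_exp) / (1 - p_exp)

    per_category = []
    for j in range(k):
        if totals[j] in (0, N * n):
            per_category.append(math.nan)  # kappa undefined for this category
            continue
        # Chance that a second rater picks category j given that a first did.
        pobs_j = sum(r[j] * (r[j] - 1) for r in rows) / (totals[j] * (n - 1))
        per_category.append((pobs_j - p[j]) / (1 - p[j]))
    return per_category, overall
```

For example, `fleiss_kappa([[2, 1], [1, 2], [None, 3]])` drops the third row because it has a missing value and computes Kappa from the remaining two subjects, each rated by three observers. A table whose row totals differ raises `ValueError`, mirroring the abort described above.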
If the data has not been formed into a frequency table, you can use the CrossTabulation procedure to generate the frequency counts and the Kappa test simultaneously (see 6.6.2.2. R x C Table Statistics).
Example
The data is taken from Fleiss, J. L. (1971).
Open NONPARM2, select Statistics 1 → Nonparametric Tests (Multisample) → Kappa Test InterCategory Variation and select Category1 to Category5 (C26 to C30) as [Variable]s, to obtain the following results:
Kappa Test (InterCategory Variation)

             Total    Pexp    Pobs   Kappa    ZStat    Prob
Category1       26  0.1444  0.3538  0.2448   5.3335  0.0000
Category2       26  0.1444  0.3538  0.2448   5.3335  0.0000
Category3       30  0.1667  0.6000  0.5200  11.1723  0.0000
Category4       55  0.3056  0.6327  0.4711  10.1355  0.0000
Category5       43  0.2389  0.6698  0.5661  12.1506  0.0000
Total          180  0.2199  0.5556  0.4302  17.9253  0.0000

Standard Deviation = 0.0244
95% Confidence Interval = 0.3825 <> 0.4780
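
As a rough check on the Total row, the following Python sketch recomputes Kappa from the reported Pexp and Pobs and rebuilds the confidence interval from the reported standard deviation. It assumes Kappa = (Pobs - Pexp) / (1 - Pexp) and a two-sided normal-approximation interval, Kappa plus or minus z times the standard deviation. The interval formula is an assumption, since the output itself does not state how the interval is formed.

```python
from statistics import NormalDist

# Values copied from the Total row and summary lines of the output above.
p_exp, p_obs = 0.2199, 0.5556
kappa_reported, sd = 0.4302, 0.0244

# Chance-corrected agreement from the reported probabilities.
kappa = (p_obs - p_exp) / (1 - p_exp)

# Assumption: two-sided 95% normal-approximation interval, kappa +/- z * SD.
z = NormalDist().inv_cdf(0.975)
lower = kappa_reported - z * sd
upper = kappa_reported + z * sd

print(f"kappa = {kappa:.4f}")
print(f"95% CI = {lower:.4f} to {upper:.4f}")
```

Because the inputs are rounded to four decimals, the recomputed figures should agree with the reported Kappa and interval only to about three decimal places.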