8.1.3. K-Means Cluster Analysis

This procedure uses a k-means algorithm to divide M points in N dimensions into K clusters. The user selects K initial points (seeds) from the rows of the data matrix. The procedure then applies an iterative algorithm that minimises the within-cluster sum of squares. See Hartigan, J. A. and Wong, M. A. (1979), p. 100.
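The idea of seeding from data rows and iterating until the assignments stop changing can be sketched as follows. This is a minimal illustration only: it uses the simple Lloyd-style update (assign each point to its nearest centre, then recompute the centres), not the transfer-based procedure of Hartigan and Wong, and the function names, the toy data and the `max_iter=10` default are all assumptions made for the example rather than anything taken from the program.

```python
from math import fsum

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return fsum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Column-wise mean of a non-empty list of points."""
    n = len(points)
    return [fsum(col) / n for col in zip(*points)]

def k_means(data, seed_rows, max_iter=10):
    """Lloyd-style k-means sketch (hypothetical helper, max_iter >= 1).

    data      : list of M points, each a list of N numbers
    seed_rows : 0-based row indices used as the K initial centres
    Returns (labels, centres, ssq), where ssq holds the within-cluster
    sum of squares of each cluster.
    """
    centres = [list(data[i]) for i in seed_rows]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centre.
        new_labels = [
            min(range(len(centres)), key=lambda k: squared_distance(p, centres[k]))
            for p in data
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the solution is stable
        labels = new_labels
        # Update step: move each centre to the mean of its current members.
        for k in range(len(centres)):
            members = [p for p, lab in zip(data, labels) if lab == k]
            if members:
                centres[k] = centroid(members)
    ssq = [
        fsum(squared_distance(p, centres[k])
             for p, lab in zip(data, labels) if lab == k)
        for k in range(len(centres))
    ]
    return labels, centres, ssq

# Toy data (made up for illustration): 5 points in 2 dimensions, rows 0 and 2 as seeds.
toy_data = [[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5], [1.0, 0.5]]
labels, centres, ssq = k_means(toy_data, seed_rows=[0, 2])
print(labels)
print(centres)
print(ssq)
```

Because the number of seeds fixes K, passing two seed rows yields two clusters. A different choice of seeds can lead to a different final partition, which is why the seed selection is left to the user.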
The following output options are provided:

Cluster Table: The number of cases in each cluster, the corresponding percentage of all cases and the minimised within-cluster sum of squares (SSQ) are displayed. The number of clusters formed is determined by the number of initial points selected.
Cluster Membership: The cluster to which each case is assigned is displayed.
Final Cluster Centres: The k-means clustering algorithm computes a centroid for each cluster. The final configuration is displayed in a table.
Cluster Graph: This is similar to the Cluster Graph for hierarchical methods. However, here it is also possible to display the cluster centres on the same graph. The centre for a cluster will be represented by a capital letter.
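The quantities reported by these output options can all be derived from the data and the final cluster assignment. The sketch below shows one way to do so, assuming the assignment is already known. The function `cluster_summary`, its dictionary keys and the toy data are hypothetical names chosen for the example and are not part of the program.

```python
from math import fsum

def cluster_summary(data, labels):
    """Return one row per cluster: cases, percentage, SSQ and centre.

    data   : list of points (lists of numbers)
    labels : cluster label of each point, in the same order as data
    """
    total = len(data)
    rows = []
    for k in sorted(set(labels)):
        members = [p for p, lab in zip(data, labels) if lab == k]
        # Final cluster centre: the mean of each variable over the members.
        centre = [fsum(col) / len(members) for col in zip(*members)]
        # Within-cluster sum of squares: squared deviations from the centre.
        ssq = fsum((x - c) ** 2 for p in members for x, c in zip(p, centre))
        rows.append({
            "cluster": k,
            "cases": len(members),
            "percentage": 100.0 * len(members) / total,
            "ssq": ssq,
            "centre": centre,
        })
    return rows

# Toy data (made up for illustration): 5 points in 2 dimensions, 2 clusters.
toy_data = [[2.0, 4.0], [4.0, 8.0], [10.0, 1.0], [12.0, 3.0], [14.0, 2.0]]
toy_labels = [1, 1, 2, 2, 2]
rows = cluster_summary(toy_data, toy_labels)
for row in rows:
    print(row)
```

The `cases`, `percentage` and `ssq` entries correspond to the Cluster Table, `toy_labels` plays the role of the Cluster Membership list, and the `centre` entries correspond to the Final Cluster Centres.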

Example
Open MULTIVAR, select Statistics 2 → Cluster Analysis → K-Means Cluster Analysis, and select Perf, Info, Verbexp and Age (C1 to C4) as [Variable]s. Select R2, R4 and R8 as seeds at the next dialogue and accept the default number of maximum iterations to obtain the following results:
K-Means Cluster Analysis
Variables Selected: Perf, Info, Verbexp, Age
Cluster Table
| Cluster | Seed | Cases | Percentage | SSQ |
| 1 | 2 | 3 | 33.33% | 220.3800 |
| 2 | 4 | 2 | 22.22% | 109.2200 |
| 3 | 8 | 4 | 44.44% | 140.7875 |
Cluster Membership
| Observation | Cluster |
| 1 | 3 |
| 2 | 1 |
| 3 | 2 |
| 4 | 1 |
| 5 | 3 |
| 6 | 3 |
| 7 | 2 |
| 8 | 3 |
| 9 | 1 |
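The Cases and Percentage columns of the Cluster Table follow directly from the membership list. As a quick check, the short sketch below tallies the nine assignments shown above. The variable names are chosen for the example only.

```python
from collections import Counter

# Observation -> cluster, copied from the Cluster Membership output above.
membership = {1: 3, 2: 1, 3: 2, 4: 1, 5: 3, 6: 3, 7: 2, 8: 3, 9: 1}

counts = Counter(membership.values())
total = len(membership)

# Percentage of all cases in each cluster, formatted to two decimals.
percentages = {c: f"{100 * n / total:.2f}%" for c, n in counts.items()}

for cluster in sorted(counts):
    print(cluster, counts[cluster], percentages[cluster])
```

This gives 3, 2 and 4 cases for clusters 1, 2 and 3, that is 33.33%, 22.22% and 44.44% of the nine observations, in agreement with the Cluster Table.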
Final Cluster Centres
| Seed | Perf | Info | Verbexp | Age |
| 2 | 99.3333 | 10.6667 | 36.0000 | 7.8333 |
| 4 | 116.0000 | 10.5000 | 36.0000 | 7.8000 |
| 8 | 83.2500 | 8.0000 | 32.2500 | 6.6250 |
