UNISTAT - the ultimate Excel statistics add-in

8.1.3. K-Means Cluster Analysis


This procedure uses a k-means algorithm to divide M points in N dimensions into K clusters. The user selects K initial points (seeds) from the rows of the data matrix. The procedure then applies an iterative algorithm that minimises the within-cluster sum of squares. See Hartigan, J. A. and Wong, M. A. (1979), p. 100.
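The idea of seeding from data rows and iterating can be illustrated with a minimal sketch. Note that this is a plain Lloyd-style iteration (assign each point to the nearest centre, then recompute centres), not the Hartigan and Wong transfer algorithm that the procedure cites. The function name `kmeans`, the 0-based `seed_rows` argument and the `max_iter` value of 10 are assumptions made for illustration only and are not UNISTAT settings.

```python
def kmeans(data, seed_rows, max_iter=10):
    """Lloyd-style k-means sketch (illustrative, not Hartigan-Wong).

    data      : list of rows, each a list of numbers
    seed_rows : 0-based indices of the rows used as initial centres
    max_iter  : assumed iteration cap, chosen arbitrarily here
    """
    centres = [list(data[i]) for i in seed_rows]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centre by squared Euclidean distance.
        new_labels = [
            min(range(len(centres)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(row, centres[k])))
            for row in data
        ]
        if new_labels == labels:
            break  # memberships stopped changing
        labels = new_labels
        # Update step: each centre becomes the mean of its members.
        for k in range(len(centres)):
            members = [row for row, lab in zip(data, labels) if lab == k]
            if members:
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Toy data (hypothetical): two well-separated pairs, seeded from rows 0 and 2.
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
labels, centres = kmeans(points, seed_rows=[0, 2])
```

With these toy points, the first two rows end up in one cluster and the last two in the other, and each centre is the mean of its pair.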

The following output options are provided:


Cluster Table: The number of cases in each cluster, their percentages and the minimised sum of squares are displayed. The number of clusters formed is determined by the number of initial points selected.
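As a rough illustration of the quantities in this table, the sketch below computes the case count, percentage and within-cluster sum of squares for each cluster from a set of labels and centres. The function name `cluster_table` and the toy inputs are hypothetical, and the sketch assumes that the reported sum of squares is the sum of squared Euclidean distances of each case to its cluster centre.

```python
def cluster_table(data, labels, centres):
    """Summarise clusters: cases, percentage and within-cluster SSQ.

    labels are 0-based cluster indices, one per row of data.
    """
    n = len(data)
    table = []
    for k, centre in enumerate(centres):
        members = [row for row, lab in zip(data, labels) if lab == k]
        # Sum of squared deviations of every member from the centre.
        ssq = sum((x - c) ** 2 for row in members for x, c in zip(row, centre))
        table.append({
            "cluster": k + 1,
            "cases": len(members),
            "percentage": 100.0 * len(members) / n,
            "ssq": ssq,
        })
    return table


# Toy inputs (hypothetical): two clusters of two points each.
toy_data = [[0, 0], [0, 1], [10, 10], [10, 11]]
toy_labels = [0, 0, 1, 1]
toy_centres = [[0.0, 0.5], [10.0, 10.5]]
summary_rows = cluster_table(toy_data, toy_labels, toy_centres)
```

Each toy cluster holds half of the cases, and each point lies 0.5 away from its centre along one axis, so both clusters have a sum of squares of 0.5.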

Cluster Membership: The cluster to which each case is assigned is displayed.

Final Cluster Centres: The k-means clustering algorithm computes a centroid for each cluster. The final configuration is displayed in a table.

Cluster Graph: This is similar to the Cluster Graph for hierarchical methods. However, here it is also possible to display the cluster centres on the same graph. The centre for a cluster will be represented by a capital letter.


Example

Open MULTIVAR, select Statistics 2 → Cluster Analysis → K-Means Cluster Analysis, and select Perf, Info, Verbexp and Age (C1 to C4) as [Variable]s. Select R2, R4 and R8 as seeds at the next dialogue and accept the default maximum number of iterations to obtain the following results:

K-Means Cluster Analysis

Variables Selected: Perf, Info, Verbexp, Age

Cluster Table

Cluster  Seed  Cases  Percentage       SSQ
      1     2      3      33.33%  220.3800
      2     4      2      22.22%  109.2200
      3     8      4      44.44%  140.7875

 

Cluster Membership

Observation  Cluster
          1        3
          2        1
          3        2
          4        1
          5        3
          6        3
          7        2
          8        3
          9        1
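The Cases and Percentage columns of the Cluster Table follow directly from the membership list. A small sketch that tallies the memberships shown above is given here as a consistency check; the variable names are illustrative only.

```python
from collections import Counter

# Observation -> cluster, copied from the Cluster Membership output.
membership = {1: 3, 2: 1, 3: 2, 4: 1, 5: 3, 6: 3, 7: 2, 8: 3, 9: 1}

counts = Counter(membership.values())
total = len(membership)

# Cluster -> (number of cases, percentage of all cases rounded to 2 places).
summary = {c: (counts[c], round(100.0 * counts[c] / total, 2))
           for c in sorted(counts)}
print(summary)
# → {1: (3, 33.33), 2: (2, 22.22), 3: (4, 44.44)}
```

The tallies match the 3, 2 and 4 cases (33.33%, 22.22% and 44.44%) reported in the Cluster Table.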

 

Final Cluster Centres

Seed      Perf     Info  Verbexp     Age
   2   99.3333  10.6667  36.0000  7.8333
   4  116.0000  10.5000  36.0000  7.8000
   8   83.2500   8.0000  32.2500  6.6250

 
