UNISTAT - the ultimate Excel statistics add-in

5.1.4. Sample Statistics


The Variable Selection Dialogue for this procedure offers three types of data to analyse (see 5.0.2. One-Sample Data Types). A text box is also provided on this dialogue to enter the size of the total population from which the sample is drawn. The default value of 0 means that the total population is not known and the program assumes an infinite population. A non-zero population value affects only the standard error of the mean in the output.

The Output variables in rows check box allows you to transpose the output matrix. This will be useful when you wish to use the output from this procedure (such as means and standard errors) for further analysis in other procedures.

Output for the different data options differs slightly. For example, minimum, maximum and range are not available for grouped data.

For ungrouped data, the method used in computing the median, lower and upper quartiles is indicated in the output. This can be one of the six methods described in the previous section 5.1.3.1. Quantile Methods.


The following statistics can be calculated for ungrouped (option 1) and frequency and grouped data (options 2 and 3). Let n be the number of valid observations (i.e. excluding missing values) and fi the frequency of data point Xi given in column 2. Note that for ungrouped data fi = 1, i = 1, …, n.

Size: Number of cases (rows) in the sample, including missing values.

Missing: Number of missing cases in the sample. In frequency and grouped data a case is considered missing when either or both of value and frequency are missing.

Total Frequency:

\[ N = \sum_{i=1}^{n} f_i \]

      Note that N = n for ungrouped data.

Mean: The weighted arithmetic mean is:

\[ \bar{X} = \frac{1}{N} \sum_{i=1}^{n} f_i X_i \]

Geometric Mean: The weighted geometric mean is:

\[ G = \left( \prod_{i=1}^{n} X_i^{f_i} \right)^{1/N} \]

Harmonic Mean: The weighted harmonic mean is:

\[ H = \frac{N}{\sum_{i=1}^{n} f_i / X_i} \]

      The following relationship should hold if Xi ≥ 0, i = 1, …, n:

\[ \text{Harmonic Mean} \le \text{Geometric Mean} \le \text{Mean} \]
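The three weighted means can be sketched in Python for frequency data. This is only an illustration under stated assumptions, not UNISTAT's own code: the function name `weighted_means` and the sample values are hypothetical, and the formulas are the textbook weighted forms with N equal to the sum of the frequencies.

```python
import math

def weighted_means(values, freqs):
    """Return (arithmetic, geometric, harmonic) means of frequency data.

    Assumes every value is strictly positive, which the geometric and
    harmonic means require.
    """
    total_freq = sum(freqs)  # N, the total frequency
    pairs = list(zip(values, freqs))
    arithmetic = sum(f * x for x, f in pairs) / total_freq
    geometric = math.exp(sum(f * math.log(x) for x, f in pairs) / total_freq)
    harmonic = total_freq / sum(f / x for x, f in pairs)
    return arithmetic, geometric, harmonic

# Hypothetical frequency data: the value 2 occurs twice, 1 and 4 once each.
a, g, h = weighted_means([1.0, 2.0, 4.0], [1, 2, 1])
print(a, g, h)
print("ordering holds:", h <= g <= a)
```

For ungrouped data the same function applies with every frequency set to 1.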

Median: For ungrouped data, this is computed using the quantile method selected in step two of the Quantiles (Percentiles) procedure, as described in section 5.1.3.1. Quantile Methods.

      For frequency and grouped data, the value and frequency columns are both sorted in ascending order of value. For frequency data, half of the total frequency is found and the median is calculated as above. For grouped data, the median is calculated by interpolation as:

\[ \text{Median} = L + \frac{\frac{N}{2} - \sum f}{f_m} \, C \]

where:

·  L is the lower class boundary of the class containing the median,

·  the summation term is the sum of frequencies of all classes lower than the median class,

·  f_m is the frequency of the median class,

·  C is the size of the median class interval and

·  N is the total frequency as defined above.
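The interpolation for grouped data can be sketched as follows. This is a minimal sketch assuming classes of uniform width listed in ascending order; the function name `grouped_median` and the class data are hypothetical, and the frequency of the median class is used as the divisor, as in the usual textbook form.

```python
def grouped_median(lower_bounds, freqs, width):
    """Interpolated median for grouped data with a uniform class width.

    lower_bounds: lower class boundaries in ascending order (L for each class)
    freqs:        class frequencies
    width:        class interval size (C)
    """
    total_freq = sum(freqs)   # N
    half = total_freq / 2
    cumulative = 0            # sum of frequencies of all lower classes
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cumulative + f >= half:
            # This class contains the median: interpolate within it.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("no class with positive frequency")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
print(grouped_median([0, 10, 20, 30], [5, 10, 20, 5], 10))
```

With these figures N = 40, the median class is 20-30, the lower classes hold 15 observations, and the result is 20 + (20 - 15) / 20 × 10 = 22.5.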

Lower Quartile: Calculated in the same way as the median, but for the 25% quantile instead of the 50% quantile.

Upper Quartile: Calculated in the same way as the median, but for the 75% quantile instead of the 50% quantile.

Interquartile Range: Difference between upper and lower quartiles.

Minimum: Smallest observed value in data (not available for grouped data).

Maximum: Greatest observed value in data (not available for grouped data).

Range: Difference between maximum and minimum values (not available for grouped data).

Sum: The weighted sum is:

\[ \sum_{i=1}^{n} f_i X_i \]

Sum of Squares: The weighted sum of squares is:

\[ \sum_{i=1}^{n} f_i X_i^2 \]

Root Mean Square (Quadratic mean):

\[ \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i X_i^2} \]

Unbiased Variance:

\[ s^2 = \frac{1}{N - 1} \sum_{i=1}^{n} f_i (X_i - \bar{X})^2 \]

Unbiased Standard Deviation:

\[ s = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{n} f_i (X_i - \bar{X})^2} \]

Standard Error of Mean:

\[ \frac{s}{\sqrt{N}} \]

where s is the Unbiased Standard Deviation.

Standard Error with Finite Population Correction: Available only when total population is known and it is greater than the total frequency.


Coefficient of Variation:

\[ \frac{1}{\bar{X}} \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i (X_i - \bar{X})^2} \]

Variance:

\[ m_2 = \frac{1}{N} \sum_{i=1}^{n} f_i (X_i - \bar{X})^2 \]

Standard Deviation:

\[ \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i (X_i - \bar{X})^2} \]

Sheppard’s Correction for 2nd Moment (Variance): Available only for grouped data:

\[ m_2 - \frac{C^2}{12} \]

where C is the size of the uniform class interval and m_2 is the uncorrected Variance.

Mean Deviation:

\[ \frac{1}{N} \sum_{i=1}^{n} f_i \left| X_i - \bar{X} \right| \]

3rd Moment About the Mean:

\[ m_3 = \frac{1}{N} \sum_{i=1}^{n} f_i (X_i - \bar{X})^3 \]

4th Moment About the Mean:

\[ m_4 = \frac{1}{N} \sum_{i=1}^{n} f_i (X_i - \bar{X})^4 \]

Unbiased 3rd Moment:

\[ \frac{N^2}{(N - 1)(N - 2)} \, m_3 \]

where m_3 is the 3rd Moment About the Mean.

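Sheppard's Correction for the 4th Moment: Available only for grouped data:

\[ m_4 - \frac{1}{2} C^2 m_2 + \frac{7}{240} C^4 \]

where C is the size of the uniform class interval, and m_2 and m_4 are the uncorrected Variance and 4th Moment About the Mean.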
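Both Sheppard corrections are simple adjustments of the uncorrected moments. The sketch below assumes the standard textbook forms (subtract C²/12 from the 2nd moment; subtract C²·m2/2 and add 7C⁴/240 for the 4th moment). The function names and input figures are hypothetical and are not taken from UNISTAT.

```python
def sheppard_variance(m2, width):
    """Corrected 2nd moment for grouped data with uniform class width C."""
    return m2 - width**2 / 12

def sheppard_fourth_moment(m4, m2, width):
    """Corrected 4th moment; m2 and m4 are the uncorrected moments about the mean."""
    return m4 - 0.5 * width**2 * m2 + 7 * width**4 / 240

# Hypothetical uncorrected moments for classes of width 2.
print(sheppard_variance(4.0, 2.0))
print(sheppard_fourth_moment(50.0, 4.0, 2.0))
```

The corrections shrink as the class interval narrows, since both adjustment terms vanish when C approaches 0.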

Moment Coefficient of Skewness:

\[ \frac{m_3}{m_2^{3/2}} \]

where m_2 is the Variance and m_3 is the 3rd Moment About the Mean.

An alternative definition of skewness is given in section 5.1.1. Summary Statistics.

Moment Coefficient of Kurtosis:

\[ \frac{m_4}{m_2^{2}} \]

where m_2 is the Variance and m_4 is the 4th Moment About the Mean.

An alternative definition of kurtosis is given in section 5.1.1. Summary Statistics.

Pearson’s Second Coefficient of Skewness:

\[ \frac{3 \, (\text{Mean} - \text{Median})}{\text{Standard Deviation}} \]
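The moment-based shape statistics can be sketched together, since they all derive from the moments about the mean. This is a minimal sketch, not UNISTAT's implementation: the function name `shape_statistics` and the sample data are hypothetical, the median is passed in rather than computed, and the divisor-N standard deviation is assumed in Pearson's coefficient. These forms are consistent with the figures printed in Example 1.

```python
import math

def shape_statistics(values, freqs, median):
    """Skewness, kurtosis, unbiased 3rd moment and Pearson's second coefficient.

    Assumes a total frequency greater than 2 and a non-zero variance.
    """
    total_freq = sum(freqs)  # N
    pairs = list(zip(values, freqs))
    mean = sum(f * x for x, f in pairs) / total_freq

    def central_moment(k):
        # k-th moment about the mean, with divisor N
        return sum(f * (x - mean) ** k for x, f in pairs) / total_freq

    m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
    return {
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "unbiased_m3": total_freq ** 2 * m3 / ((total_freq - 1) * (total_freq - 2)),
        "pearson2": 3 * (mean - median) / math.sqrt(m2),
    }

# Hypothetical frequency data: 1 occurs twice, 2 and 3 once each; median 1.5.
for name, value in shape_statistics([1, 2, 3], [2, 1, 1], 1.5).items():
    print(f"{name:<12}{value:8.4f}")
```

Note that the kurtosis here is not reduced by 3, so a normal distribution gives a value near 3 rather than 0.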

Example 1: Ungrouped data

Open PARTEST and select Statistics 1 → Descriptive Statistics → Sample Statistics. Select Haemoglobin, Platelets, log Leucocytes, and Systolic BP (C10 to C13) as [Variable]s, uncheck the Output variables in rows box and click [Finish].

Sample Statistics

Quantile Method: Simple Average

                                     Haemoglobin       Platelets  log Leucocytes     Systolic BP
Size                                     10.0000         10.0000         10.0000         10.0000
Missing                                   0.0000          0.0000          0.0000          0.0000
Mean                                     -0.5300         -0.0300         -0.5900          3.1000
Geometric Mean                                 *               *               *               *
Harmonic Mean                                  *               *               *               *
Median                                   -0.6000          0.1000         -0.6500          2.0000
Lower Quartile                           -1.5000         -1.0000         -1.6000         -2.0000
Upper Quartile                            0.0000          0.6000          0.9000          8.0000
Interquartile Range                       1.5000          1.6000          2.5000         10.0000
Minimum                                  -2.4000         -2.2000         -3.2000         -6.0000
Maximum                                   2.3000          1.9000          1.7000         14.0000
Range                                     4.7000          4.1000          4.9000         20.0000
Sum                                      -5.3000         -0.3000         -5.9000         31.0000
Sum of Squares                           22.0700         13.3900         25.1700        437.0000
Root Mean Square                          1.4856          1.1572          1.5865          6.6106
Unbiased Variance                         2.1401          1.4868          2.4099         37.8778
Unbiased Standard Deviation               1.4629          1.2193          1.5524          6.1545
Standard Error of Mean                    0.4626          0.3856          0.4909          1.9462
Coefficient of Variation                 -2.6186        -38.5588         -2.4961          1.8834
Variance                                  1.9261          1.3381          2.1689         34.0900
Standard Deviation                        1.3878          1.1568          1.4727          5.8387
Mean Deviation                            1.1500          0.9020          1.2500          4.9200
3rd Moment About Mean                     1.3179         -0.4318         -0.1544         69.6720
4th Moment About Mean                     9.2971          4.2938          9.5652       2527.7857
Unbiased 3rd Moment                       1.8304         -0.5998         -0.2144         96.7667
Moment Coefficient of Skewness            0.4930         -0.2790         -0.0483          0.3500
Moment Coefficient of Kurtosis            2.5060          2.3981          2.0334          2.1751
Pearson's Skewness Coefficient            0.1513         -0.3371          0.1222          0.5652
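Several entries of the Example 1 output can be cross-checked from Size, Sum, Sum of Squares and Median alone. The sketch below does this for the Haemoglobin column; the variable names are illustrative, and the figures are copied from the output rather than recomputed from the PARTEST data.

```python
import math

# Figures copied from the Haemoglobin column of the Example 1 output.
n, total, sum_sq, median = 10, -5.3, 22.07, -0.6

mean = total / n
ss_dev = sum_sq - n * mean**2      # sum of squared deviations from the mean
rms = math.sqrt(sum_sq / n)        # Root Mean Square
unbiased_var = ss_dev / (n - 1)    # divisor N - 1
var = ss_dev / n                   # divisor N
unbiased_sd = math.sqrt(unbiased_var)
sd = math.sqrt(var)
se_mean = unbiased_sd / math.sqrt(n)
cv = sd / mean                     # uses the divisor-N standard deviation
pearson2 = 3 * (mean - median) / sd

for name, value in [
    ("Root Mean Square", rms),
    ("Unbiased Variance", unbiased_var),
    ("Unbiased Standard Deviation", unbiased_sd),
    ("Standard Error of Mean", se_mean),
    ("Coefficient of Variation", cv),
    ("Variance", var),
    ("Standard Deviation", sd),
    ("Pearson's Skewness Coefficient", pearson2),
]:
    print(f"{name:<32}{value:10.4f}")
```

The check shows that the Coefficient of Variation and Pearson's coefficient both use the Standard Deviation with divisor N, not the unbiased one.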

Example 2: Variables in rows

Continuing from the last example, go back to Variable Selection Dialogue, check the Output variables in rows box and click [Next]. From the Output Options Dialogue select only the last three options and click [Finish].

 

Sample Statistics

                  Moment Coefficient of Skewness  Moment Coefficient of Kurtosis  Pearson's Skewness Coefficient
Haemoglobin                               0.4930                          2.5060                          0.1513
Platelets                                -0.2790                          2.3981                         -0.3371
log Leucocytes                           -0.0483                          2.0334                          0.1222
Systolic BP                               0.3500                          2.1751                          0.5652

Example 3: Frequency data

Open TIMESER, select Statistics 1 → Descriptive Statistics → Sample Statistics and select the second data option Column 1 contains Data and Column 2 contains Frequencies. Select Surface Area (C13) as [Column 1] and Blemishes (C14) as [Column 2] and enter 150 in the Total Population box. The following results are obtained:

Sample Statistics

Surface Area: contains data, Blemishes contains frequencies

                                        Surface Area
Size                                         20.0000
Missing                                       0.0000
Total Frequency                              94.0000
Total Population                            150.0000
Mean                                          0.8462
Geometric Mean                                0.8265
Harmonic Mean                                 0.8070
Root Mean Square                              0.8653
Unbiased Variance                             0.0330
Unbiased Standard Deviation                   0.1817
Standard Error of Mean                        0.0187
Standard Error with Finite Population         0.0115
Coefficient of Variation                      0.2136
Variance                                      0.0327
Standard Deviation                            0.1807
Mean Deviation                                0.1443
3rd Moment About Mean                         0.0004
4th Moment About Mean                         0.0017
Unbiased 3rd Moment                           0.0004
Moment Coefficient of Skewness                0.0635
Moment Coefficient of Kurtosis                1.6210
Pearson's Skewness Coefficient                0.1024
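The two standard errors in the Example 3 output can be related to each other with a short calculation. The finite population correction factor used here, sqrt((P − N) / (P − 1)), is an assumption: it is one common textbook form and reproduces the printed value to four decimals, but the printed rounding cannot confirm that UNISTAT uses exactly this factor. Variable names are illustrative and the inputs are copied from the output.

```python
import math

# Figures copied from the Example 3 output.
total_freq = 94        # Total Frequency (N)
population = 150       # Total Population (P)
unbiased_sd = 0.1817   # Unbiased Standard Deviation
mean, geometric, harmonic = 0.8462, 0.8265, 0.8070

# Standard error of the mean uses the total frequency, not the row count.
se_mean = unbiased_sd / math.sqrt(total_freq)

# ASSUMPTION: correction factor sqrt((P - N) / (P - 1)); not stated in the source.
se_fpc = se_mean * math.sqrt((population - total_freq) / (population - 1))

# The three means should be ordered harmonic <= geometric <= arithmetic.
means_ordered = harmonic <= geometric <= mean

print(round(se_mean, 4), round(se_fpc, 4), means_ordered)
```

Because the sample covers a large share of the population (94 of 150), the corrected standard error is noticeably smaller than the uncorrected one.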