### 5.3.1. Box-Whisker, Dot and Bar Plots

This procedure combines box plots with dot plots and error bar plots. Multisample data can be entered as multiple columns or as data columns classified by factor columns. If at least one factor is selected, a further dialogue pops up asking for the combination of factor levels to be included. The data is plotted on the Y-axis, where the Scale Type can be linear, log base 10, log base e, log to any user-defined base, logit, probit, gompit (cloglog) or loglog, and the categories are plotted on the X-axis. Although an unlimited number of data series can be plotted, the properties of only the first nine can be controlled individually in the Data Series dialogue, which is accessed either from the Edit → Data Series menu or by double-clicking on the graph area. The remaining series repeat the properties of the first nine in a circular fashion, so that the tenth series takes the properties of the first, the eleventh those of the second, and so on. The Apply to all variables check box applies the current variable’s settings to all selected variables.

For outlying points on a Box and Whisker Plot, the symbol type, symbol size, colour and Point Labels can be controlled individually for each data series.

The Edit → Width / Notch / Dots dialogue can be used to control the statistical parameters represented on the graph. The three check boxes in the Type panel allow drawing any combination of Box and Whisker Plot, Dot Plot and Error Bar Plot on the same graph. The other three frames on this dialogue are used to control the individual characteristics of each type of plot. The Confidence Level text box is included in this dialogue for the sake of convenience, although it is also available in the Variable Selection Dialogue. Changes made on this dialogue will apply to all data series.

#### 5.3.1.1. Box and Whisker Plot

A box and whisker plot conveys the following information:

**Bottom of the box:** Lower quartile.

**Middle of the box:** Median.

**Top of the box:** Upper quartile.

**Box Width:** The variable box width conveys information
about the size of the sample. See below.

**Notch:** When there is a notch, it conveys information
about the dispersion of data about the median. See below.

**Lower Whisker:** Lower adjacent value. Any values below
this are outliers and are plotted individually. See below for alternative
methods.

**Upper Whisker:** Upper adjacent value. Any values above
this are outliers and are plotted individually. See below for alternative
methods.

On the Width / Notch / Dots dialogue, the first group of controls concerns the Box and Whisker plots.

**Width:** The width of boxes can be used to convey
information about sample sizes:

**Fixed:** No size information.

**Sqr(n):** The widths are proportional to the
square root of their sample size.

**Log(n):** The widths are proportional to the
base-10 logarithm of their sample size.

**n:** The widths are proportional to their sample
size.
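The effect of the four width options can be illustrated with a short Python sketch. The function name `relative_box_widths` and the normalisation (the widest box is given width 1.0) are assumptions made for illustration only; the text above states only what the widths are proportional to.

```python
import math


def relative_box_widths(sample_sizes, mode="fixed"):
    """Return box widths scaled so that the widest box has width 1.0.

    The scaling to 1.0 is an illustrative assumption, not a documented rule.
    """
    if mode == "fixed":
        raw = [1.0 for _ in sample_sizes]
    elif mode == "sqrt":
        raw = [math.sqrt(n) for n in sample_sizes]
    elif mode == "log":
        raw = [math.log10(n) for n in sample_sizes]  # requires n >= 1
    elif mode == "n":
        raw = [float(n) for n in sample_sizes]
    else:
        raise ValueError(f"unknown width mode: {mode}")
    largest = max(raw)
    if largest == 0:  # e.g. log mode when every sample has size 1
        return raw
    return [w / largest for w in raw]


sizes = [10, 100, 1000]
for mode in ("fixed", "sqrt", "log", "n"):
    print(mode, [round(w, 3) for w in relative_box_widths(sizes, mode)])
```

With sample sizes that differ by orders of magnitude, the n option makes small samples almost invisible, whereas the Sqr(n) and Log(n) options compress the differences.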

**Notch:** The extent of notches represents the following
dispersion measures:

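**None:** A notch is not drawn.

**t-interval:**

$$\pm\, t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$

where $t_{\alpha/2,\,n-1}$ is the critical value from the t-distribution with n – 1 degrees of freedom, $s$ is the sample standard deviation and $n$ is the sample size.

**Z-interval:**

$$\pm\, z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where $z_{\alpha/2}$ is the critical value from the standard normal distribution.

**Standard Error:**

$$\pm\,\frac{s}{\sqrt{n}}$$

**Standard Deviation:** As above, but with the sample
standard deviation.

$$\pm\, s$$

**Variance:** As above, but with the sample variance.

$$\pm\, s^{2}$$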

**Robust Confidence Interval:** The robust standard error (SE*) is defined as:

$$SE^{*} = \frac{1.25\,\mathrm{IQR}}{1.35\,\sqrt{n}}$$

where IQR is the inter-quartile range and n is the sample size. The robust confidence interval is then defined as:

$$\text{Median} \pm z_{\alpha/2}\,SE^{*}$$

where $z_{\alpha/2}$ is the critical value from the standard normal distribution (see McGill, R., Tukey, J. W. and Larsen, W. A. 1978).
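The dispersion measures listed above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the program's own code: the function name `notch_half_widths` is hypothetical, the quartiles are computed with the `inclusive` rule of `statistics.quantiles` (the quartile rule used by the program is not stated here), and the robust standard error is assumed to take the form 1.25 · IQR / (1.35 · √n) given by McGill et al. (1978). The t-interval is omitted because the standard library has no t-distribution quantile function.

```python
import math
import statistics
from statistics import NormalDist


def notch_half_widths(data, confidence=0.95):
    """Return the half-width implied by each notch option (illustrative)."""
    n = len(data)
    s = statistics.stdev(data)             # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)                  # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    # Assumption: 'inclusive' quartile rule; other rules give slightly different IQRs.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    robust_se = 1.25 * (q3 - q1) / (1.35 * math.sqrt(n))  # assumed McGill et al. form
    return {
        "z_interval": z * se,
        "standard_error": se,
        "standard_deviation": s,
        "variance": s ** 2,
        "robust_ci": z * robust_se,
    }


sample = [2, 4, 4, 5, 7, 9, 10, 12]
for name, half_width in notch_half_widths(sample).items():
    print(f"{name:20s} {half_width:.4f}")
```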

**Whiskers:** These convey information about the
dispersion of data. Any values lying beyond the whiskers are
called outliers.

**None:** Neither whiskers nor outliers are plotted.

**Tukey:** This is the default method. The lower
whisker corresponds to the maximum of (i) the lower quartile minus 1.5 times the inter-quartile range and (ii) the minimum observation, and the
upper whisker to the minimum of (i) the upper quartile plus 1.5 times the
inter-quartile range and (ii) the maximum observation.

**Min / Max:** Whiskers correspond to the minimum
and maximum of the data series.

**Quantiles:** Whiskers correspond to the lower and
upper 95% quantiles by default. The significance level can be changed by the
user.
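The Tukey rule described above can be written as a short Python sketch. The function name `tukey_whiskers` is hypothetical, and the quartiles are computed with the `inclusive` rule of `statistics.quantiles`, which is an assumption; a different quartile rule will move the whiskers slightly.

```python
import statistics


def tukey_whiskers(data):
    """Return (lower whisker, upper whisker, outliers) under the Tukey rule."""
    # Assumption: 'inclusive' quartile rule from the standard library.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = max(q1 - 1.5 * iqr, min(data))   # never below the smallest observation
    upper = min(q3 + 1.5 * iqr, max(data))   # never above the largest observation
    outliers = [x for x in data if x < lower or x > upper]
    return lower, upper, outliers


sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]
lower, upper, outliers = tukey_whiskers(sample)
print("lower whisker:", lower)
print("upper whisker:", upper)
print("outliers:", outliers)
```

In this sample the lower whisker stops at the minimum observation, while the upper whisker stops at the upper quartile plus 1.5 times the inter-quartile range, leaving the largest value to be plotted individually as an outlier.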

#### 5.3.1.2. Dot Plot

The second frame contains controls for dot plots.

**Type:** The dots can be plotted in four different ways.
The first two options classify the observations into a specified number of
classes, as in a histogram. The last two options plot the dots at
their actual values, rather than classifying them into groups.

**Classified – left:**
Observations will be classified into groups and the dots will be
left-justified.

**Classified – centred:**
Observations will be classified into groups and the dots will be centred.

**Scatter – line:** The actual
values of observations will be plotted along a vertical line.

**Scatter – wide:** The actual
values of observations will be plotted and the overlapping dots will be
separated as much as possible.

**Number of Classes:** The classified dot plots are
essentially histograms and this parameter controls the number of classes (the
default is 20). The size of dots can be adjusted from the Edit → Data Series → Symbol panel to obtain
the desired appearance.
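The classified options can be pictured with the following Python sketch. It assumes equal-width classes spanning the minimum to the maximum observation, and it assumes that left-justified dots are stacked outwards from one side while centred dots are placed symmetrically about the category position. Both function names and both assumptions are illustrative; the class boundaries and dot spacing actually used by the program are not stated above.

```python
def classify(values, n_classes=20):
    """Count observations in equal-width classes (assumed binning rule)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes or 1.0   # fall back to 1.0 when all values are equal
    counts = [0] * n_classes
    for v in values:
        # The maximum observation is placed in the last class.
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    return counts


def dot_offsets(count, centred=False):
    """Horizontal offsets, in dot widths, for the dots of one class."""
    if centred:
        return [i - (count - 1) / 2 for i in range(count)]
    return list(range(count))              # left-justified: 0, 1, 2, ...


observations = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for count in classify(observations, n_classes=5):
    print(count, dot_offsets(count), dot_offsets(count, centred=True))
```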

#### 5.3.1.3. Error Bar Plot

**Central Tendency and Confidence Interval:** The
following central tendency measures and their confidence limits can be drawn.

· Mean: t-interval, Z-interval, Standard Error, Standard Deviation, Variance

· Geometric Mean: t-interval, Z-interval

· Harmonic Mean: t-interval, Z-interval

· Median: Quartiles, 95% Quantile, Robust Confidence Interval

When Central Tendency is Mean and either the Standard Error or the Standard Deviation option is selected, a dialogue pops up asking for a multiplier.

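Error bars for standard error will then be calculated as:

$$\bar{x} \pm k\,\frac{s}{\sqrt{n}}$$

and for standard deviation:

$$\bar{x} \pm k\,s$$

where $\bar{x}$ is the mean, $s$ is the sample standard deviation, $n$ is the sample size and k is the multiplier defined by the user.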
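As a minimal sketch of how the multiplier enters the calculation, the following Python function returns the lower and upper ends of an error bar about the mean. The function name `mean_error_bar` and its parameters are hypothetical, and the sample standard deviation with the n – 1 divisor is assumed.

```python
import math
import statistics


def mean_error_bar(data, k=1.0, measure="se"):
    """Return (lower, upper) limits of an error bar centred on the mean."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)             # assumed: n - 1 divisor
    if measure == "se":
        half = k * s / math.sqrt(n)        # k standard errors
    elif measure == "sd":
        half = k * s                       # k standard deviations
    else:
        raise ValueError(f"unknown measure: {measure}")
    return mean - half, mean + half


sample = [1, 2, 3, 4, 5]
print("mean ± 2 SE:", mean_error_bar(sample, k=2, measure="se"))
print("mean ± 1 SD:", mean_error_bar(sample, k=1, measure="sd"))
```

A multiplier of about 2 with the Standard Error option gives bars close to a 95% Z-interval, which is one common reason for changing the default.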