UNISTAT - the ultimate Excel statistics add-in

9.1. Box-Jenkins ARIMA

9.1.0. Overview

ARIMA stands for Auto Regressive Integrated Moving Average. The name reflects the procedure: the model fits autoregressive and moving average parameters to a transformed (differenced) time series and then integrates back to the original scale before forecasts are generated. The differencing transformation makes use of B, the backshift operator, which shifts the subscript of a time series observation backwards in time by one period.
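In symbols, the backshift operator and the differencing built from it can be written as follows. The notation y(t) for the series after any log or power transformation and X(t) for the differenced series is an assumption made here for illustration; d, D and s are the differencing orders and seasonal period entered in the Differencing Input Options dialogue.

```latex
B\,y_t = y_{t-1}, \qquad B^{k} y_t = y_{t-k}

(1 - B)\,y_t = y_t - y_{t-1}, \qquad (1 - B^{s})\,y_t = y_t - y_{t-s}

X_t = (1 - B)^{d}\,(1 - B^{s})^{D}\,y_t
```

For example, with d = 0, D = 1 and s = 12 this reduces to X(t) = y(t) - y(t-12).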


Select the column to analyse by clicking on [Dependent]. This column should not contain any missing values.

9.1.1. Differencing Input Options

The Differencing Input Options dialogue is for entering the parameter values used in transforming the original data. The data can be transformed by differencing, taking logs, raising to a power and adding an offset.

Nonseasonal Differencing: The degree of differencing across whole series (d).

Seasonal Differencing: The degree of differencing between points with seasonal period units apart in the series (D).

Seasonal Period: The number of time units per season (s).

Lambda: This is a coded value which determines any logarithmic or power transformation. This is performed before any differencing:

= 1 No effect: X

= 0 Log of series: Log(X)

else Power transformation: X^Lambda

Offset: The value of offset is added to every value in the time series. This makes it possible to take logarithms of a series that includes negative values.

Maximum Lag: This is the maximum lag to calculate in correlation displays.
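The effect of these parameters can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function names are hypothetical, the logarithm is assumed to be the natural log, and the offset is assumed to be added before the log or power step. The input series is a placeholder, not the Room Averages data.

```python
import math

def transform(series, lam=1.0, offset=0.0):
    """Add the offset, then apply the transformation coded by Lambda."""
    shifted = [x + offset for x in series]
    if lam == 1:
        return shifted                          # Lambda = 1: no effect
    if lam == 0:
        return [math.log(x) for x in shifted]   # Lambda = 0: log (natural log assumed)
    return [x ** lam for x in shifted]          # otherwise: power transformation

def difference(series, lag=1, times=1):
    """Apply the operator (1 - B^lag) to the series `times` times."""
    out = list(series)
    for _ in range(times):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out

def prepare(series, d=0, D=0, s=1, lam=1.0, offset=0.0):
    """Transform first, then difference seasonally (D, s) and nonseasonally (d)."""
    z = transform(series, lam, offset)
    z = difference(z, lag=s, times=D)
    return difference(z, lag=1, times=d)

# Placeholder series of 168 points with d = 0, D = 1, s = 12, Lambda = 0.
demo = prepare([float(v) for v in range(1, 169)], d=0, D=1, s=12, lam=0)
print(len(demo))  # 168 - 12 = 156 values remain
```

Each round of seasonal differencing shortens the series by s points and each round of nonseasonal differencing by one point, which is why fewer values are available for the correlation displays than in the original column.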

You can also apply any transformation to the series yourself during the data preparation phase, before running ARIMA. In this case, forecasts will be generated in the transformed scale. Transformations made within ARIMA, by contrast, are reversed to give forecasts in the original scale.
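As a rough illustration of what reversing the log or power step involves, the following sketch maps a forecast and a symmetric interval from the transformed scale back to the original scale. The function names are hypothetical, natural logs are assumed, and the reversal of differencing is not shown; the program's own procedure may differ in detail.

```python
import math

def back_transform(z, lam=1.0, offset=0.0):
    """Undo the Lambda-coded transformation, then subtract the offset."""
    if lam == 1:
        x = z
    elif lam == 0:
        x = math.exp(z)         # assumes the forward step used natural logs
    else:
        x = z ** (1.0 / lam)    # assumes z is positive
    return x - offset

def back_transform_interval(point, half_width, lam=1.0, offset=0.0):
    """Map a symmetric interval on the transformed scale to the original scale."""
    ends = [back_transform(point - half_width, lam, offset),
            back_transform(point + half_width, lam, offset)]
    return back_transform(point, lam, offset), min(ends), max(ends)

# Hypothetical log-scale forecast and half-width (illustrative numbers only).
forecast, lower, upper = back_transform_interval(math.log(840.0), 0.04, lam=0)
```

With a log transformation the interval is symmetric on the log scale but not on the original scale: the upper limit lies further from the forecast than the lower limit does.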

The program then displays a dialogue with two options. The Fit Model option proceeds to the next step in the analysis (where the ARIMA model is selected), and the Differencing Output Options option gives access to the intermediate results.

The Fit Model option should not be selected until the data series has been transformed into a stationary series.

9.1.2. Differencing Output Options


Differencing Output Options should be used to help set the transformation values before the model is estimated. You may move between this dialogue and the second dialogue a number of times before the series is transformed appropriately. The input series for the ARIMA model must be stationary. A stationary series has a mean, variance and autocorrelation structure that do not change over time. In addition, the Autocorrelation Function and Partial Autocorrelation Function give an indication of the number of parameters to fit to the model.

Character Plot of Series: The transformed series is displayed in the form of a character plot together with the values.

Autocorrelation Function: The Autocorrelation Function (ACF) displays the autocorrelations of the transformed series. The autocorrelation at lag k is the correlation between points in the series that are k periods apart. The number of autocorrelations displayed is controlled by Maximum Lag in the Differencing Input Options dialogue.

      In ARIMA modelling the series is required to be stationary. If the ACF either cuts off fairly quickly or dies down fairly quickly, then the time series values should be considered stationary. If the ACF dies down extremely slowly, then the time series values should be considered non stationary and some differencing will be required.

      The ACF should be examined to decide which model to fit in the final stages.
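A minimal sketch of how sample autocorrelations and their standard errors can be computed is given here. The standard errors use Bartlett's approximation; that the program uses exactly this formula is an assumption, although the standard errors printed in the Room Averages example later in this section are consistent with it when n = 156 (168 observations less 12 lost to seasonal differencing). Function names are hypothetical.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(e * e for e in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def bartlett_se(r, n):
    """Standard error at each lag, assuming zero autocorrelation from that lag on."""
    se, cum = [], 0.0
    for rk in r:
        se.append(math.sqrt((1.0 + 2.0 * cum) / n))
        cum += rk * rk      # squared autocorrelations at all earlier lags
    return se

# Autocorrelations at lags 1 to 5 as printed in the Room Averages example.
r = [0.1933, 0.0244, -0.2442, -0.1515, -0.2119]
se = bartlett_se(r, 156)
```

An autocorrelation that is large relative to its standard error is usually read as a spike when deciding whether the ACF cuts off or dies down.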

Partial Autocorrelation Function: The Partial Autocorrelation Function (PACF) displays the partial autocorrelations of the transformed series. The partial autocorrelation at lag k is the correlation between points in the series that are k periods apart, with the effects of the intervening observations eliminated. At lag 1 there are no intervening observations, hence the partial autocorrelation at lag 1 is equal to the autocorrelation at lag 1. The number of partial autocorrelations displayed is controlled by Maximum Lag in the Differencing Input Options dialogue.

      The PACF should be examined to decide which model to fit in the final stages.
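One standard way to obtain partial autocorrelations from autocorrelations is the Durbin-Levinson recursion, sketched below. Whether the program uses this recursion is an assumption; the function name is hypothetical. Applied to the first autocorrelations printed in the Room Averages example, it gives values that agree with the printed partial autocorrelations to about four decimal places.

```python
def pacf_from_acf(r):
    """Partial autocorrelations at lags 1..len(r) from autocorrelations r (lags 1..)."""
    pacf = []
    prev = []                       # coefficients phi(k-1, 1..k-1) of the previous step
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]           # lag 1: equal to the autocorrelation at lag 1
            cur = [phi_kk]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Autocorrelations at lags 1 to 5 as printed in the Room Averages example.
partial = pacf_from_acf([0.1933, 0.0244, -0.2442, -0.1515, -0.2119])
```

The first element of the result is always the lag 1 autocorrelation itself, in line with the remark above.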

Hi-Res Plot of Series: This displays a graphical view of the transformed series.

Example

The data are from Table 3.1 on p. 83 of Bowerman, Bruce L. & Richard T. O’Connell (1987): monthly Hotel Room Averages for 1973-1986.

Open TIMESER, select Statistics 2 → ARIMA and select Room Averages (C1) as [Variable]. At the Differencing Input Options dialogue enter:

·      0 Nonseasonal Differencing

·      1 Seasonal Differencing

·      12 Seasonal Period

·      0 Lambda (0 Log(X), else X^Lambda)

·      0 Offset (minimum 480)

·      25 Maximum Lag

Check all differencing output option boxes to obtain the following results:

ARIMA

Character Plot of Series: Room Averages

Row     X(t)    -0.0263                                               0.0831
  1   0.0334                                    *
  2   0.0020                   *
  3   0.0465                                           *
  4   0.0357                                     *
  5   0.0484                                            *
  …

 

Autocorrelations: Room Averages

Lag   Correlation   Standard Error   -1.0000                                          1.0000
  1        0.1933           0.0801                            (   *****)*
  2        0.0244           0.0830                            (   **   )
  3       -0.2442           0.0830                         ***(****    )
  4       -0.1515           0.0875                           (*****     )
  5       -0.2119           0.0892                          *(*****     )

 

Partial Autocorrelations: Room Averages

Lag   Correlation   Standard Error   -1.0000                                          1.0000
  1        0.1933           0.0801                            (   *****)*
  2       -0.0135           0.0801                            (   *    )
  3       -0.2560           0.0801                         ***(****    )
  4       -0.0622           0.0801                            (  **    )
  5       -0.1760           0.0801                           *(****    )

 


9.1.3. Model Fitting


The Model Fitting dialogue should only be used when the time series is considered stationary. The following guidelines can be used to help choose an ARIMA model to fit. It is often possible to try different models on the same data.

9.1.3.1. Seasonal and Nonseasonal Operators

Nonseasonal Operators (p and q)

The ACF has spikes at lags 1, 2, ..., r and cuts off after lag r, and the PACF dies down; use q = r and p = 0.

The ACF dies down and the PACF has spikes at lags 1, 2, ..., r and cuts off after lag r; use q = 0 and p = r.

The ACF has spikes at lags 1, 2, ..., r and cuts off after lag r, and the PACF has spikes at lags 1, 2, ..., s and cuts off after lag s (here s is a lag count, not the seasonal period); use q = r and p = s.

The ACF contains small autocorrelations at all lags and the PACF contains small autocorrelations at all lags; use q = 0 and p = 0.

The ACF dies down and the PACF dies down; use p = 1 and q = 1.
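The five guidelines above can be summarised as a small lookup. This is a sketch only: the function name and the way the behaviour is encoded (a cut-off lag, 0 when all correlations are small, or None when the function dies down) are assumptions made for illustration, and judging whether a function cuts off or dies down remains a matter of inspection.

```python
def suggest_orders(acf_cutoff, pacf_cutoff):
    """Return (p, q) from the behaviour of the ACF and the PACF.

    Each argument is the lag after which the function cuts off,
    0 if it is small at all lags, or None if it dies down.
    """
    if acf_cutoff is None and pacf_cutoff is None:
        return 1, 1                                      # both die down
    p = pacf_cutoff if pacf_cutoff is not None else 0    # PACF cut-off gives p
    q = acf_cutoff if acf_cutoff is not None else 0      # ACF cut-off gives q
    return p, q

print(suggest_orders(2, None))     # ACF cuts off after lag 2, PACF dies down: (0, 2)
print(suggest_orders(None, 3))     # ACF dies down, PACF cuts off after lag 3: (3, 0)
print(suggest_orders(None, None))  # both die down: (1, 1)
```

The suggestion is only a starting point; as noted above, it is often worth trying different models on the same data.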

Seasonal Operators (P and Q)

The previous guidelines apply to P and Q, but only consider autocorrelations at s, 2s, 3s, ... where s is the seasonal period.
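To look only at the seasonal lags, the relevant entries can be picked out of a list of correlations, as in this short sketch (the function name is hypothetical, and the list is assumed to start at lag 1).

```python
def seasonal_lags(values, s):
    """Return (lag, value) pairs at lags s, 2s, 3s, ... from a list starting at lag 1."""
    return [(k, values[k - 1]) for k in range(s, len(values) + 1, s)]

# Placeholder correlations for lags 1..25; only lags 12 and 24 are seasonal when s = 12.
placeholder = [0.0] * 25
print([lag for lag, _ in seasonal_lags(placeholder, 12)])  # [12, 24]
```

With a Maximum Lag of 25 and a seasonal period of 12, only two seasonal lags are available, so Maximum Lag may need to be increased to judge the seasonal behaviour.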

9.1.3.2. Model Fitting Parameters

The Model Fitting dialogue requires inputting the following parameters:

Overall Constant: If the overall constant value is non zero, then an overall constant is included in the model.

Nonseasonal AR Parameters: The number of nonseasonal AR parameters (p).

Nonseasonal MA Parameters: The number of nonseasonal MA parameters (q).

Seasonal AR Parameters: The number of seasonal AR parameters (P).

Seasonal MA Parameters: The number of seasonal MA parameters (Q).

Backforecasts: This is the number of backforecasts generated before the model is fitted. If this value is zero then no backforecasts are generated.

Maximum Number of Iterations: The maximum number of iterations is the number of iterations allowed before the model declares non convergence.

The model fitted is given by two equations:

   [Equations: two model equations, images not recovered]

The terms appearing in them are:

·      the overall constant.

·      the AR operator.

·      the MA operator.

·      the seasonal AR operator.

·      the seasonal MA operator.

·      the seasonal white noise.

·      the white noise, with an assumed distribution [not recovered].
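As a point of reference, one common way to write a two-stage multiplicative seasonal model with these seven terms is sketched here. All symbols and the placement of the constant are assumptions made for illustration and are not taken from the program's own equations; X(t) denotes the transformed series and s the seasonal period.

```latex
\Phi_P(B^{s})\,X_t = c + \Theta_Q(B^{s})\,e_t

\phi_p(B)\,e_t = \theta_q(B)\,a_t

\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^{q}

\Phi_P(B^{s}) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{Ps}, \qquad
\Theta_Q(B^{s}) = 1 - \Theta_1 B^{s} - \dots - \Theta_Q B^{Qs}
```

In this sketch c is the overall constant, e(t) plays the role of the seasonal white noise and a(t) that of the white noise. Sign conventions for the MA operators and the position of the constant vary between texts and implementations.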

The model is fitted by an iterative least squares method. The output options are accessed by selecting ARIMA Results in the dialogue that follows.

9.1.4. Model Output Options


When a model has been fitted, you will have the following output options:

Model Results: The number of iterations made and the transformation used are displayed. For each fitted parameter the estimated value, the standard error and the t‑value are displayed.

Parameter Covariance Matrix: A table of the covariances between the fitted parameters is displayed.

Parameter Correlation Matrix: A table of the correlations between the fitted parameters is displayed.

Plot of Residuals: This displays the residual values in a table and allows them to be saved back to the data matrix.

Residual Autocorrelation: This displays the Autocorrelation Function of the residuals. The residuals should be unrelated because the model should account for the relationship in the time series data. If the residuals are unrelated then the autocorrelations of the residuals should be small. The Ljung-Box statistic (see below) is a test of the residual autocorrelations.

Ljung-Box Statistic: This displays the Ljung-Box statistic, the degrees of freedom and the associated chi-square probabilities at various lags up to the maximum lag. The Ljung-Box statistic is a test of the relationship between the residuals. A large value indicates that the residuals are related, and hence that the model is inadequate.
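The usual definition of the Ljung-Box statistic at lag K is Q = n(n+2) times the sum over k = 1..K of r(k)^2 / (n - k), where r(k) are the residual autocorrelations and n is the number of residuals. A minimal sketch follows; the function name is hypothetical, and taking the degrees of freedom as K minus the number of fitted AR and MA parameters is a common convention that is assumed here rather than taken from the program.

```python
def ljung_box(residual_acf, n, n_params=0):
    """Return (lag, Q, degrees of freedom) for lags 1..len(residual_acf)."""
    rows, acc = [], 0.0
    for k, rk in enumerate(residual_acf, start=1):
        acc += rk * rk / (n - k)        # running sum of r(k)^2 / (n - k)
        q = n * (n + 2) * acc
        rows.append((k, q, k - n_params))
    return rows

# Hypothetical residual autocorrelations for a model with 4 AR/MA parameters.
for lag, q, df in ljung_box([0.05, -0.03, 0.02, 0.04, -0.06, 0.01], n=156, n_params=4):
    if df > 0:                          # the test is only defined for positive df
        print(lag, round(q, 4), df)
```

The chi-square probability for each row can then be read from a chi-square table or computed with a statistics library, for instance `scipy.stats.chi2.sf(q, df)` if SciPy is available.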

Example

Following the example in Differencing Output Options, select Fit Model to select an ARIMA model. On the Model Fitting dialogue enter:

·      1 Overall Constant (0 No, Else Yes)

·      3 Nonseasonal AR Parameters (P)

·      0 Nonseasonal MA Parameters (Q)

·      0 Seasonal AR Parameters (Ps)

·      1 Seasonal MA Parameters (Qs)

·      0 Backforecasts

·      200 Maximum Number of Iterations

·      0.0001 Tolerance

On the Model Output Options dialogue check only the Model Results box to obtain the following results:

ARIMA: Fit Model

Model Results

Transformation: X(t) = (1-B^12) log(Room Averages)

Parameter          Estimate   Std error   t ratio
Overall Constant    0.02699     0.01690    1.5976
(AR) P(1)           0.26089     0.06798    3.8380
(AR) P(2)           0.15688     0.06293    2.4929
(AR) P(3)          -0.23467     0.07074   -3.3175
(SMA) Qs(1)         0.51389     0.07311    7.0293

Number of Iterations = 148 (Converged)
Seasonal Period = 12

 

The numbers obtained here are not identical to the ones given in the book, though the general characteristics of the fitted models are similar. The estimation process is highly iterative, so results may differ from one implementation to another.

9.1.5. Forecasting

9.1.5.1. Forecasting Input Options


This dialogue requests the following parameters:

Number of Forecasts: This is the number of forecasts to be generated.

Forecast Origin (<0, Offset): The forecast origin determines the location in the series at which the forecasts will start. If you enter a positive value, this is used as the forecast origin. If 0 is entered, then the last point in the series is used as the origin. If a negative value is entered, then this value is used as an offset from the last point. For instance, -1 represents the penultimate point as the forecast origin.

Confidence Level: The forecasts will be given with confidence intervals at this level. The value must be greater than 0 and less than 1. Typical values are 0.95, 0.99 or 0.9.
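The Forecast Origin rule described above can be expressed as a short sketch. The function names are hypothetical; n is the number of points in the series.

```python
def resolve_origin(value, n):
    """Turn the Forecast Origin entry into a row number in a series of n points."""
    if value > 0:
        return value        # positive: used directly as the origin
    return n + value        # 0: last point; negative: offset from the last point

def forecast_rows(value, n, n_forecasts):
    """Row numbers of the forecasts generated from the resolved origin."""
    origin = resolve_origin(value, n)
    return list(range(origin + 1, origin + n_forecasts + 1))

print(resolve_origin(-1, 168))          # 167, the penultimate point
print(forecast_rows(168, 168, 24)[0])   # 169
print(forecast_rows(0, 168, 24)[-1])    # 192
```

For a series of 168 points, entering 168 or 0 therefore gives the same origin, and 24 forecasts cover rows 169 to 192.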

9.1.5.2. Forecasting Output Options


When the above parameters have been specified, the following output options will be available:

Forecast Table: A table of forecasts and confidence intervals from the given forecast origin is displayed.

Character Forecast Plot: A character plot of the original data, lead 1 forecasts and forecasts from the forecast origin is displayed.

Plot of Forecast: A graphical display of the original data, lead 1 forecasts and forecasts from the forecast origin is generated. These are the same values as the Character Forecast Plot.

Example

Following the example in Model Output Options dialogue select Forecasting. On the Forecasting Input Options dialogue enter:

·      24 Number of Forecasts

·      168 Forecast Origin (<0, Offset from last)

·      0.95 Confidence Level

Select the Forecast Table output option to obtain the following results:

ARIMA: Forecasting

Forecast Table: Room Averages

Forecasts with Origin at 168

Row     Forecast    Lower 95%    Upper 95%
169     840.0521     808.0632     873.3075
170     771.1056     741.7421     801.6315
171     777.0039     747.4158     807.7633
172     872.1992     838.9860     906.7271
173     858.4281     825.7393     892.4109
174     982.0313     944.6357    1020.9071
175    1154.9294    1110.9500    1200.6498
176    1181.2955    1136.3121    1228.0598
177     902.9520     868.5678     938.6974
178     903.5714     869.1636     939.3412
179     783.1983     753.3743     814.2029
180     892.2127     858.2375     927.5330
181     860.6177     823.8597     899.0157
182     794.2411     760.3182     829.6776
183     804.2494     769.8990     840.1324
184     903.2180     864.6406     943.5167
185     888.6314     850.6769     928.2792
186    1015.3943     972.0257    1060.6980
187    1193.5981    1142.6182    1246.8526
188    1220.5764    1168.4441    1275.0345
189     933.1099     893.2557     974.7422
190     933.8563     893.9703     975.5220
191     809.5330     774.9569     845.6517
192     922.2238     882.8345     963.3704

 
