Probabilistic Forecasting of NIÑO3.4 Using Statistical Models

contributed by Simon J. Mason

Scripps Institution of Oceanography, La Jolla, California

Forecasts of monthly NIÑO3.4 sea surface temperature anomalies with lead-times of up to 11 months were produced using predictive discriminant analysis, canonical variate analysis, and four forms of generalized linear models. Full details of the models are to be published in Journal of Climate (Mason and Mimmack 2001). The forecast presented here is an average of the forecast probabilities from these six statistical models. The first five unrotated principal components of gridded monthly sea surface temperatures over the tropical Pacific (25°N-25°S, 110°E-70°W) were used as the only predictors. Skilful forecasts of NIÑO3.4 sea surface temperature anomalies can be developed relatively simply using only prior temperatures in the region as predictors (Barnston and Ropelewski 1992; Penland and Sardeshmukh 1995; Latif et al. 1998).
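As an illustration of how predictors of this kind could be derived, the following minimal sketch computes the leading unrotated principal components of a gridded field with a singular value decomposition. The function name leading_pcs, the array layout, and the synthetic data are assumptions made for this example; the source does not describe how the fields were preprocessed (for instance, whether a monthly climatology was removed), so only the time mean is subtracted here.

```python
import numpy as np

def leading_pcs(sst, n_components=5):
    """Return the leading unrotated principal component time series.

    sst: array of shape (n_months, n_gridpoints); a hypothetical layout.
    """
    # Remove the time mean at each grid point (assumed preprocessing).
    anomalies = sst - sst.mean(axis=0)
    # SVD of the anomaly matrix: columns of u are normalized PC time series.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    # Scale by the singular values to obtain PC scores, one column each.
    return u[:, :n_components] * s[:n_components]

# Synthetic stand-in for real gridded sea surface temperatures.
rng = np.random.default_rng(0)
sst = rng.normal(size=(120, 40))
pcs = leading_pcs(sst)
```

The columns of pcs are mutually uncorrelated and ordered by decreasing variance, which is what makes a small number of them convenient as predictors.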

Operational levels of forecast skill were assessed carefully using a retroactive forecast procedure: the models were trained over an initial 30-yr period and then used to produce 20 years of retroactive forecasts of monthly NIÑO3.4 sea surface temperature anomaly categories, with the models being updated every five years. Five categories of anomalies were defined, ranging from "La Niña", through "cool", "normal", and "warm", to "El Niño". Probabilities for each of the categories were calculated over the 20-yr retroactive period January 1981 to December 2000.

The training period was initially set to 30 years (1951-80), and retroactive predictions for the following five years (1981-85) were made using the optimal model. The model was then retrained over the period 1951-85, possibly selecting different variables and a different number of retained variables, and predictions for 1986-90 were made. This procedure was repeated until 20 years of retroactive predictions had been made.

At each stage, the definitions of the five categories were reset to ensure that the categories remained equi-probable a priori. While the categories are equi-probable over the training periods, this is not necessarily the case for the verifications over the independent period. For 1981-85, the verifications were categorized on the basis of the 1951-80 training period; the verifications for 1986-90 were categorized on the basis of the 1951-85 training period, and so on. For the forecasts presented here, the training period was 1951-2000, and anomalies are defined with reference to this same period.
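The retraining schedule and the equi-probable category definitions described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the use of empirical quintiles of the training-period anomalies as category boundaries is an assumption consistent with five categories that are equi-probable a priori.

```python
import numpy as np

CATEGORIES = ["La Niña", "cool", "normal", "warm", "El Niño"]

def retroactive_windows(first_train=1951, first_forecast=1981,
                        last_forecast=2000, step=5):
    """List (train_start, train_end, forecast_start, forecast_end) years."""
    windows = []
    start = first_forecast
    while start <= last_forecast:
        end = min(start + step - 1, last_forecast)
        # The training period always begins in first_train and grows each step.
        windows.append((first_train, start - 1, start, end))
        start += step
    return windows

def quintile_thresholds(training_anomalies):
    """Boundaries that split the training anomalies into five equal parts."""
    return np.quantile(training_anomalies, [0.2, 0.4, 0.6, 0.8])

def categorize(anomalies, thresholds):
    """Map anomalies to category indices 0 (La Niña) to 4 (El Niño)."""
    return np.searchsorted(thresholds, anomalies, side="right")

windows = retroactive_windows()
# Synthetic training anomalies, used only to demonstrate the categorization.
training = np.arange(100, dtype=float)
thresholds = quintile_thresholds(training)
training_categories = categorize(training, thresholds)
```

Because the thresholds are recomputed from each training window, verification values in the following five years need not fall into the five categories with equal frequency.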

A combined forecast was calculated by averaging the forecast probabilities from the various models. No attempt was made to weight the probabilities from the different models by a measure of model skill, since ranked model performance is sensitive to the precise skill measure used, and can be conditional upon the actual outcome. Good reliability is demonstrated for forecasts of all categories except "normal" (Fig. 1).
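The unweighted combination amounts to a simple mean over models for each category. The sketch below assumes the model probabilities are stored as one row per model; the function name combine_forecasts and the three rows of example probabilities are invented for illustration and are not output from the models described here.

```python
import numpy as np

def combine_forecasts(model_probs):
    """Average category probabilities across models with equal weights.

    model_probs: shape (n_models, n_categories); each row should sum to 1.
    """
    model_probs = np.asarray(model_probs, dtype=float)
    if not np.allclose(model_probs.sum(axis=1), 1.0, atol=1e-6):
        raise ValueError("each model's probabilities must sum to 1")
    return model_probs.mean(axis=0)

# Made-up probabilities for three hypothetical models and five categories.
example_probs = [
    [0.10, 0.50, 0.25, 0.10, 0.05],
    [0.20, 0.40, 0.25, 0.10, 0.05],
    [0.15, 0.45, 0.25, 0.10, 0.05],
]
combined = combine_forecasts(example_probs)
```

An equal-weight mean of valid probability vectors is itself a valid probability vector, so no renormalization is needed.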

Ranked probability skill scores (RPSSs) were calculated for each month separately from the combined forecast probabilities, using a strategy of forecasting climatology as the reference. The scores for six months are shown in Fig. 2, where they are compared to the skill of persistence forecasts. The seasonal dependence of skill is clearly apparent for both the model and the persistence forecasts. For forecasts of NIÑO3.4 from August there is positive skill out to March of the following year, although persistence outscores the model at short lead-times.
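For reference, a common form of the ranked probability skill score can be sketched as below. The ranked probability score is taken here as the sum of squared differences between cumulative forecast and cumulative observed probabilities, and the reference forecast assigns a probability of 0.2 to each of the five categories. The function names are hypothetical, and the exact formulation used for Fig. 2 is not given in the text, so this should be read as one standard definition rather than the authors' implementation.

```python
import numpy as np

def rps(probs, observed):
    """Ranked probability score for each forecast.

    probs: shape (n_forecasts, n_categories); observed: category indices.
    """
    probs = np.asarray(probs, dtype=float)
    obs = np.zeros_like(probs)
    obs[np.arange(len(probs)), observed] = 1.0   # one-hot observed category
    diff = np.cumsum(probs, axis=1) - np.cumsum(obs, axis=1)
    return (diff ** 2).sum(axis=1)

def rpss(probs, observed, n_categories=5):
    """Skill relative to equi-probable climatological forecasts."""
    clim = np.full((len(observed), n_categories), 1.0 / n_categories)
    return 1.0 - rps(probs, observed).mean() / rps(clim, observed).mean()

# Synthetic check: perfect forecasts score 1, climatology scores 0.
observed = np.array([0, 1, 2, 3, 4])
perfect_score = rpss(np.eye(5)[observed], observed)
climatology_score = rpss(np.full((5, 5), 0.2), observed)
```

Positive values indicate forecasts that outperform climatology, and negative values indicate forecasts that do worse.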

The forecast probabilities averaged across the six models are presented in the table below. Probabilities are highest for the "cool" category over most of the forecast period, suggesting that there is a reasonably strong probability that the cool conditions that have recently developed in the eastern Pacific will extend eastward over the next few months.



References:

Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO using canonical correlation analysis. J. Climate, 5, 1316-1345.

Daan, H., 1985: Sensitivity of verification scores to the classification of the predictand. Mon. Wea. Rev., 113, 1384-1392.

Epstein, E. S., 1969b: A scoring system for probability forecasts of ranked categories. J. Appl. Meteor., 8, 985-987.

Huberty, C. J., 1994: Applied Discriminant Analysis. Wiley, 466 pp.

Mason, S. J., and G. M. Mimmack, 2001: Comparison of some statistical methods of probabilistic forecasting of ENSO. J. Climate, in press.

Penland, C., and P. D. Sardeshmukh, 1995: The optimal growth of tropical sea surface temperature anomalies. J. Climate, 8, 1999-2024.

Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.

Figure captions:

Fig. 1. Reliability diagram for retroactive combined forecasts at increasing lead-times of "La Niña" (solid thin line), "cool" (dashed thin line), "normal" (dotted line), "warm" (dashed thick line), and "El Niño" (solid thick line) conditions for the 20-year period January 1981 - December 2000. Forecasts at all lead-times and for all months are pooled. The histograms indicate the frequency of forecasts with probabilities in the ranges 0.0-0.05, 0.05-0.15, 0.15-0.25, …, 0.95-1.0. The y-axes range to 1700. The top histogram is for "El Niño" conditions, the second from the top for "warm" conditions, etc.

Fig. 2. Ranked probability skill scores for retroactive combined forecasts at increasing lead-times of monthly NIÑO3.4 sea surface temperature anomaly categories for the 20-year period January 1981 - December 2000. The skill scores are calculated with reference to a strategy of forecasting climatology. The black bars represent the scores for the models, and the dark (light) gray bars are for forecasts of persisted anomaly categories. The light gray bands indicate the April - May period, which approximates the "spring barrier" in predictability.

Month       Lead-time (months)   La Niña   Cool    Normal   Warm    El Niño
Sep 2001            0            0.123     0.475   0.273    0.088   0.042
Oct 2001            1            0.234     0.304   0.284    0.120   0.059
Nov 2001            2            0.157     0.424   0.267    0.084   0.068
Dec 2001            3            0.179     0.442   0.247    0.087   0.044
Jan 2002            4            0.251     0.410   0.236    0.060   0.043
Feb 2002            5            0.255     0.357   0.269    0.070   0.048
Mar 2002            6            0.253     0.327   0.205    0.155   0.060
Apr 2002            7            0.338     0.282   0.172    0.114   0.093
May 2002            8            0.297     0.284   0.145    0.214   0.060
Jun 2002            9            0.253     0.239   0.170    0.211   0.127
Jul 2002           10            0.231     0.283   0.207    0.171   0.119
Aug 2002           11            0.200     0.297   0.192    0.160   0.152