Probabilistic Forecasting of NIÑO3.4 Using Statistical Models
contributed by Simon J. Mason
Scripps Institution of Oceanography, La Jolla, California
Forecasts of monthly NIÑO3.4 sea surface temperature anomalies with lead-times of up to 11 months were produced using predictive discriminant
analysis, canonical variate analysis, and four forms of generalized linear models. Full details of the models are to be published in the Journal of Climate
(Mason and Mimmack 2001). The forecast presented here represents an average of the forecast probabilities from these six statistical models. The first
five unrotated principal components of gridded monthly sea surface temperatures over the tropical Pacific (25°N-25°S, 110°E-70°W) were used as the only
predictors. Skilful forecasts of NIÑO3.4 sea surface temperature anomalies can be developed relatively simply using only prior temperatures in the
region as predictors (Barnston and Ropelewski 1992; Penland and Sardeshmukh 1995; Latif et al. 1998).
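The predictor extraction described above can be illustrated with a minimal, dependency-free sketch. It assumes the gridded field is held as a list of monthly rows of grid-point values, and it finds the leading unrotated principal components by power iteration with deflation. The function name `leading_components`, the data layout, and the omission of any area weighting are assumptions made for illustration; in practice a linear-algebra library would normally be used, and this is not necessarily how the models described here were built.

```python
import math
import random


def leading_components(data, n_components=5, n_iter=200, seed=0):
    """Return (patterns, scores) for the leading principal components.

    data: list of rows (one per time step), each a list of grid-point values.
    patterns[i] is a unit-length spatial pattern; scores[i] is its time series.
    """
    n_time, n_grid = len(data), len(data[0])
    # Remove the time mean at each grid point to form anomalies.
    means = [sum(row[j] for row in data) / n_time for j in range(n_grid)]
    resid = [[row[j] - means[j] for j in range(n_grid)] for row in data]
    rng = random.Random(seed)
    patterns, scores = [], []
    for _ in range(n_components):
        # Start from a random unit vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(n_grid)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(n_iter):
            # Power-iteration step: v <- X^T (X v), then normalise.
            xv = [sum(r[j] * v[j] for j in range(n_grid)) for r in resid]
            w = [sum(resid[t][j] * xv[t] for t in range(n_time))
                 for j in range(n_grid)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        # Project the anomalies onto the pattern to get the component scores.
        pc = [sum(r[j] * v[j] for j in range(n_grid)) for r in resid]
        patterns.append(v)
        scores.append(pc)
        # Deflate: remove this component before searching for the next one.
        resid = [[resid[t][j] - pc[t] * v[j] for j in range(n_grid)]
                 for t in range(n_time)]
    return patterns, scores
```

The score series returned for the first five components would then play the role of the predictors; the default `n_components=5` simply mirrors the number quoted in the text.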
Careful assessments of the operational levels of forecast skill have been made by using a retroactive forecast procedure: the models were trained over a
30-yr training period, and then used to produce 20 years of retroactive forecasts of monthly NIÑO3.4 sea surface temperature anomaly categories, with
the models being updated every five years. Five categories of anomalies were defined ranging from "La Niña", through "cool", "normal", "warm", to "El
Niño". Probabilities for each of the categories over the 20-yr retroactive period January 1981 to December 2000 were calculated. The training period
was initially set as 30 years (1951-80), and retroactive predictions for the following five years were then made using the optimal model. After this 5-yr
period the model was retrained over the period 1951-85, possibly selecting different variables and a different number of retained variables, and
predictions for 1986-90 were made. This procedure was repeated until a set of 20 years of retroactive predictions had been made. At each stage, the
definitions of the five categories were reset to ensure that the categories remained equi-probable a priori. While the categories are defined as
equi-probable over the training periods, this is not necessarily the case for the verifications over the independent periods. For 1981-85, the verifications
were categorized on the basis of the 1951-80 training period; the verifications for 1986-90 were categorized on the basis of the 1951-85 training period,
etc. For the forecasts presented here, the training period was 1951-2000, and anomalies are defined with reference to this same period.
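The retroactive procedure above can be sketched as follows. The expanding training windows follow the years given in the text. The category boundaries are taken here as quintiles of the training-period values, which is one way to obtain five categories that are equi-probable a priori; the interpolation convention and the function names (`retroactive_windows`, `quintile_thresholds`, `categorize`) are assumptions for illustration only.

```python
CATEGORIES = ["La Niña", "cool", "normal", "warm", "El Niño"]


def retroactive_windows(first_year=1951, first_train_end=1980,
                        last_year=2000, step=5):
    """List ((train_start, train_end), (fcst_start, fcst_end)) year pairs
    for a training window that grows by `step` years at each update."""
    windows = []
    train_end = first_train_end
    while train_end < last_year:
        fcst_end = min(train_end + step, last_year)
        windows.append(((first_year, train_end), (train_end + 1, fcst_end)))
        train_end = fcst_end
    return windows


def quintile_thresholds(training_values):
    """Four thresholds splitting the training values into five categories
    of (approximately) equal frequency."""
    ordered = sorted(training_values)
    n = len(ordered)
    thresholds = []
    for k in range(1, 5):
        # Linear interpolation between order statistics (one of several
        # common quantile conventions).
        pos = k * (n - 1) / 5
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        thresholds.append(ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo]))
    return thresholds


def categorize(value, thresholds):
    """Assign a value to a category using thresholds from the training period."""
    return CATEGORIES[sum(value > t for t in thresholds)]


for train, fcst in retroactive_windows():
    print("train %d-%d -> forecast %d-%d" % (train + fcst))
```

Because the thresholds are recomputed from each training window, a verification value is always categorized relative to the period that was available when the forecast would have been made.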
A combined forecast was calculated by averaging the forecast probabilities from the various models. No attempt was made to weight the probabilities
from the different models by a measure of model skill, since ranked model performance is sensitive to the precise skill measure used, and can be
conditional upon the actual outcome. Good reliability is demonstrated for forecasts of all categories except "normal" (Fig. 1).
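The unweighted combination described above amounts to a simple mean of the category probabilities across models. A minimal sketch follows; the function name `combine_forecasts` and the example probabilities are hypothetical and are not output from the six models.

```python
def combine_forecasts(model_probs):
    """Equal-weight average of categorical probabilities across models.

    model_probs: one probability vector per model, all of the same length.
    """
    n_models = len(model_probs)
    n_cats = len(model_probs[0])
    if any(len(p) != n_cats for p in model_probs):
        raise ValueError("all models must forecast the same categories")
    return [sum(p[k] for p in model_probs) / n_models for k in range(n_cats)]


# Hypothetical probabilities from two models for the five categories.
example = [
    [0.10, 0.50, 0.25, 0.10, 0.05],
    [0.20, 0.40, 0.25, 0.10, 0.05],
]
print(combine_forecasts(example))
```

If each input vector sums to one, the average does as well, so no renormalization is needed.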
Ranked probability skill scores (RPSSs) were calculated for each month separately, using the combined forecast probabilities, and comparing the
forecasts to a reference strategy of forecasting climatology. The scores for six months are shown in Fig. 2, where they are compared to the skill of persistence forecasts. The
seasonal dependence of skill is clearly apparent for both the model and the persistence forecasts. For forecasts of NIÑO3.4 from August there is positive
skill out to March of the following year, although persistence outscores the model at short lead-times.
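For readers unfamiliar with the score, the following is a minimal sketch of the ranked probability score and the corresponding skill score against climatology, using the standard cumulative-probability definition. The function names are hypothetical, and the optional division of the score by the number of categories minus one is omitted because it cancels in the skill score.

```python
def rps(probs, observed_index):
    """Ranked probability score for one forecast: the sum of squared
    differences between cumulative forecast and cumulative observed
    probabilities over the ordered categories."""
    cum_forecast = 0.0
    score = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_observed = 1.0 if k >= observed_index else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score


def rpss(forecasts, observations, reference=None):
    """Skill relative to a reference forecast (equal climatological
    probabilities by default): 1 is perfect, 0 matches the reference,
    and negative values are worse than the reference."""
    n_cats = len(forecasts[0])
    if reference is None:
        reference = [1.0 / n_cats] * n_cats
    rps_forecast = sum(rps(f, o) for f, o in zip(forecasts, observations))
    rps_reference = sum(rps(reference, o) for o in observations)
    return 1.0 - rps_forecast / rps_reference
```

A persistence forecast of anomaly categories can be scored with the same functions by assigning a probability of one to the persisted category.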
The forecast probabilities averaged across the six models are presented in the table below. Probabilities are highest for the "cool" category over most of
the forecast period, suggesting that there is a reasonably strong probability that the cool conditions that have recently developed in the eastern Pacific
will extend eastward over the next few months.
References
Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO using canonical correlation analysis. J. Climate, 5, 1316-1345.
Daan, H., 1985: Sensitivity of verification scores to the classification of the predictand. Mon. Wea. Rev., 113, 1384-1392.
Epstein, E. S., 1969b: A scoring system for probability forecasts of ranked categories. J. Appl. Meteor., 8, 985-987.
Huberty, C. J., 1994: Applied Discriminant Analysis. Wiley, 466 pp.
Mason, S. J., and G. M. Mimmack, 2001: Comparison of some statistical methods of probabilistic forecasting of ENSO. J. Climate, in press.
Penland, C., and P. D. Sardeshmukh, 1995: The optimal growth of tropical sea surface temperature anomalies. J. Climate, 8, 1999-2024.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.
Figure captions:
Fig. 1. Reliability diagram for retroactive combined forecasts at increasing lead-times of "La Niña" (solid thin line), "cool" (dashed thin line),
"normal" (dotted line), "warm" (dashed thick line), and "El Niño" (solid thick line) conditions for the 20-year period January 1981 - December 2000.
Forecasts at all lead-times and for all months are pooled. The histograms indicate the frequency of forecasts with probabilities in the ranges 0.0-0.05,
0.05-0.15, 0.15-0.25, …, 0.95-1.0. The y-axes range to 1700. The top histogram is for "El Niño" conditions, the second from the top for "warm" conditions, etc.
Fig. 2. Ranked probability skill scores for retroactive combined forecasts at increasing lead-times of monthly NIÑO3.4 sea surface temperature anomaly categories for the 20-year period January 1981 - December 2000. The skill scores are calculated with reference to a strategy of forecasting climatology. The black bars represent the scores for the models, and the dark (light) gray bars are for forecasts of persisted anomaly categories. The light gray bands indicate the April - May period, which approximates the "spring barrier" in predictability.
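The binning used for the reliability diagram in Fig. 1 can be sketched as below, assuming eleven bins centred on 0.0, 0.1, …, 1.0 as implied by the ranges in the caption. The function name `reliability_table` and the treatment of probabilities that fall exactly on a bin edge are assumptions for illustration.

```python
def reliability_table(probs, outcomes, n_bins=11):
    """Return (bin_centre, forecast_count, observed_relative_frequency)
    for each probability bin, for one forecast category.

    probs: forecast probabilities for the category.
    outcomes: True/1 where the category was observed, False/0 otherwise.
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    for p, occurred in zip(probs, outcomes):
        # Bins are centred on 0.0, 0.1, ..., 1.0; a value on a bin edge
        # is assigned to the higher bin here (an arbitrary convention).
        b = min(int(p * (n_bins - 1) + 0.5), n_bins - 1)
        counts[b] += 1
        hits[b] += 1 if occurred else 0
    rows = []
    for b in range(n_bins):
        freq = hits[b] / counts[b] if counts[b] else None
        rows.append((b / (n_bins - 1), counts[b], freq))
    return rows
```

The counts correspond to the histograms in the figure, and the observed relative frequencies plotted against the bin centres give the reliability curves; a well-calibrated forecast lies close to the diagonal.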
| Month    | Lead-time | La Niña | Cool  | Normal | Warm  | El Niño |
|----------|-----------|---------|-------|--------|-------|---------|
| Sep 2001 | 0         | 0.123   | 0.475 | 0.273  | 0.088 | 0.042   |
| Oct 2001 | 1         | 0.234   | 0.304 | 0.284  | 0.120 | 0.059   |
| Nov 2001 | 2         | 0.157   | 0.424 | 0.267  | 0.084 | 0.068   |
| Dec 2001 | 3         | 0.179   | 0.442 | 0.247  | 0.087 | 0.044   |
| Jan 2002 | 4         | 0.251   | 0.410 | 0.236  | 0.060 | 0.043   |
| Feb 2002 | 5         | 0.255   | 0.357 | 0.269  | 0.070 | 0.048   |
| Mar 2002 | 6         | 0.253   | 0.327 | 0.205  | 0.155 | 0.060   |
| Apr 2002 | 7         | 0.338   | 0.282 | 0.172  | 0.114 | 0.093   |
| May 2002 | 8         | 0.297   | 0.284 | 0.145  | 0.214 | 0.060   |
| Jun 2002 | 9         | 0.253   | 0.239 | 0.170  | 0.211 | 0.127   |
| Jul 2002 | 10        | 0.231   | 0.283 | 0.207  | 0.171 | 0.119   |
| Aug 2002 | 11        | 0.200   | 0.297 | 0.192  | 0.160 | 0.152   |
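The probabilities in the table above can be summarized with a short sketch that picks the most probable category for each month. The values are copied from the table; the variable and function names are hypothetical.

```python
CATEGORIES = ["La Niña", "cool", "normal", "warm", "El Niño"]

# Combined forecast probabilities, copied from the table (lead-times 0-11).
FORECAST = {
    "Sep 2001": [0.123, 0.475, 0.273, 0.088, 0.042],
    "Oct 2001": [0.234, 0.304, 0.284, 0.120, 0.059],
    "Nov 2001": [0.157, 0.424, 0.267, 0.084, 0.068],
    "Dec 2001": [0.179, 0.442, 0.247, 0.087, 0.044],
    "Jan 2002": [0.251, 0.410, 0.236, 0.060, 0.043],
    "Feb 2002": [0.255, 0.357, 0.269, 0.070, 0.048],
    "Mar 2002": [0.253, 0.327, 0.205, 0.155, 0.060],
    "Apr 2002": [0.338, 0.282, 0.172, 0.114, 0.093],
    "May 2002": [0.297, 0.284, 0.145, 0.214, 0.060],
    "Jun 2002": [0.253, 0.239, 0.170, 0.211, 0.127],
    "Jul 2002": [0.231, 0.283, 0.207, 0.171, 0.119],
    "Aug 2002": [0.200, 0.297, 0.192, 0.160, 0.152],
}


def most_likely(probs):
    """Name of the category with the highest forecast probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best]


modal = {month: most_likely(p) for month, p in FORECAST.items()}
cool_months = [month for month, cat in modal.items() if cat == "cool"]

for month, cat in modal.items():
    print(month, cat, round(sum(FORECAST[month]), 3))
print(len(cool_months), "of", len(FORECAST), "months have 'cool' as the modal category")
```

With these values "cool" is the most probable category in nine of the twelve months, and "La Niña" is most probable for April to June 2002. The printed row sums are close to one for every month; the July 2002 row, at 1.011, departs the most.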