Neural Network Model Forecasts of the NINO3.4 Sea Surface Temperature
contributed by Benyang Tang *, William W. Hsieh * and Fredolin T. Tangang +
*Department of Earth and Ocean Sciences, University of British Columbia, Vancouver, B.C., Canada
+Department of Marine Science, Faculty of Science & Natural Resources,
National University of Malaysia
We present a neural network model for forecasting the NINO3.4 sea surface temperature index. We begin with a brief description of the model. To increase forecast skill and stability, we implemented a procedure called bootstrap aggregating, or bagging (Breiman 1996). Bagging works as follows. First, training pairs are formed, each consisting of the data at the initial time and the forecast target a given number of months (the lead time) later. The available training pairs are separated into a training set and a test set; the test set is reserved for testing only and is not used for training. The training set is used to generate an ensemble of neural network models, each member of which is trained on only a subset of the training set. The subset is drawn at random with replacement from the training set and has the same number of training pairs as the training set. As a result, some pairs appear more than once in the subset, and about 37% of the pairs in the training set are absent from it (the probability that a given pair is never drawn in N draws is (1 - 1/N)^N, which approaches 1/e, or about 0.37, for large N). The final model output is the average of the outputs from all members of the ensemble.
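The bagging procedure described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function names, the synthetic data, and the linear least-squares learner (used here as a stand-in for the neural network) are hypothetical and are not the code used for the forecasts.

```python
import numpy as np

def fit_member(X, y):
    """Stand-in learner (linear least squares with an intercept).
    The article trains a neural network at this step instead."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xnew: np.column_stack([Xnew, np.ones(len(Xnew))]) @ coef

def bagging_ensemble(X, y, n_members=30, seed=0):
    """Train n_members models, each on a bootstrap subset of the training set."""
    rng = np.random.default_rng(seed)
    n = len(X)
    members, absent_fraction = [], []
    for _ in range(n_members):
        # Same size as the training set, drawn at random with replacement.
        idx = rng.integers(0, n, size=n)
        members.append(fit_member(X[idx], y[idx]))
        # Fraction of training pairs that never appear in this subset.
        absent_fraction.append(1.0 - np.unique(idx).size / n)
    return members, float(np.mean(absent_fraction))

def bagged_forecast(members, Xnew):
    """Final output: the average of the outputs of all ensemble members."""
    return np.mean([m(Xnew) for m in members], axis=0)

# Synthetic training pairs (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + 0.3 * rng.normal(size=400)

members, absent = bagging_ensemble(X, y)
forecast = bagged_forecast(members, X[:5])
print(f"mean fraction of pairs absent from a subset: {absent:.3f}")
```

The printed fraction should come out close to 1/e, in line with the 37% figure quoted above.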
The advantage of bagging is that it reduces the variance, or instability, of the neural network. The error surface of neural network training is full of local minima; training runs with different initial weights and training data usually become trapped in different local minima. These local minima partly reflect fitting to the regularities of the data and partly fitting to the noise in the data. Bagging tends to cancel the noise part, which varies among the ensemble members, and to retain the fit to the regularities of the data. The ensemble in our bagging neural network model has 30 members.
The neural networks were trained with the NINO3.4 index (calculated from the NOAA gridded SST data, ftp://nic.fb4.noaa.gov/pub/ocean/clim1/) and the COADS monthly sea level pressure (SLP) data of the tropical Pacific Ocean (Woodruff et al. 1987, http://www.cdc.noaa.gov/coads/). The gridded SLP data and the NINO3.4 index were first averaged over 3-month periods, and the SLP data were then reduced to a few EOF modes.
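The preprocessing described above (3-month averaging followed by EOF reduction) could look roughly like the following sketch. Several details are assumptions, since the text does not specify them: the averaging is implemented as a running mean over three consecutive months, the EOFs are computed by a singular value decomposition of the time-mean anomalies without area weighting, and the SLP field is random synthetic data standing in for the gridded COADS data.

```python
import numpy as np

def three_month_mean(x):
    """Running mean over 3 consecutive months along axis 0.
    The output has 2 fewer time steps than the input."""
    return (x[:-2] + x[1:-1] + x[2:]) / 3.0

def leading_eofs(field, n_modes):
    """Leading EOF modes of a (time, space) field.
    Returns (patterns, time_series) with shapes (n_modes, space) and (time, n_modes)."""
    anomalies = field - field.mean(axis=0)          # remove the time mean at each grid point
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    time_series = u[:, :n_modes] * s[:n_modes]      # EOF time series (principal components)
    return vt[:n_modes], time_series

# Synthetic stand-in for gridded monthly SLP: 120 months, 50 grid points.
rng = np.random.default_rng(0)
slp = rng.normal(size=(120, 50))

slp3 = three_month_mean(slp)
patterns, pcs = leading_eofs(slp3, n_modes=7)
print(slp3.shape, patterns.shape, pcs.shape)
```

Each EOF time series is the projection of the anomaly field onto the corresponding spatial pattern, so a few columns of `pcs` replace the full grid as network inputs.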
Three of the first seven EOF time series of the SLP data show considerable variability on decadal or longer time scales, with the third showing a linear trend. We found that removing these trends significantly degrades the forecast skills at lead times beyond 6 months. This result surprised us, as the data contain only three or fewer cycles of these long-term variations, yet the NN models seem able to capture their patterns.
The neural networks in this forecast have 31 inputs and 5 hidden neurons. The NINO3.4 index and the first 7 SLP EOF time series at the initial month form the first 8 inputs. The same 7 EOF time series at 3, 6 and 9 months before the initial month supply a further 21 inputs. The last 2 inputs are a sine and a cosine with a 12-month period, which indicate the phase of the annual cycle.
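As a check on the input count, the 31-element input vector can be assembled as in the sketch below. The function name, the array layout, and the convention that the annual phase is zero in January are assumptions made for illustration; the article fixes only the composition of the inputs, and the ordering used here follows the description above.

```python
import numpy as np

def build_input(nino34, pcs, t, month):
    """Assemble one 31-element input vector.

    nino34 : (T,) 3-month-averaged NINO3.4 index
    pcs    : (T, 7) first 7 SLP EOF time series
    t      : time index of the initial month (must be >= 9)
    month  : calendar month of the initial month, 1..12
    """
    if t < 9:
        raise ValueError("need at least 9 months of history before the initial month")
    # EOF time series at the initial month and 3, 6 and 9 months earlier: 4 x 7 = 28 values.
    eof_part = np.concatenate([pcs[t - lag] for lag in (0, 3, 6, 9)])
    # Phase of the annual cycle (assumed convention: zero phase in January).
    phase = 2.0 * np.pi * (month - 1) / 12.0
    return np.concatenate([[nino34[t]], eof_part, [np.sin(phase), np.cos(phase)]])

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
nino34 = rng.normal(size=60)
pcs = rng.normal(size=(60, 7))

x = build_input(nino34, pcs, t=20, month=9)
print(x.shape)  # 1 + 28 + 2 = 31 inputs
```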
We designed a cross-validation scheme to estimate the forecast skill of the NN models. For each lead time from 3 months to 18 months and for each given year from 1950 to 1997, data from a window of 5 years, starting from January of the given year, were withheld. Training pairs were composed from the remaining data. After the model was trained, 12 forecasts were made, initialized from 12 consecutive months starting from November of the year before the given year. Then the 5-year window was moved forward by 12 months and the procedure was repeated. The 5-year window was chosen to balance two needs: keeping the training data from influencing the test data, and using the available data efficiently. The forecast target month is at least 24 months away from the training data that follow. The overlap between the forecast input data and the training data is legitimate, as it also occurs in real-time forecasting.
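The bookkeeping of the moving 5-year window can be illustrated with the short sketch below. It only enumerates the withheld window and the 12 initial months for each given year; the helper names and the month-index convention (year * 12 + month - 1) are hypothetical, and the rule for deciding which training pairs count as falling inside the window is not reproduced here because the text does not spell it out.

```python
def withheld_window(year, window_years=5):
    """First and last month index of the withheld window,
    which starts in January of the given year."""
    start = year * 12
    return start, start + window_years * 12 - 1

def init_months(year):
    """Month indices of the 12 consecutive initial months,
    starting from November of the year before the given year."""
    start = (year - 1) * 12 + 10
    return [start + k for k in range(12)]

def to_year_month(index):
    """Convert a month index back to (year, calendar month 1..12)."""
    year, month0 = divmod(index, 12)
    return year, month0 + 1

given_years = list(range(1950, 1998))   # 1950 to 1997, window moved by 12 months each time

first, last = withheld_window(1950)
inits = init_months(1950)
print("withheld:", to_year_month(first), "to", to_year_month(last))
print("initial months:", to_year_month(inits[0]), "to", to_year_month(inits[-1]))
```

For the given year 1950, this prints a withheld window from January 1950 to December 1954 and initial months from November 1949 to October 1950.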
All the forecasts for a given lead time were collected to form a continuous forecast NINO3.4 time series. The two panels of Fig. 1 compare the forecast NINO3.4 at lead times of 6 and 12 months with the observed NINO3.4. Correlation coefficients between the forecast and observed NINO3.4 were calculated as skill measures for each decade from the 1950s to the 1990s, and for the whole period. The results are listed in Table 1.
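The skill calculation described above amounts to a Pearson correlation over each test period. A minimal sketch follows, using synthetic series in place of the actual forecast and observed NINO3.4; the function names and the data are hypothetical, so the printed numbers have no relation to the values in Table 1.

```python
import numpy as np

def correlation_skill(forecast, observed):
    """Pearson correlation coefficient between forecast and observed series."""
    return float(np.corrcoef(forecast, observed)[0, 1])

def skill_by_period(years, forecast, observed, periods):
    """Correlation skill for each (first_year, last_year) test period."""
    skills = {}
    for first, last in periods:
        mask = (years >= first) & (years <= last)
        skills[(first, last)] = correlation_skill(forecast[mask], observed[mask])
    return skills

# Synthetic monthly series for 1950-1997 (illustration only).
years = np.repeat(np.arange(1950, 1998), 12)
rng = np.random.default_rng(0)
observed = rng.normal(size=years.size)
forecast = observed + 0.5 * rng.normal(size=years.size)

periods = [(1950, 1959), (1960, 1969), (1970, 1979),
           (1980, 1989), (1990, 1997), (1950, 1997)]
skills = skill_by_period(years, forecast, observed, periods)
for (first, last), r in skills.items():
    print(f"{first}-{last}: {r:.2f}")
```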
Fig. 2 shows the forecasts at lead times of 3, 6, 9, and 12 months, using data up to March 1998. The forecasts indicate that the current warm condition has peaked and will taper off in the coming months. The 9- and 12-month forecasts also indicate that a La Nina event (cold condition in the tropical Pacific) will arrive by the end of 1998.
Table 1. The test correlation skills for different test periods
| Test Period | 3-month | 6-month | 9-month | 12-month | 15-month |
|-------------|---------|---------|---------|----------|----------|
| 1950-59 | 0.77 | 0.54 | 0.40 | 0.36 | 0.06 |
| 1960-69 | 0.81 | 0.66 | 0.59 | 0.52 | 0.43 |
| 1970-79 | 0.92 | 0.78 | 0.71 | 0.66 | 0.52 |
| 1980-89 | 0.87 | 0.76 | 0.67 | 0.71 | 0.78 |
| 1990-97 | 0.88 | 0.71 | 0.53 | 0.53 | 0.53 |
| 1950-97 | 0.86 | 0.69 | 0.57 | 0.52 | 0.42 |
References:
Breiman, L., 1996: Bagging predictors. Machine Learning, 24, 123-140. Available at ftp://stat.berkeley.edu/users/pub/breiman
Tang, B., 1995: Periods of linear development of the ENSO cycle and POP forecast experiments. J. Climate, 8, 682-691.
Tangang, F.T., W.W. Hsieh and B. Tang, 1997: Forecasting the equatorial Pacific sea surface temperatures by neural network models. Climate Dynamics, 13, 135-147.
Tangang, F.T., W.W. Hsieh and B. Tang, 1997: Forecasting regional sea surface temperatures in the tropical Pacific by neural network models, with wind stress and sea level pressure as predictors. J. Geophys. Res., in press.
Tangang, F.T., B. Tang, W.W. Hsieh and A. Monahan, 1997: Forecasting ENSO events: a neural network approach, J. Climate, in press.
Woodruff, S.D., R.J. Slutz, R.L. Jenne, and P.M. Steurer, 1987: A comprehensive ocean-atmosphere data set. Bull. Am. Meteorol. Soc., 68, 1239-1250.
Figure captions
Figure 1. Output of the cross-validation forecasting experiments at lead times of 6 months and 12 months.
Figure 2. Forecasts of the neural networks at lead times of 3, 6, 9, and 12 months.