Multiple Regression and Discriminant Analysis to Predict Mar-Apr-May-Jun 2000 Rainfall in Northeast Brazil



contributed by Larry Greischar and Stefan Hastenrath

University of Wisconsin, Madison, Wisconsin





In the approach used at the University of Wisconsin to forecast March-to-June precipitation in the Nordeste, several predictors are used with stepwise multiple linear regression, linear discriminant analysis, and neural networks. The predictand is based on 27 selected stations in the Nordeste (Hastenrath and Greischar, 1993), shown in Fig. 1. The forecasts shown here are made at one month lead, i.e., using data no later than January 2000. The potential predictors of March-June Nordeste rainfall (not all of which are necessarily used in a given prediction model) are listed in Table 1 along with their correlations with the predictand over the training period, 1921-57. Correlation coefficients are in hundredths, with one or two asterisks indicating significance at the 5% and 1% levels, respectively.
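As a rough check on the asterisks in Table 1, the significance of a correlation r over n years can be gauged with a two-sided Student t test, t = |r| sqrt(n-2) / sqrt(1-r^2). The sketch below assumes n = 37 independent years (1921-57) and hard-codes approximate critical t values for 35 degrees of freedom. The function name, the critical values, and the independence assumption are illustrative; the test actually used for Table 1 is not stated here and may differ (for example, by allowing for persistence in the series).

```python
import math

N_YEARS = 37          # training period 1921-57, assumed independent years
# Approximate two-sided critical t values for N_YEARS - 2 = 35 degrees
# of freedom; assumed for this sketch, not taken from the source.
T_CRIT_5PCT = 2.030
T_CRIT_1PCT = 2.724


def significance_stars(r_hundredths):
    """Return '', '*' or '**' for a correlation given in hundredths."""
    r = r_hundredths / 100.0
    t = abs(r) * math.sqrt(N_YEARS - 2) / math.sqrt(1.0 - r * r)
    if t >= T_CRIT_1PCT:
        return "**"
    if t >= T_CRIT_5PCT:
        return "*"
    return ""


# Correlations from Table 1, predictors (1) to (5), in hundredths.
for predictor, r in enumerate([55, -35, -11, -57, -70], start=1):
    print(predictor, r, significance_stars(r))
```

Under these assumptions the markers agree with Table 1: two asterisks for predictors (1), (4), and (5), one for (2), and none for (3).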

As shown in Table 1 and described in Hastenrath and Greischar (1993), March-to-June rainfall in the Nordeste is correlated positively with the first predictor and negatively with all the others. For this pre-season, the predictor values have been:

(1) pre-season rain not known, (2) negative meridional wind index (i.e., northerly wind anomaly), (3) negative Pacific SST index, (4) negative tropical Atlantic SST index (i.e., cold/warm SST anomalies to the north/south of the Atlantic equator), and (5) negative tropical Atlantic SST index for Nov-Dec-Jan. The meridional gradient of Atlantic SST, the meridional component of surface wind, the Pacific SST, and the Atlantic SST index for Nov-Dec-Jan all indicate a wetter than normal March-June 2000.

Table 2 shows skill evaluations for eight prediction models. The first seven use stepwise multiple linear regression with different combinations of the predictors listed above; the eighth applies a neural network to the same predictors as model 5.

The only model used (SMR 7) indicates above-average March-June Nordeste precipitation for 2000. The 1912-56 historical mean and standard deviation for the 27 stations are 500 and 200 mm, respectively.
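For orientation, a standardized anomaly can be converted to an approximate rainfall amount as mean plus anomaly times standard deviation. The sketch below applies this to the +0.86 forecast of model 7 in Table 2, assuming that the anomaly is standardized with the 500 mm mean and 200 mm standard deviation quoted above. That assumption and the function name are illustrative.

```python
def anomaly_to_mm(z, mean_mm=500.0, sd_mm=200.0):
    """Convert a standardized anomaly to a rainfall amount in mm."""
    return mean_mm + z * sd_mm


# Model 7 forecast from Table 2 (standardized anomaly +0.86).
print(round(anomaly_to_mm(0.86)))
```

Under this assumption the forecast corresponds to about 672 mm, roughly 170 mm above the historical mean.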

Using the same predictors, Nordeste rainfall was also predicted using linear discriminant analysis, in which five equiprobable categories of rainfall amount are defined and associated with predictor values using Bayes' theorem. In an individual forecast, each of the categories is assigned a probability, given the pre-season values of the predictors. Table 11-2 in the March 1995 issue of this Bulletin shows, for predictor models 5 and 7, the five-by-five verification matrices obtained over the 1958-89 period using the earlier years to develop the models. The hit rates are 0.34 and 0.38, corresponding to Heidke skills of 0.18 and 0.22 and expected correlation skills of about 0.55 and 0.65. The quintile probability forecast for Mar-Apr-May-Jun 2000 using prediction model 7 is listed in Table 3. Model 7 assigns the highest probability to the wet quintile (Q4), i.e., a wetter than average 2000.
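The Heidke skills quoted above can be related to the hit rates by the usual form HSS = (H - E) / (1 - E), where H is the hit rate and E is the hit rate expected by chance. The sketch below assumes E = 1/5 for five equiprobable categories; the function name is illustrative. Because the quoted hit rates are themselves rounded, the results only approximately reproduce the 0.18 and 0.22 given in the text.

```python
def heidke_skill(hit_rate, n_categories=5):
    """Heidke skill score for equiprobable categories."""
    expected = 1.0 / n_categories  # hit rate expected by chance
    return (hit_rate - expected) / (1.0 - expected)


# Hit rates quoted in the text for predictor models 5 and 7.
for model, hit_rate in [(5, 0.34), (7, 0.38)]:
    print(model, hit_rate, heidke_skill(hit_rate))
```

A hit rate equal to the chance value of 0.20 gives a skill of zero, and a perfect hit rate gives a skill of one.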

In summary, the meridional gradient of Atlantic SST, the meridional component of surface wind, the Pacific SST, and the Atlantic SST for Nov-Dec-Jan point to above-average March-June 2000 precipitation. Quantitatively, based on the one SMR and one LDA model, we predict above-average precipitation for the 2000 rainy season (MJ index +0.86, comparable to the years 1973, 1975, 1988, 1995, and 1996).

Acknowledgments: This prediction exercise by LG and SH at the University of Wisconsin relied on various real-time data sources. Andrew Colman, U.K. Meteorological Office, Bracknell, dispatched November-December 1999 and January 2000 SST data of the tropical Atlantic; NMC wind data of the tropical Atlantic and SST data of the equatorial Pacific for January 2000 were obtained from NOAA-CDC, Boulder, Colorado; and OLR data were obtained from NOAA-CPC, Camp Springs, Maryland. All of these contributions were crucial to the timely issue of the forecast.



References:

Hastenrath, S. and L. Greischar, 1993: Further work on the prediction of northeast Brazil rainfall anomalies. J. Climate, 6, 743-758.



Table 1
Predictor                                                            Correlation
(1) Oct-Jan precipitation at the 27 predictand stations.             +55**
(2) An index of Jan meridional surface wind component over the
    tropical Atlantic, 30N-30S.                                      -35*
(3) An index of Jan SST in the equatorial Pacific.                   -11
(4) An index of Jan SST in the tropical Atlantic, 30N-30S.           -57**
(5) An index of Nov-Dec-Jan SST in the tropical Atlantic, 30N-30S.   -70**



Table 2 Skill (% Variance Explained)
Model    Predictors         Training   Forecast   Forecast   Rainfall Forecast
#/Type   Used               Period     Period 1   Period 2   Mar-Apr-May-Jun
                            1921-57    1958-89    1968-89    2000
1 SMR    (1)                30         35         49
2 SMR    (1),(2)            38         49         69
3 SMR    (1),(4)            49         52         66
4 SMR    (1),(2),(4)        44         58         74
5 SMR    (1),(2),(3),(4)    50         61         74
6 SMR    (1),(3),(5)        62         61         71
7 SMR    (3),(5)            56         58         62         +0.86
8 NN     (1),(2),(3),(4)    55         66         81

Table 2. Skill of eight prediction models for Mar-Apr-May-Jun Nordeste rainfall (expressed as percentage of predictand variance explained), followed by the forecast standardized Nordeste rainfall anomaly for Mar-Apr-May-Jun 2000. The model type is SMR (stepwise multiple regression) or NN (neural network), and predictor numbers (1)-(5) are as listed in Table 1.
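As an illustration of how a skill value of this kind can be computed, the sketch below takes the percentage of variance explained to be 100 times the squared Pearson correlation between predicted and observed values. This is an assumption about the definition used in Table 2. The function names and the short series are made up for the example and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def percent_variance_explained(predicted, observed):
    """Skill as 100 * r**2 (assumed definition)."""
    return 100.0 * pearson_r(predicted, observed) ** 2


# Made-up standardized anomalies, for illustration only.
predicted = [0.5, -1.0, 0.2, 1.1, -0.4]
observed = [0.8, -0.7, -0.1, 1.4, -0.9]
print(percent_variance_explained(predicted, observed))
```

Read the other way, the 58% of model 7 for 1958-89 would correspond to a correlation of about 0.76 under this definition.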



Table 3
Model   Predictors         Q1         Q2     Q3       Q4     Q5
                           Very dry   Dry    Normal   Wet    Very wet
5       (1),(2),(3),(4)
7       (3),(5)            0.01       0.12   0.12     0.57   0.18

Table 3. Predicted quintile probabilities.
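The indication of a wetter than average season can be read directly from the model 7 probabilities in Table 3. The sketch below, with illustrative variable names, picks the most likely quintile and sums the two wet categories.

```python
# Quintile probabilities for model 7, copied from Table 3.
labels = ["Very dry", "Dry", "Normal", "Wet", "Very wet"]
probs = [0.01, 0.12, 0.12, 0.57, 0.18]

# Most likely category and combined probability of Wet or Very wet.
most_likely = labels[probs.index(max(probs))]
p_above_normal = probs[3] + probs[4]

print(most_likely, round(p_above_normal, 2), round(sum(probs), 2))
```

The most likely category is Wet (0.57), and the Wet and Very wet categories together carry a probability of about 0.75, compared with 0.40 for two equiprobable quintiles.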



Fig. 1. Locations of the 27 selected stations in the Nordeste, used as the predictand by Greischar and Hastenrath.