Consolidation of Multi Method Forecasts: Application to Monthly Predictions of Pacific SST
NCEP Climate Meeting, April 4, 2007
Malaquias Peña and Huug van den Dool
Acknowledgments: Suru Saha retrieved and organized the data; Dave Unger and Peitao Peng contributed discussion of the subject.

DATA
• Forecasting tools: 8 CGCMs, 1 statistical model
  – NCEP CFS: 1981-2006, 15 members, 9 leads
  – DEMETER: 1980-2001, 9 members, 6 leads
    • ECMWF
    • MPI
    • MF
    • UKMO
    • INGV
    • LODYC
    • CERFACS
  – CPC's Constructed Analog (CA): 1956-2006, 12 members, 12 leads
This is what all have in common:
• Monthly forecasts, leads 0 to 5
• Initial months: Feb, May, Aug, Nov
• Length of retrospective forecasts: 21 years, 1981-2001
FOCUS: TROPICAL PACIFIC SST, 12.5 S TO 12.5 N

DEFINITIONS
• Consolidation: making the best single forecast out of a number of forecast inputs.
• Objective consolidation is necessary because a large supply of forecasts is available.
• If K is the number of participating forecast systems, ζ, predicting a particular target month at a given lead time, the consolidation is the following linear combination:
  $C = \sum_{i=1}^{K} \alpha_i \zeta_i$
• For convenience, systematic errors and the observed climatology are removed from ζ.
• The regression coefficients (weights), α, are based on the past performance of each forecast system.
• o is the verifying field (e.g. observation; climatology removed). Suppose there are N cases of retrospective forecasts; then one can train a consolidation method by comparing $o_j$ with
  $C_j = \sum_{i=1}^{K} \alpha_i \zeta_{i,j}, \quad j = 1, 2, \ldots, N$

OPTIMIZING WEIGHTS
• Find the weights, αi, for each forecasting tool, ζi, that minimize the sum of squared errors εj in
  $Z\alpha + \varepsilon = o$
  where Z is a matrix whose columns are the forecasting tools and whose rows are the data points in the training period, o is the column vector containing the verifying field, and ε is the vector of errors.
• Least squares method (unconstrained regression, UR):
  $SSE = (Z\alpha - o)^T (Z\alpha - o)$
  $\alpha_{UR} = (Z^T Z)^{-1} Z^T o$

ILL-POSED MATRIX PROBLEM
$\alpha_{UR} = (Z^T Z)^{-1} Z^T o$

Eigenvalues listed on the slide under $(Z^T Z)^{-1}$:

      Nino 3.4   PNA      NAO
  1   8.4584     5.9156   3.6889
  2   0.1763     0.8394   1.402
  3   0.1516     0.7808   1.1173
  4   0.0707     0.42     0.8759
  5   0.0536     0.3488   0.6316
  6   0.0384     0.2874   0.5277
  7   0.0297     0.1919   0.3978
  8   0.0186     0.139    0.2462
  9   0.0027     0.0772   0.1126

Annotation (symbols partly lost in extraction): "... too large".

Corresponding weights for UR for lead 1, initial month 1:

  i   weight
  1    0.482
  2    0.2532
  3   -0.5526
  4   -0.5615
  5    0.0189
  6    0.0348
  7    0.018
  8    0.0381
  9    0.0488

RIDGE REGRESSION
Minimize: $SSE = (Z\alpha - o)^T (Z\alpha - o)$
Constrained to: $\alpha^T \alpha = c$
This leads to ridge regression:
  $\alpha_{RID} = (Z^T Z + \lambda I)^{-1} Z^T o$
Two variants, RIM (DelSole, 2007) and RIW (ad hoc), use the same inverse $(Z^T Z + \lambda I)^{-1}$ with a modified right-hand side: RIM involves 1/K, and RIW involves a vector b* defined from o and ζi with a normalization factor f. Their exact formulas are not recoverable from this transcript.
• Van den Dool estimates λ such that the weights are small and stable.
• There are many other ways to choose λ.
• The choice depends on the characteristics of the covariance matrix $Z^T Z$.

RIDGE REGRESSION
[Figure: three panels, RID, RIM and RIW, each with λ on the horizontal axis.]
• Model weights (αi, i=1..9) as a function of λ for the three ridge consolidation methods.
• The figure illustrates asymptotic values. Our methods stop at λ=0.5.
• Unconstrained regression (λ=0) results in a wide range of weights, including negative values.

CONSOLIDATION METHODS ASSESSED

CROSS-VALIDATION
[Figure: bar chart, vertical axis from 30 to 90, with bars for D1, D2, D3, D4, D5, D6, D7, CFS, CA, MM, COR, FRE, RID, RI2, RIM, RIW and UR.]
Anomaly pattern correlation over the tropical Pacific, averaged over all leads and initial months. Empty bars: full (dependent) sample; filled bars: 3-year-out cross-validated.

GRIDPOINT BY GRIDPOINT PERFORMANCE

EQUATORIAL PACIFIC

WESTERN TROPICAL PACIFIC
Trusts the good models where they performed well at a gridpoint. It goes in the opposite direction of the bad models.

WESTERN TROPICAL PACIFIC: MIXES CLOSEST NEIGHBORING GRIDPOINTS
Trusts the good models where they performed well in a 3x3 box of gridpoints.
It goes in the opposite direction of the bad models.

WESTERN TROPICAL PACIFIC: DOUBLE PASS AND MIXES CLOSEST NEIGHBORING GRIDPOINTS
Trusts the good models less; damps towards climatology because negative weights are set to zero.

INCREASING EFFECTIVE SAMPLE
Tropical Pacific SST. AC averaged over all leads and initial months.
[Figure: AC against categories 1 to 5. Labels on the slide: gridpoint by gridpoint; 3x3 boxes; 5x5 boxes; all gridpoints in the domain; gridpoints in and out of the domain; multimethods average.]
The skill of most consolidation methods improves when the effective sample size increases.

INCREASING EFFECTIVE SAMPLE
Consistency: percentage of cases (leads and initial months) outperforming MM.
[Figure: bar chart, vertical axis from 30.0 to 100.0; series labeled "em 1X1", "9m 3x3" and "9m all_gr"; methods COR, FRE, RID, RI2, RIM and RIW.]

RELATIVE OPERATING CHARACTERISTIC (ROC) CURVES
• Assess the ability to correctly anticipate the occurrence or non-occurrence of SST anomalies falling in the upper, middle and lower terciles.
• Class limits are defined by the observed SST during the training period.
• Probability information from the ensemble: count the fraction of ensemble members that fall into the “above-normal”, “near-normal” and “below-normal” categories, and interpret this fraction as the probability that the forecast will fall in each category.
• Approach for the optimized weights: each ensemble member forecast is multiplied by normalized weights.
[Figure panels: upper tercile and lower tercile, lead 3.]

UPPER TERCILE

LOWER TERCILE

SUMMARY
All the points below are for the particular case of SST anomalies in the tropical Pacific.
• Forecasts arising from a combination of multiple models of similar skill generally outperform those from individual models, but UR does not after CV-3 (3-year-out cross-validation).
• Even the simple average of the multi-methods shows consistent improvement over the individual participant models.
• Overall, and after cross-validation, sophisticated consolidation methods improve only marginally over the simple average.
• Increasing the effective sample size increases the skill and consistency of the consolidation methods.
• Consolidation methods improve significantly over the multimethods average in the western Pacific.
• Probabilistic assessment, as measured by ROC, shows some improvement of the consolidation methods over MM. Constructing the probability density function of the consolidation requires optimization.
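
The weight estimates described under OPTIMIZING WEIGHTS and RIDGE REGRESSION can be illustrated with a minimal sketch. It is not the authors' code: the training data are synthetic, the function names (`normal_equations`, `solve`, `consolidation_weights`, `consolidate`) are hypothetical, K=3 tools stand in for the nine used in the talk, and λ=0.5 is an arbitrary illustrative value applied to an unnormalized Z^T Z. Only the Python standard library is used so the example stays self-contained. It computes the unconstrained (UR, λ=0) and ridge (RID) weights from the same normal equations and applies them as a linear combination.

```python
import random


def normal_equations(Z, o):
    """Return (Z^T Z, Z^T o) for Z given as N rows of K tool forecasts."""
    K = len(Z[0])
    gram = [[sum(row[a] * row[b] for row in Z) for b in range(K)] for a in range(K)]
    rhs = [sum(row[a] * oj for row, oj in zip(Z, o)) for a in range(K)]
    return gram, rhs


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def consolidation_weights(Z, o, lam=0.0):
    """Weights (Z^T Z + lam*I)^-1 Z^T o; lam=0 is unconstrained regression."""
    gram, rhs = normal_equations(Z, o)
    for i in range(len(rhs)):
        gram[i][i] += lam
    return solve(gram, rhs)


def consolidate(weights, forecasts):
    """Linear combination C = sum_i alpha_i * zeta_i for one case."""
    return sum(a * z for a, z in zip(weights, forecasts))


def sum_sq(values):
    """Sum of squares of a sequence."""
    return sum(v * v for v in values)


def sse(weights, Z, o):
    """Sum of squared errors of the consolidation over the training cases."""
    return sum_sq(consolidate(weights, row) - oj for row, oj in zip(Z, o))


# Synthetic (assumed) training set: N=21 cases, K=3 highly correlated tools.
random.seed(0)
N, K = 21, 3
signal = [random.gauss(0.0, 1.0) for _ in range(N)]
o = [s + random.gauss(0.0, 0.3) for s in signal]
Z = [[s + random.gauss(0.0, 0.2) for _ in range(K)] for s in signal]

alpha_ur = consolidation_weights(Z, o)            # lambda = 0
alpha_rid = consolidation_weights(Z, o, lam=0.5)  # illustrative lambda

print("UR  weights:", [round(a, 3) for a in alpha_ur])
print("RID weights:", [round(a, 3) for a in alpha_rid])
print("Consolidated forecast, first case (RID):", round(consolidate(alpha_rid, Z[0]), 3))
```

By construction, UR gives the smallest training SSE, while the ridge weights have a smaller sum of squares. That is the trade-off the slides describe: giving up some fit on the dependent sample in exchange for smaller, more stable weights.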