It's time to address comments on my review of the CSIRO Drought Exceptional Circumstances Report. Thanks to Lazar for taking the time to provide the following feedback at Open Mind, placed at WikiChecks here. I have not yet received any comments from the authors, or from Kevin Hennessy of CSIRO.
I think it's clear that the core issue, the lack of skill of climate models at simulating the frequency of extremely low rainfall, is unaffected by Lazar's points.
Why the Climate Audit / David Stockwell attack on the CSIRO "Drought Exceptional Circumstances Report" is wrong.
The CSIRO report predicts increasing frequency and severity of exceptional temperature and rainfall events, over all seven regions of Australia for temperature, and three of seven regions for rainfall (no discernible changes in the others). An exceptional temperature event, in the context of drought, is an annual average temperature above the 95th percentile of observed temperatures during 1910-2007. An exceptional rainfall event is likewise a total annual rainfall below the 5th percentile for 1900-2007. This difference in periods is due to availability of reliable observational data for temperature and rainfall. Severity is measured as the area affected during an exceptional event. Predictions were made using an ensemble of 13 GCMs.
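The percentile definitions above are straightforward to make concrete. A minimal sketch, using synthetic annual totals rather than the actual observational record, of flagging exceptional rainfall years:

```python
import numpy as np

# Synthetic stand-in for an annual rainfall record over 1900-2007
# (108 years); the real series comes from observational data.
rng = np.random.default_rng(0)
rainfall = rng.gamma(shape=4.0, scale=120.0, size=108)

# An exceptional rainfall event: total annual rainfall below the
# 5th percentile of the 1900-2007 record.
threshold = np.percentile(rainfall, 5)
exceptional = rainfall < threshold

# By construction, roughly 5% of years qualify.
print(threshold, exceptional.sum())
```

The temperature definition is the mirror image: annual averages above the 95th percentile of the 1910-2007 record.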
David Stockwell claims…
all climate models failed standard internal validation tests for regional droughted area in Australia over the last century
The tests David Stockwell employed were…
… correlating model predictions for individual years of exceptional rainfall with observed years of exceptional rainfall! This ignores noise (internal variability in the climate system and in GCM climate simulations) and that the CSIRO report predicted frequency. Steve McIntyre and the auditors repeat this mistake here, with the obligatory snark from Steve ("Even for Michael Mann, a correlation of -0.013 between model and observation wouldn't be enough. For verification, he'd probably require at least 0.0005.") and a 100-word paragraph about the trouble involved in untarring a .tar archive.
Correlation, or R², is a standard validation test provided for reference, and only one of a number of tests. The importance or otherwise of R² has been debated extensively in the hockey-stick controversy. I don't make a big deal of it, because the skill of the model is better demonstrated by drought frequency. Still, it was worth doing: had the correlation been high, it would have helped validate the models. As it is, the models explain less than 1% of the observed variance in extreme rainfall, but predicting the frequency and intensity of drought is the main goal.
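The statistic at issue is simple to reproduce. A minimal sketch, with synthetic stand-ins for the observed and ensemble-mean droughted-area series (the real series come from the report's data), showing how a near-zero correlation translates into a tiny fraction of variance explained:

```python
import numpy as np

# Unrelated synthetic series, one value per year of 1900-2007.
rng = np.random.default_rng(1)
observed = rng.random(108)
modelled = rng.random(108)  # uncorrelated with `observed` by construction

# Pearson correlation between modelled and observed series.
r = np.corrcoef(observed, modelled)[0, 1]
# R^2: the fraction of observed variance "explained" by the model.
r_squared = r ** 2

# With unrelated series, R^2 is tiny -- the situation described above,
# where the models explain under 1% of the variance in extreme rainfall.
print(r, r_squared)
```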
… comparing trends from linear regression. For each year of modelled (mean of 13 GCMs) and observed data he took the area affected, but for years with no exceptional event (i.e. most years) he used an 'area affected' value of zero, with the result that the residuals are nowhere near normally distributed. Still, he applied a t-test to the difference in observed and modelled trends. But the error term was calculated only as the standard deviation of the 13 GCM modelled trends. He ignored the error in estimating a trend itself, which when taken into account renders the observed and modelled trends statistically insignificant (not different from zero), unsurprising given the treatment of years not containing an exceptional event.
The important observation is that the trends in the models (increasing drought) are in the opposite direction to the trends in the observations (decreasing droughts). This shows the model trend is biased in the opposite direction to reality. It is unlikely that this pattern would be reversed in any alternative analysis.
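The disagreement above turns on whether each fitted trend carries its own uncertainty. A hedged sketch (hypothetical function name, synthetic mostly-zero 'droughted area' data mimicking the distributional problem noted above) of estimating an OLS slope together with the slope's own standard error:

```python
import numpy as np

def slope_with_se(y):
    """OLS slope of y against time, plus the slope's own standard error."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    x = np.arange(n, dtype=float)
    xm, ym = x.mean(), y.mean()
    sxx = ((x - xm) ** 2).sum()
    slope = ((x - xm) * (y - ym)).sum() / sxx
    intercept = ym - slope * xm
    resid = y - (intercept + slope * x)
    sigma2 = (resid ** 2).sum() / (n - 2)   # residual variance
    se = np.sqrt(sigma2 / sxx)              # standard error of the slope
    return slope, se

# Synthetic series: zero in most years, occasional exceptional events.
rng = np.random.default_rng(2)
y = np.where(rng.random(108) < 0.05, rng.random(108) * 30.0, 0.0)
slope, se = slope_with_se(y)

# Roughly: if |slope| is within about 2*se of zero, the trend is not
# statistically distinguishable from zero.
print(slope, se, abs(slope) < 2 * se)
```

A test that pools only the spread of the 13 model trends, ignoring each `se`, understates the total uncertainty in the observed-minus-modelled comparison.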
… he claims to test "The probability of significance of the difference between the observed trend and mean trend projected for the return period (returnp-p), the mean time between successive droughts at the given level." and concludes "This indicates the frequency of droughts in the models has no relationship to the actual frequency of droughts." What he actually did was compare, over the entire period 1900-2007, the mean number of years between exceptional events for modelled and observed data. Not trends.
Point taken; scrub the word 'trend' from that paragraph, a product of 4am editing. The main point stands: the models fail to capture the frequency of droughts, as shown by the difference in return period.
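The return period in question is just the mean gap, in years, between successive exceptional events. A minimal sketch (hypothetical helper name):

```python
import numpy as np

def return_period(exceptional):
    """Mean gap in years between successive exceptional events.

    `exceptional` is a boolean array with one entry per year."""
    years = np.flatnonzero(exceptional)
    if len(years) < 2:
        return np.nan  # need at least two events to measure a gap
    return float(np.diff(years).mean())

# Example: events in years 5, 15 and 40 of a 108-year record.
flags = np.zeros(108, dtype=bool)
flags[[5, 15, 40]] = True
print(return_period(flags))  # gaps of 10 and 25 years -> mean 17.5
```

Comparing this quantity between the modelled and observed series is a comparison of period means, not of trends.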
… he completely ignores the analysis of exceptional temperature events in the CSIRO report, which incidentally shows much better correlations between model and observation.
A drought, by definition, is a period of extremely low rainfall. The report is about predicting drought, and its main part is predicting exceptionally low rainfall. It's not about predicting heatwaves. I fail to see the relevance of this comment.
… he claims that GCMs are calibrated on regional precipitation data. "Standard tests of model skill are either internal (in-sample) validation, where skill is calculated on data used to calibrate the model, or external (out-of-sample) validation, where skill is calculated on held-back data. As external validation is the higher hurdle, poor internal validation blocks further use of the model. Here internal validation is performed on the thirteen models over the period 1900 to 2007 for each of the seven Australian regions." They are not.
A quasi-calibration of the climate models was performed when 13 of 23 models were selected for seasonal concordance. This step, for which no details are provided in the report, was presumably done over the years 1900-2007. The models were then tested on observations from those same years. Hence, the test is rightly termed an internal validation. It is not an important point, and I could drop the terminology without loss.
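The internal/external distinction can be shown with a toy example (all names and data synthetic, purely illustrative): if a model is selected because it matches the 1900-2007 observations, then scoring it on those same years is in-sample and optimistically biased; scoring it on held-back years is out-of-sample.

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1900, 2008)
observed = rng.random(len(years))
# Two hypothetical candidate model series.
models = {"model_a": rng.random(len(years)),
          "model_b": rng.random(len(years))}

def score(series, mask):
    """Correlation between model and observations on the masked years."""
    return np.corrcoef(observed[mask], series[mask])[0, 1]

# Quasi-calibration: select the model that best matches one period.
calib = years <= 1970
best = max(models, key=lambda name: score(models[name], calib))

# Internal (in-sample) validation reuses the selection years, so the
# score is biased upward: `best` was chosen to do well on them.
internal = score(models[best], calib)
# External (out-of-sample) validation uses only held-back years.
external = score(models[best], ~calib)
print(best, internal, external)
```

In the report's case, selection and testing both used 1900-2007, which is why I called the result an internal validation.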
This is the first time I am actually angry about…
Denialists pestering scientists.
And setting themselves as auditors in order to sell that disinformation.
Huh? I am a scientist with publications on climate change effects modeling in Nature and Bioscience. This comment is not relevant.
"Key claims of the CSIRO report do not pass obvious statistical test for 'significance'." — Steve McIntyre
"Studies of complex variables like droughts should be conducted with statisticians to ensure the protocol meets the objectives of the study." — David Stockwell
"I don't think it's fair to single out CSIRO. You need to identify the enemy — IMO bias and pseudoscience. There are targets for review everywhere. The public face of science has shifted from atom splitters to GHG accounting." — David Stockwell
For a reasonable model-observation comparison, do read the CSIRO report especially figures 8 and 10.
Thanks (I think) to ST for pointing this out.
Figures 8 and 10 are graphics without numeric precision. Figure 10 in particular bears out my assertion that droughts appear to be decreasing in reality, while the models show them increasing over the last 100 years.
So thanks for the review, Lazar; I will take the corrections on board. But clearly you have not saved the report: you have not demonstrated where the models show skill at predicting extreme rainfall frequency.
If the temperature keeps responding to increasing CO2 as slowly as it has, then these exaggerated effects studies — based mostly on cheap modifications of exaggerated warming scenarios in GCMs — will be mostly eradicated; that is 99% of existing global warming effects studies will become strictly irrelevant and higher standards will return to science. People may realize that making predictions about hard-to-predict questions — something that looked so easy using the GCM scenarios — is actually hard. To quote Lubos at The Reference Frame:
And perhaps, most people will prefer to say “I don’t know” about questions that they can’t answer, instead of emitting “courageous” but random and rationally unsubstantiated guesses.