Validation of climate models is like finding someone to cement your drive.
You ask one contractor, and they say they can do it, sometime between now and Christmas. That's a high level of uncertainty.
You ask another, and they say they can do it, but it's their first time. That's a low level of skill.
You ask another, and they say that they will do it, but the result is not going to be any better than what you have already, or may even be worse. That's an honest vendor, and a product not 'fit-for-use'.
The need for model validation becomes obvious when you put it in a familiar context. There is a level of service you expect for the money you have to spend. Any public servant involved in the procurement of services faces a situation similar to concreting one's drive. Due diligence requires a check that models are fit-for-purpose.
I have just completed a critique of Australian drought models developed by the Bureau of Meteorology and CSIRO as used in the Drought Exceptional Circumstances Report. While they flagged that "there is higher uncertainty with the rainfall data [than the temperature data]", they reported model forecasts of a large increase in the severity and frequency of drought over all regions of Australia, forecasts that have been widely quoted in the media and elsewhere.
Comparing their drought models with the historical data showed that the models' in-sample fit was worse than that of a simple average of drought frequency over the last century. And while the observed frequency of drought appears to have decreased over the last 100 years, the models showed a significant increase.
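To make that baseline check concrete, here is a minimal sketch in Python, using hypothetical numbers (the actual data and hindcasts are those of the Drought Exceptional Circumstances Report and are not reproduced here). A model only has skill if it beats the simple century-long average:

```python
# Minimal sketch of a skill check against a climatological baseline.
# All numbers below are hypothetical, for illustration only.
import numpy as np

# Hypothetical observed drought frequency per decade, and a hypothetical
# model hindcast that trends upward while the observations trend down.
obs = np.array([0.18, 0.16, 0.15, 0.14, 0.12])
model = np.array([0.14, 0.16, 0.18, 0.21, 0.24])

# The 'simple average' baseline: predict the long-run mean every decade.
climatology = np.full_like(obs, obs.mean())

mse_model = np.mean((model - obs) ** 2)
mse_clim = np.mean((climatology - obs) ** 2)

# Skill relative to the baseline: positive means the model beats the
# simple average; zero or negative means it adds nothing, or worse.
skill = 1.0 - mse_model / mse_clim
print(f"model MSE = {mse_model:.4f}, baseline MSE = {mse_clim:.4f}, skill = {skill:.2f}")
```

With these illustrative numbers the skill score is strongly negative: the trending model does far worse than simply quoting the average. That is the sense in which a model can 'fit' less well than a constant.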
Knowing that temperature has increased over the last 100 years, would any reasonable person find models acceptable if they showed a significant decrease in temperature over the same period? I think not. So why treat precipitation differently?
Returning to cementing the drive, here are a few more scenarios.
You ask a friend for a recommendation, and they say "Don't use X. The result was very disappointing." You would be very cautious with that vendor. The IPCC itself concedes that "Precipitation in particular is not adequately simulated by the current IPCC models even at large scales…"
Or, you might look for a big name, a vendor with the largest ad in the Yellow Pages, a vendor like CSIRO or BoM perhaps? A reputable vendor would refuse to provide a service that does not satisfy the service requirements of the customer. The short-term gain is not worth the loss of goodwill. Similarly, when commissioned to provide forecasts for use in the policy arena, modelers must be confident of the performance of their models. The policy arena is different to research studies that are limited to the scientific journals and read only by scientific peers.
It was disappointing to hear of two misguided efforts to rehabilitate climate modeling in Australia. The first was training public servants in Victoria to fight climate skepticism. I think the greater need is to train public servants in the specification of level-of-service requirements for climate models, and in how to demand and understand model validation studies.
The second was a meeting of CSIRO and BoM to develop a "national communication charter" for major scientific organizations and universities to better 'spruik' the evidence of climate change. If this is a faithful account, then it is an insult to the intelligence of the general public, many of whom are technically trained, and would be more convinced by solid validation studies and a history of successful forecasts.
If I were to convene a meeting, I would create a stakeholder working group to develop Australia-wide standards for climate model validation, with the aim of providing certification for climate models used in decision-making. The development of such industrial-strength standards would give greater confidence in model forecasts, and provide a measure of protection for the CSIRO/BoM 'brand' from collateral damage when forecasts are perceived to fail.
When pouring a concrete slab for an airport tarmac, a contractor is bound by a host of specifications: the number of steel rods, slab thickness, density, and deflection, to name a few. Failure to meet any of these specifications results in penalties. This example shows the contrast between fit-for-research modeling and fit-for-policy modeling. In the former there is no penalty for lack of skill. In the latter there is an expectation of skill; there should be a penalty for lack of performance, and greater penalties for false claims that exaggerate ability.
At present there are no generally recognized standards for validating climate models. In spite of the published concerns of the professional statistical forecasting community, the standard practice is to use the 'best estimate' (the mean) of an 'ensemble of opportunity' (all 26 models included in the IPCC evaluation studies), without any specification of levels of performance. From my review of recent regional forecasting reports, covering drought, flood and hurricane frequency, and sea-level change, it is not clear that climate scientists even know where to begin.
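To see what that practice amounts to, here is a short sketch in Python, with hypothetical model output standing in for the real archive. The 'best estimate' is simply the unweighted mean over every available member; nothing in the calculation asks whether any individual model has ever demonstrated skill against observations:

```python
# Sketch of the 'ensemble of opportunity' practice: the 'best estimate'
# is the unweighted mean over all available models. Data are hypothetical.
import numpy as np

rng = np.random.default_rng(seed=0)
n_models, n_years = 26, 30  # 26 models, as in the IPCC evaluation studies

# Hypothetical projected rainfall anomalies from each model.
projections = rng.normal(loc=0.0, scale=1.0, size=(n_models, n_years))

# Standard practice: average every member with equal weight. No member
# is screened out or down-weighted for failing a validation test.
best_estimate = projections.mean(axis=0)
print(best_estimate[:5])
```

A certification standard of the kind proposed above would, at a minimum, insert a validation gate before that averaging step, so that only models meeting a specified level of performance contribute to the 'best estimate'.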