My evaluation of the validity of the modelling in the CSIRO Drought Exceptional Circumstances Report, and the R code used to produce it, is available below. A PDF and an invitation for review are posted on WikiChecks.

Tests of Regional Climate Model Validity in the Drought Exceptional Circumstances Report

## Abstract

In a statistical re-analysis of the data from the Drought Exceptional Circumstances Report, all climate models failed standard internal validation tests for regional droughted area in Australia over the last century. The most worrying failure was that simulations showed increases in droughted area over the last century in all regions, while the observed droughted area decreased in five of the seven regions identified in the CSIRO/Bureau of Meteorology report. Therefore there is no credible basis for the claims of increasing frequency of Exceptional Circumstances declarations made in the report. These results are consistent with other studies finding a lack of adequate validation in the modelling of global warming effects, and a lack of skill in climate models at the regional scale.

### Directions for Doing Analysis Offline

1. Make sure you have a copy of R and the gdata library (for reading xls files).

2. Save this script to your working directory.

3. Download the following data online from http://www.bom.gov.au/climate/droughtec:

Historical Data > Time series of percentage area below the 5th percentile

rain.5pc.tar

Projections Data > Time series of percentage area below the 5th percentile

rain.proj.5pc.tar

4. Save the files in the directory ../data/csiro and untar each one:

tar xvf rain.5pc.tar

tar xvf rain.proj.5pc.tar

5. Run the script with the following:

source("script.R")

go()

David,

I’m not very fluent in R and might have missed something, but I didn’t see any test for the iid normality of the residuals. The significance seems to be based on this assumption.

Annual climate data often has autocorrelation, which exaggerates the findings of significance. This particular statistic, which has zero or small positive values in many years, and high positive values in some, is likely to deviate strongly from normality even without autocorrelation.
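To illustrate the concern about autocorrelation, here is a minimal Monte Carlo sketch in base R. It is not taken from the script under review. The series length, the number of simulations, and the lag-1 coefficient `phi` are all assumptions chosen for illustration, and the noise is Gaussian rather than the skewed drought statistic. It simply counts how often an ordinary regression reports a "significant" trend in series that have no trend at all.

```r
# Monte Carlo sketch: how often does OLS declare a "significant" trend
# when the series has no trend? (All settings below are assumptions.)
set.seed(1)

# p-value of the slope from a simple linear regression on time
trend_pvalue <- function(y) {
  t <- seq_along(y)
  fit <- lm(y ~ t)
  summary(fit)$coefficients["t", "Pr(>|t|)"]
}

n_years <- 100   # assumed series length
n_sims  <- 500   # assumed number of simulated series
phi     <- 0.6   # assumed lag-1 autocorrelation

# Trend-free series: independent noise versus AR(1) noise
p_iid <- replicate(n_sims, trend_pvalue(rnorm(n_years)))
p_ar1 <- replicate(
  n_sims,
  trend_pvalue(as.numeric(arima.sim(list(ar = phi), n = n_years)))
)

rate_iid <- mean(p_iid < 0.05)
rate_ar1 <- mean(p_ar1 < 0.05)
cat("False-positive rate, iid noise:  ", rate_iid, "\n")
cat("False-positive rate, AR(1) noise:", rate_ar1, "\n")
```

If the residuals really were iid, the first rate should sit near the nominal 5%. The second rate shows how far the nominal level can drift under the assumed autocorrelation.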

Hi Nick, Thanks for opening the discussion! You always make good points. At the end of the paper I plot the distribution and suggest it is Pareto (or even beta, due to boundedness). An in-depth study would do that first. I did test the return period, which is more appropriate for extreme value statistics and which I would place more confidence in. It looked to have a better distribution, and its means were also significantly different, though iid tests would be desirable throughout.

I actually realized I didn’t specify a significance level, so that is one thing I will correct.

A full treatment would look at these things, and I would pay close attention to them if the results were marginal. But the message (or lack) in the data is very strong. They might as well be reading the future in tea leaves as predicting future droughts with regional climate models.

David,

I didn’t mean the distribution of the points, but of the residuals. Whenever you do a statistical test for significance, you have a model where there is an underlying iid (independent and identically distributed) random variable. From its distribution, you decide whether the results could have arisen by chance. For regression, it is the residuals that are assumed to be iid, and usually, for the test, assumed normal.

Now I was surprised when you chose to compare regression fits, because neither the model nor observed results seem to be good candidates. Because the drought statistic is close to zero in many years, and large positive in others, the notion of random variations about a mean line seems unlikely to work, and then will not give a good test.

To be convincing, you really need to test the residuals. Testing for autocorrelation is easy, and so is the simplest remedy (e.g. Cochrane-Orcutt). To test normality, you can use a Jarque-Bera test, although the remedy if that fails is not so clear.
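As a rough sketch of those checks in base R: the series below is synthetic (a weak trend plus AR(1) noise, both assumed), not the drought data, and the variable names are made up for illustration. The Jarque-Bera statistic is computed by hand to avoid extra packages, and only a single Cochrane-Orcutt step is shown rather than the iterated procedure.

```r
set.seed(42)
n    <- 100
year <- 1900 + seq_len(n)
# Synthetic stand-in for the data: weak trend plus AR(1) noise (assumed)
y <- 0.02 * (year - 1900) + as.numeric(arima.sim(list(ar = 0.5), n = n))

fit <- lm(y ~ year)
e   <- residuals(fit)

# 1. Autocorrelation: Ljung-Box test on the residuals at lag 1
lb <- Box.test(e, lag = 1, type = "Ljung-Box")

# 2. Normality: Jarque-Bera statistic from sample skewness and kurtosis,
#    compared against a chi-squared distribution with 2 degrees of freedom
jarque_bera <- function(x) {
  n    <- length(x)
  z    <- x - mean(x)
  s2   <- mean(z^2)
  skew <- mean(z^3) / s2^1.5
  kurt <- mean(z^4) / s2^2
  stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  c(statistic = stat, p.value = pchisq(stat, df = 2, lower.tail = FALSE))
}
jb <- jarque_bera(e)

# 3. One Cochrane-Orcutt step: estimate rho by regressing e_t on e_{t-1},
#    quasi-difference both variables, and refit
rho       <- coef(lm(e[-1] ~ 0 + e[-n]))[[1]]
y_star    <- y[-1]    - rho * y[-n]
year_star <- year[-1] - rho * year[-n]
fit_co    <- lm(y_star ~ year_star)

print(lb)
print(jb)
cat("Estimated rho:", rho, "\n")
print(summary(fit_co)$coefficients)
```

The slope on `year_star` estimates the same trend as the original fit, but its standard error is based on residuals that should be closer to independent.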

I should back up this claim that the residuals need testing with an authoritative quote:

“On the contrary, temperature measurements are serially correlated to a degree that affects statistics such as measures of variability, particularly the standard deviation (SD). In a serial correlation the new temperature is determined in part by the previous one. Stated another way, there appears to be more information in the series than there really is. One of consequences is that the calculated SD is lower than it should be, making the confidence limits smaller than they should be.”

I think this drought data is even more correlated.

Nick, your points are taken. I don’t intend to defend regression fits in this case. As I said, the drought statistic is an extreme-value, over-threshold type of thing, as the distribution suggests. Given that, I doubt Cochrane-Orcutt would be strictly appropriate either. I don’t know enough about it, but the extreme values are not like temperature (lots of zeros). You really need to get an extreme value expert onto it. The reason for the basic fit was to do the first of a number of basic tests, and running trends through the data is probably the most basic.

I think the return period is the best of the group as it differences the values, and so should reduce any autocorrelation. However, the return period does not give trend information, only an average frequency for each series, so a trend test would be good too.
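For readers unfamiliar with the idea, a return-period comparison might look something like the sketch below. This is not the code from the script. The two series are randomly generated stand-ins, and the 10% threshold, the year range, and the function names `sim_area` and `return_intervals` are all assumptions made for illustration.

```r
set.seed(7)
years <- 1900:2007   # assumed year range

# Hypothetical stand-ins for percentage-area series: many zeros, a few large values
sim_area <- function(n, p_event) {
  ifelse(runif(n) < p_event, rexp(n, rate = 1 / 15), 0)
}
observed <- sim_area(length(years), 0.25)
modelled <- sim_area(length(years), 0.50)

# Return intervals: number of years between successive exceedances of a threshold
return_intervals <- function(area, years, threshold = 10) {
  event_years <- years[area > threshold]
  diff(event_years)
}

ri_obs <- return_intervals(observed, years)
ri_mod <- return_intervals(modelled, years)

cat("Mean return period, observed:", mean(ri_obs), "years\n")
cat("Mean return period, modelled:", mean(ri_mod), "years\n")

# Compare the two sets of intervals; the rank-based test avoids assuming normality
print(t.test(ri_obs, ri_mod))
print(wilcox.test(ri_obs, ri_mod, exact = FALSE))
```

Because the intervals are differences between event years, they carry no information about when in the century the events occurred, which is why a separate trend test is still needed.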

That said, when the message is strong in a number of tests, it doesn’t make much difference what tests you use. Enough people have shown that GCMs are not reliable for precipitation at the regional scale, that it shouldn’t need to be proven to the n-th degree over again.

I would think something like this would be appropriate:

http://www.agu.org/pubs/crossref/2002/2001WR000575.shtml

“The statistical procedures, already fully established in the statistical analysis of survival data, convert the problem into one in which a generalized linear model is fitted to a power-transformed variable having Poisson distribution and calculates the trend coefficients (as well as the parameter in the power transform) by maximum likelihood.”
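A much simplified sketch of the general idea, in base R, is a plain Poisson regression of event counts on time. Note that it leaves out the power transform described in the quote, so it is not the method of that paper. The per-decade counts are simulated, and the variable names and the assumed trend are hypothetical.

```r
set.seed(3)
decade <- 0:10   # hypothetical: decades since 1900

# Hypothetical counts of drought years per decade with a mild log-linear trend
events <- rpois(length(decade), lambda = exp(0.3 + 0.05 * decade))

# Poisson GLM with log link; coefficients are fitted by maximum likelihood
fit <- glm(events ~ decade, family = poisson(link = "log"))
print(summary(fit)$coefficients)

# exp(slope) is the estimated multiplicative change in event rate per decade
rate_ratio <- exp(coef(fit)[["decade"]])
cat("Estimated rate ratio per decade:", rate_ratio, "\n")
```

A count model of this kind at least respects the fact that the data are non-negative and often zero, which a straight-line fit with normal residuals does not.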
