# Scale Invariance of the Aggregate

For computational, statistical or display reasons, daily data are often aggregated to a coarser time scale. This is done by splitting the sequence into subsets along a coarser index grid and calculating a summary statistic, such as the mean value, for each segment.

Missing values cause problems when calculating the mean. By default in R, the presence of a single NA returns NA for most arithmetic operations; there is also an option to calculate the mean after omitting the NAs. In the first case the calculated means are valid, but data are lost as segments are converted to NAs. In the second case no data are lost, but the means can deviate wildly when the data come from strongly cyclical series such as temperature.
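For illustration, the two behaviours can be reproduced at the R prompt with a toy vector:

```r
x <- c(1, 2, NA, 4)
m1 <- mean(x)                # NA: the default na.rm = FALSE propagates the missing value
m2 <- mean(x, na.rm = TRUE)  # 2.333...: the NA is dropped before averaging
```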

Figure 1 shows the loss of data in the monthly aggregates when na.rm=F.

Figure 2 shows the loss when na.rm=F, with almost total loss on annual aggregation. While no data are lost with the option na.rm=T, the outliers at the start and end of the Rutherglen minimum data series illustrate its unexpected biasing effect.

The figures illustrate that a heterogeneous sequence is not ‘invariant’ with respect to aggregation using a mean. The only way to ensure invariance, which confers a degree of reliability under aggregation, is for the missing data to be randomly distributed within each segment of the coarser index.
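A minimal sketch of why randomness matters, using an idealized seasonal cycle (the sinusoid and the twelve 30-day ‘months’ are assumptions for illustration, not the package’s code):

```r
# Idealized annual temperature cycle sampled daily over twelve 30-day 'months'
temp  <- 15 + 10 * sin(2 * pi * (1:360) / 360)
month <- rep(1:12, each = 30)
full  <- tapply(temp, month, mean)           # segment means with no missing data

gappy <- temp
gappy[1:15] <- NA                            # a gap concentrated in the first half of month 1
biased <- tapply(gappy, month, mean, na.rm = TRUE)

biased[1] - full[1]                          # non-zero: the non-random gap biases the month 1 mean
```

Because the gap removes the low part of the cycle rather than random days, the month-1 mean is pulled noticeably upward even though no individual reading is wrong.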

Most studies define rules about the number of allowable missing values, but these are either not clearly stated or do not guarantee invariance, such as allowing a fixed number of missing values (e.g. CAWCR).

Because heterogeneous data are not invariant under aggregation, it is best to analyze data at their original resolution.

# Heterogeneous Weather Sequences

The recorded sequences of temperature and rainfall from weather stations are often strikingly heterogeneous, with many different formats, protocols, and disruptions in the records. The Australian temperature record we see is the product of smoothing algorithms, used to produce graphic displays, that hide this structure. The problem of parameter estimation must be approached using methods that approximate the ideal homogeneous case.

While a standard format for water data exists (WDF), there do not appear to be standards for temperature data. The function sequences reads and detects two types of data file downloaded from the Australian Bureau of Meteorology: Climate Data Online (CDO) and the ACORN-SAT reference data set. You can use a wild card to identify the files you want, or name a specific one.

```r
CDO=sequences("../inst/extdata",stations="082039",maxmin=11,type="CDO",na.rm=T)
ACORN=sequences("../inst/extdata",stations="082039",maxmin="min",type="ACORN",na.rm=T)
```

The sequences function returns a ‘zoo’ series, a particularly powerful time series structure in R. Zoo series can be combined on the union or intersection of their dates with the ‘merge’ command.

Zoo series can also represent time of day with the ‘YYYY-mm-dd hh:mm:ss’ format, allowing separate maximum and minimum temperature series to be effectively combined into a single daily temperature series.
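A minimal sketch of the merge behaviour, using two made-up two-day series that share one date:

```r
library(zoo)

# Hypothetical values for illustration only
d1 <- zoo(c(21.0, 23.5), as.Date(c("2014-01-01", "2014-01-02")))
d2 <- zoo(c(10.2, 11.1), as.Date(c("2014-01-02", "2014-01-03")))

u <- merge(d1, d2)               # union of the dates; unmatched days filled with NA
i <- merge(d1, d2, all = FALSE)  # intersection of the dates only
```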

Figure 1 plots the Rutherglen minimum daily temperature from the raw CDO data and the homogenized ACORN-SAT network. The difference between the raw CDO data and the ACORN series is shown in blue; these are the adjustments to the Rutherglen minimum series.

All of the sequences that match criteria can also be loaded with a command such as the following.

```r
comp=sequences("../inst/extdata",maxmin=11,type="CDO")
compNotNAs=summary.sequences(comp)
```

The summary function returns descriptive statistics about the heterogeneity of single or multiple series, such as the dates of the first and last values and the number and proportion of NAs.

Figure 2 shows the number of active stations (those reporting non-NA values) among 176 stations in south-eastern Australia. Note the extremely uneven collection of weather data over time. Such extreme heterogeneity can easily bias any analysis that is sensitive to the number of missing values over time.

For example, a mean temperature sequence could only be calculated reliably if the missing values were distributed uniformly. If the weather stations that recorded during the latter 20th century tended to be situated inland, where the weather is hotter, this would tend to bias a simple average warm over that period. If the sequences were standardized on a common time period such as the 1960s, there would be a bias between the periods before and after the standard period.

Clearly, taking the mean, or any similar averaging operation, is fraught with danger for extremely heterogeneous sequences such as these.

# Anomalies, Breakouts and Homogenization

Anomalies are the secrets hidden in the output of temperature recorders, the prices of trades, or server loads. They represent trading opportunities, calls for resource reallocation, or, in the case of weather stations, the need for instrument re-calibration or replacement. Detecting them has been of interest to Twitter. Correcting them, known as homogenization, has been greatly criticized. What are the limits to reliable detection of anomalies in real-world data?

A sequence is a function $f: N \rightarrow R$ where $N$ is the natural numbers, usually a discrete time step, and $R$ is the real numbers. A series is the progressive sum of the terms in a sequence, $S_n = \sum_{i=1}^{n} y_i$. A typical example of a series is the ‘random walk’ produced by the cumulative sum of random values. One occasionally sees more general series used to model a given sequence, such as the progression of higher-order terms in a polynomial regression model $y_t = a_1 t + a_2 t^2 + \dots + a_n t^n$, or the periodic terms of a Fourier series. A sequence space is a closed set of sequences; subtracting or adding sequences gives another valid sequence.
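The random walk mentioned above takes one line of R: the sequence is a draw of random values, and the series is its cumulative sum.

```r
set.seed(42)        # reproducible random sequence
y <- rnorm(100)     # the sequence y_i
S <- cumsum(y)      # the series S_n = y_1 + ... + y_n, a 'random walk'
```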

Real-world sequences contain missing values or ‘gaps’, and truncated start and end dates. Such data series are called ‘heterogeneous’. Missing values are represented by the ‘NA’ value from the R language, augmenting the range of the sequence to $R \cup \{NA\}$. One of the main questions I want to address is how analysis can be done reliably in the presence of arbitrary missing values.

This series of posts is organized as follows. Section 2 describes operations on sequences and the improvement of detection limits with series instead of sequences. Section 3 introduces anomaly tests on spaces of sequences: the comparison of the target with regional neighbours, widely adopted in climatology. Section 4 uses the package to try to find anomalies in the Rutherglen daily minimum temperature data set.

Rutherglen is a small town in north-eastern Victoria, Australia, known for wine production. ACORN-SAT station number 082039 is based on the still-open Rutherglen Research station (Lat: 36.10S, Lon: 146.51E, Elevation: 175m) and has a virtually unbroken record of daily readings from the 7th of November 1912, apart from missing days in the earlier part of the record and a gap with no records between 1960 and 1965.

This R package is motivated by exploration of Rutherglen's extreme trend divergence (image above from KensKingdom) between the raw Climate Data Online (CDO) version and the Australian temperature reference network (ACORN-SAT), a ‘homogenized’ reference network published by the Australian Bureau of Meteorology (BoM) in 2012. In the process I hope we develop a useful R package for a whole range of applications.

# WhyWhere – An R package implementation for species distribution modelling

This package implements the species modelling algorithm WhyWhere, documented in the article “Improving ecological niche models by data mining large environmental datasets for surrogate models” by David R.B. Stockwell, Ecological Modelling, Volume 192, Issues 1–2, 15 February 2006, Pages 188–196 (PDF).

# Anomaly – a new R package for detecting and correcting instrument errors

This package performs operations known as “homogenization” on time series data that is subject to various non-natural anomalies such as instrument baseline changes and spikes.

Preliminary testing indicates that it has considerably higher sensitivity to small baseline shifts than a Chow test. Below is a scatterplot of detections against the size of a single shift in a series of length 10. The series and the shift both had a standard deviation of one, which is similar to an annual temperature series.

The Chow test (red dots, I(0)) does not reliably detect a change in level until around 1 degree C. However, the new method (blue dots, I(1)) reliably detects changes in level down to around 0.25 degrees C.

# Book: Niche Modeling

Using theory, applications, and examples of inferences, Niche Modeling: Predictions from Statistical Distributions demonstrates how to conduct and evaluate niche modeling projects in any area of application. It features a series of theoretical and practical exercises for developing and evaluating niche models.

Yesterday’s post noted the appearance of station summaries at the BoM adjustment page attempting to defend their adjustments to the temperature record at several stations, some of which I have also examined. Today’s post compares and contrasts their approach with mine.

### Deniliquin

The figures below compare the minimum temperatures at Deniliquin with neighbouring stations. On the left, the BoM compares Deniliquin with minimum temperatures at Kerang (95 km west of Deniliquin) in the years around 1971. The figure on the right, from my Deniliquin report, shows the relative trend of daily temperature data from 26 neighbouring stations (ie ACORN-SAT – neighbour). The rising trends mean that the ACORN-SAT site is warming faster than the neighbours.

The BoM's caption:

Deniliquin is consistently warmer than Kerang prior to 1971, with similar or cooler temperatures after 1971. This, combined with similar results when Deniliquin’s data are compared with other sites in the region, provides a very clear demonstration of the need to adjust the temperature data.

Problems: Note the cherrypicking of a single site for comparison and the handwaving about “similar results” with other sites.

In my analysis, the ACORN-SAT version warms at 0.13C/decade faster than the neighbours. As the spread of temperature trends at weather stations in Australia is about 0.1C/decade at the 95% confidence level, this puts the ACORN-SAT version outside the limit. Therefore the adjustments have made the trend of the official long term series for Deniliquin significantly warmer than the regional neighbours. I find that the residual trend of the raw data (before adjustment) for Deniliquin is -0.02C/decade which is not significant and so consistent with its neighbours.

## Rutherglen

Now look at the comparison of minimum temperatures for Rutherglen with neighbouring stations. On the left, the BoM compares Rutherglen with the adjusted data from three other ACORN-SAT stations in the region. The figure on the right, from my Rutherglen report, shows the relative trend of daily temperature in 24 neighbouring stations (ie ACORN-SAT – neighbour). As at Deniliquin, the rising trends mean that the ACORN-SAT site is warming faster than the neighbours.

The BoM's caption:

While the situation is complicated by the large amount of missing data at Rutherglen in the 1960s, it is clear that, relative to the other sites, Rutherglen’s raw minimum temperatures are very much cooler after 1974, whereas they were only slightly cooler before the 1960s.

Problems: Note the cherrypicking of only three sites, but more seriously, the versions chosen are from the adjusted ACORN-SAT. That is, the already adjusted data is used to justify an adjustment — a classic circularity! This is not stated in the other BoM reports, but probably applies to the other station comparisons. Loss of data due to aggregation to annual data is also clear.

In my analysis, the ACORN-SAT version warms at 0.14C/decade faster than the neighbours. As the spread of temperature trends at weather stations in Australia is about 0.1C/decade at the 95% confidence level, this puts the ACORN-SAT version outside the limit. Once again, the adjustments have made the trend of the official long term series for Rutherglen significantly warmer than the regional neighbours. As with Deniliquin, the residual trend of the raw data (before adjustment) is not significant and so consistent with its neighbours.

## Amberley

The raw data is not always more consistent, as Amberley shows. On the left, the BoM compares Amberley with Gatton (38 km west of Amberley) in the years around 1980. On the right, from my Amberley report, is the relative trend of daily temperature against 19 neighbouring stations (ie ACORN-SAT – neighbour). In contrast to Rutherglen and Deniliquin, the mostly flat trends mean that the ACORN-SAT site is not warming faster than the raw neighbours.

The BoM's caption:

Amberley is consistently warmer than Gatton prior to 1980 and consistently cooler after 1980. This, combined with similar results when Amberley’s data are compared with other sites in the region, provides a very clear demonstration of the need to adjust the temperature data.

Problems: Note the cherrypicking and hand waving.

In my analysis, the ACORN-SAT version warms at 0.09C/decade faster than the neighbours. As the spread of temperature trends at weather stations in Australia is about 0.1C/decade at the 95% confidence level, I class the ACORN-SAT version as borderline. The residual trend of the raw data (before adjustment) is -0.32C/decade which is very significant and so there is clearly a problem with the raw station record.

## Conclusions

More cherrypicking, circularity, and hand-waving from the BoM — excellent examples of the inadequacy of the adjusted ACORN-SAT reference network and justification for a full audit of the Bureau’s climate change division.

# BoM publishing station summaries justifying adjustments

Last night George Christensen MP gave a speech accusing the Bureau of Meteorology of “fudging figures”. He waved a 28-page document of adjustments around and called for a review. These adjustments can be found here. While I don't agree that adjusting to account for station moves can necessarily be regarded as fudging figures, I am finding issues with the ACORN-SAT data set.

The problem is that most of the adjustments are not supported by known station moves, and many may be wrong or exaggerated. It also means that if the adjustment decreases temperatures in the past, claims of current record temperatures become tenuous. A maximum daily temperature of 50C written in 1890 in black and white is higher than a temperature of 48C in 2014, regardless of any post-hoc statistical manipulation.

But I do take issue with a set of summaries being released that are blatant “cherry-picking”.

Scroll down to the bottom of the BoM adjustment page. Listed are station summaries justifying the adjustments to Amberley, Deniliquin, Mackay, Orbost, Rutherglen and Thargomindah. The overlaps with the ones I have evaluated are Deniliquin, Rutherglen and Amberley (see previous posts). While the BoM finds the adjustments to these stations justified, my quality control check finds problems with the minimum temperature at Deniliquin and Rutherglen. I think the Amberley raw data may have needed adjusting.

WRT Rutherglen, BoM defends the adjustments with Chart 3 (my emphasis):

Chart 3 shows a comparison of the raw minimum temperatures at Rutherglen with the adjusted data from three other ACORN-SAT stations in the region. While the situation is complicated by the large amount of missing data at Rutherglen in the 1960s, it is clear that, relative to the other sites, Rutherglen’s raw minimum temperatures are very much cooler after 1974, whereas they were only slightly cooler before the 1960s.

WRT Deniliquin, BoM defends the adjustments on Chart 3 (my emphasis):

Chart 3 shows a comparison of minimum temperatures at Kerang (95 km west of Deniliquin) and Deniliquin in the years around 1971. Deniliquin is consistently warmer than Kerang prior to 1971, with similar or cooler temperatures after 1971. This, combined with similar results when Deniliquin’s data are compared with other sites in the region, provides a very clear demonstration of the need to adjust the temperature data.

My analysis is superior to the BoM's flawed analysis in 3 important ways:
1. I compare the trend in Rutherglen and Deniliquin with 23 and 27 stations respectively, not 3 and 1 neighbouring stations (aka cherry-picking).
2. I also use a rigorous statistical panel test to show that the trend of the Rutherglen minimum exceeds the neighbouring group by 0.1C per decade, which is outside the 95% confidence interval for Australian station trends — not a visual assessment of a chart (aka eyeballing).
3. I use the trends of daily data and not annual aggregates, which are very sensitive to missing data.

# Kerang: Where is the daily data?

Been looking forward to doing Kerang as I knew it was another dud series from ACORN-SAT. The report is here:

The first thing to notice in plotting the time series data for the raw CDO and ACORN-SAT is that, while the ACORN-SAT data goes back to 1910, the CDO data is truncated at 1962.

The monthly data, however, goes back almost to 1900. This is inexplicable, as the monthly data is derived from the daily data! Here is proof that, contrary to some opinion pieces, not all of the data needed to check the record is available at the Bureau of Meteorology website, Climate Data Online.

The residual trends of ACORN-SAT are at the benchmark for the maximum and greatly exceed the benchmark for the minimum.

While on the subject of opinion pieces, consider this statement from “No, the Bureau of Meteorology is not fiddling its weather data”:

Anyone who thinks they have found fault with the Bureau’s methods should document them thoroughly and reproducibly in the peer-reviewed scientific literature. This allows others to test, evaluate, find errors or produce new methods.

So you think skeptics haven't tried? A couple of peer-reviewed papers of mine on quality control problems in the Bureau of Meteorology's use of models have not had a response from the Bureau in over 2 years. The sound of crickets chirping is all. Talk is cheap in climate science, I guess. Here they are:

Biases in the Australian High Quality Temperature Network

Critique of drought models in the Australian drought exceptional circumstances report (DECR)

# Scorecard for ACORN-SAT Quality Assurance – the score so far.

Three more quality tests of stations in the ACORN-SAT series have been completed:

The test measures the deviation of the trend of the series from its neighbours since 1913 (the residual trend). A deviation of plus or minus 0.05 degrees per decade is within tolerance (green), 0.05 to 0.1 is borderline (amber), and greater than 0.1 is regarded as a fail (red) and should not be used.
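The tolerance bands can be sketched as a small helper function (the function name is mine; the cut-offs are those stated above):

```r
# Classify a residual trend (degrees C per decade) against the scorecard tolerances
classify <- function(residual_trend) {
  d <- abs(residual_trend)
  if (d <= 0.05) {
    "green"   # within tolerance
  } else if (d <= 0.1) {
    "amber"   # borderline
  } else {
    "red"     # fail: should not be used
  }
}

classify(-0.02)  # "green"
classify(0.14)   # "red"
```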

| Station | Maximum CDO | Maximum ACORN-SAT | Minimum CDO | Minimum ACORN-SAT |
|---|---|---|---|---|
| Rutherglen | -0.02 | -0.03 | -0.05 | 0.14 |
| Williamtown | 0.08 | -0.02 | -0.13 | -0.00 |
| Deniliquin | -0.05 | 0.03 | -0.02 | 0.13 |
| Amberley | -0.01 | -0.01 | -0.32 | 0.09 |
| Cape Otway Lighthouse | -0.17 | 0.02 | -0.05 | 0.01 |
| Wilcannia | -0.16 | -0.06 | -0.13 | 0.05 |
| Kerang | 0.02 | 0.05 | -0.01 | 0.13 |

There are more inconsistent stations among the raw CDO data, as would be expected since it is “raw”. However, the standout problems in this small sample are in the ACORN-SAT minimums.

Results so far suggest the Bureau of Meteorology has a quality control problem with its minimum temperatures, with almost all residual trends borderline and very large deviations from the neighbours at Rutherglen and Deniliquin.