# Assumptions for linear regression

One of the main assumptions of linear regression is, ahem, linearity. Here is an example drawn from dendroclimatology, the reconstruction of past climates using tree rings, of the trouble one can get into by blindly assuming linearity. This subject was dealt with some time ago at ClimateAudit in the post "Upside-Down Quadratic Proxy Response".

From the summary of chapter 9 of my book (niche-modeling-chap-9):

9.2 Summary
These results demonstrate that procedures with linear assumptions are unreliable when applied to the non-linear responses of niche models. Reliability of reconstruction of past climates depends, at minimum, on the correct specification of a model of response that holds over the whole range of the proxy, not just the calibration period. Use of a linear model of non-linear response can cause apparent growth decline with higher temperatures, signal degradation with latitudinal variation, temporal shifts in peaks, period doubling, and depressed long time-scale amplitude.

Niche Modeling: Predictions from Statistical Distributions. Chapman & Hall/CRC, Boca Raton, FL., 2007.

I notice Craig Loehle converged on similar results in a post about a publication on the Divergence Problem. In the abstract, Craig reports a similar quantitative depression in the range of the recovered signal.

If trees show a nonlinear growth response, the result is to potentially truncate any historical temperatures higher than those in the calibration period, as well as to reduce the mean and range of reconstructed values compared to actual.
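The truncation is easy to reproduce with a toy model. The sketch below is an illustration only, not code from the chapter or from Craig's paper: it assumes a sinusoidal "true" temperature history, an inverted-parabola growth response with a hypothetical optimum `T_OPT` and width `WIDTH`, and a calibration period (the last 100 years) that lies entirely on the cool side of the optimum.

```python
import numpy as np

# Assumed "true" temperature history: one slow cycle over 1000 years.
years = np.arange(1000)
temp = np.sin(2 * np.pi * years / 1000)

# Hypothetical niche-like response: growth peaks at T_OPT and falls off either side.
T_OPT, WIDTH = 0.5, 1.5
growth = 1.0 - ((temp - T_OPT) / WIDTH) ** 2

# Calibrate a linear model on the last 100 years, where temperature stays
# below the optimum and growth therefore rises monotonically with temperature.
calib = slice(900, 1000)
slope, intercept = np.polyfit(growth[calib], temp[calib], 1)

# Apply the linear calibration to the whole growth record.
recon = intercept + slope * growth

print("true  max, range:", temp.max(), temp.max() - temp.min())
print("recon max, range:", recon.max(), recon.max() - recon.min())
```

Because growth can never exceed its value at the optimum, the linear reconstruction cannot report temperatures much above those seen in calibration, however warm the past actually was.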

By far the most interesting result I find is the introduction of ‘doubling’ from assuming a linear response of a non-linear variable. This is illustrated by Craig’s figure here:

Because over the course of one climate cycle, the tree passes through two optimal growth periods, the tree is, in electrical terms, a frequency doubler. This would create enormous difficulties in trying to detect major features such as Medieval Warm Periods and Little Ice Ages from such a responder.
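The doubling can be checked numerically. This is a minimal sketch under stated assumptions: a pure sinusoidal temperature with four cycles in the record, and a quadratic response whose optimum sits exactly at the mean temperature. The helper `dominant_cycles` is a name introduced here for illustration.

```python
import numpy as np

n_years, n_cycles = 1000, 4
t = np.arange(n_years)

# Assumed climate: a sinusoid completing n_cycles over the record.
temp = np.sin(2 * np.pi * n_cycles * t / n_years)

# Assumed response: growth is maximal at the mean temperature (temp = 0).
growth = 1.0 - temp ** 2

def dominant_cycles(series):
    """Number of cycles per record of the strongest Fourier component."""
    spectrum = np.abs(np.fft.rfft(series - series.mean()))
    return int(np.argmax(spectrum))

temp_cycles = dominant_cycles(temp)
growth_cycles = dominant_cycles(growth)
print(temp_cycles, growth_cycles)  # prints: 4 8
```

With this particular response the result is exact, since 1 - sin²(x) = (1 + cos(2x)) / 2: the growth record oscillates at twice the frequency of the climate that drives it.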

But the problems do not end there. According to the latitudinal (or altitudinal) location of the tree, relative to its optimal growth zone, the location of the doubled peaks is shifted temporally. This shifting of the peaks is illustrated in the figure below, taken from my chapter.
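A rough way to see the shift, again with a toy model rather than the one in the chapter: hold the climate cycle fixed and move the tree's optimum `t_opt` relative to the mean temperature, standing in for a change of latitude or altitude. The function name `growth_peak_times`, the 1200-year period, and the quadratic response are all assumptions made for this sketch.

```python
import numpy as np

def growth_peak_times(t_opt, period=1200):
    """Years at which growth peaks during one climate cycle.

    Temperature is assumed to follow -cos, so it starts at its minimum,
    peaks mid-cycle, and returns to the minimum. Growth is an inverted
    parabola centred on the hypothetical optimum t_opt.
    """
    t = np.arange(period)
    temp = -np.cos(2 * np.pi * t / period)
    growth = -(temp - t_opt) ** 2
    interior = growth[1:-1]
    # A peak is an interior sample strictly higher than both neighbours.
    is_peak = (interior > growth[:-2]) & (interior > growth[2:])
    return (np.flatnonzero(is_peak) + 1).tolist()

for t_opt in (-0.5, 0.0, 0.5):
    print(t_opt, growth_peak_times(t_opt))
```

A tree whose optimum equals the mean temperature peaks at years 300 and 900; with the optimum at 0.5 the peaks move to years 400 and 800. The same climate cycle therefore produces growth peaks at different dates depending on where the tree sits relative to its optimum.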

If one then imposes two non-linear responses, such as temperature and rainfall, the response becomes even more choppy, as shown in another graphic from the chapter.
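To get a feel for why the combined response is choppy, here is a sketch that multiplies two quadratic responses together. The multiplicative form, the cycle counts (four temperature cycles and three rainfall cycles per record), and the helper `active_cycles` are assumptions for illustration and are not taken from the chapter.

```python
import numpy as np

n_years = 1200
t = np.arange(n_years)

# Assumed drivers: temperature with 4 cycles, rainfall with 3 cycles per record.
temp = np.sin(2 * np.pi * 4 * t / n_years)
rain = np.sin(2 * np.pi * 3 * t / n_years)

# Assumed combined response: product of two niche-like quadratic responses.
growth = (1.0 - temp ** 2) * (1.0 - rain ** 2)

def active_cycles(series, tol=1e-6):
    """Cycle counts whose Fourier amplitude exceeds a small tolerance."""
    amplitude = np.abs(np.fft.rfft(series - series.mean())) / len(series)
    return np.flatnonzero(amplitude > tol).tolist()

print(active_cycles(temp))    # prints: [4]
print(active_cycles(rain))    # prints: [3]
print(active_cycles(growth))  # prints: [2, 6, 8, 14]
```

Neither of the original frequencies survives in the growth record. What appears instead are the doubled frequencies (6 and 8) together with their sum and difference (14 and 2), which is one way of quantifying the choppiness.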

The recovery of a climate signal in the face of nonlinearity of response is fraught with difficulties. When the fundamental growth response of trees, and of all living things actually, is known to be a non-linear niche-like response, there is more onus on modelers to prove their methods are adequate.

Practitioners are not unaware of the problem, yet in climate (and ecological) science risky statistical prediction methods are most often used with inadequate validation or, as in the drought modeling efforts by CSIRO here, results from GCMs are used with no attempt to demonstrate that they are ‘fit-for-purpose’ at all. Rob Wilson argues at ClimateAudit that while linear modelling of tree-growth relationships is not ideal, the field is ripe for some fancy non-linear modelling. Given the range of exotic features introduced by non-linearities, as I showed above, I would argue that fancy non-linear modeling would probably lead more surely to self-deception, and that a better path is robust validation.

## Thoughts on “Assumptions for linear regression”

1. David,

My post on Dr. Loehle’s non-linearity was less thorough than yours. But I took another step and included a Mannian CPS correlation sort/calibration on a known signal and a set of tree-ring-matched ARMA proxies.

Toward the bottom of the page I look at inserting a known signal in 10000 ARMA proxies and using CPS to extract it.

You may already know this, so I apologize, but what I have found is that the scaling function, applied individually for calibration of proxies, creates a demagnification of the historic signal in relation to the calibration range. Simple averaging recovers the artificial signal perfectly in this case, but scaling individually for calibration creates a distortion of the temperature scale which is quantifiable.

I did this post some time ago, but it shows the compression of historic signals caused by rescaling in CPS.
http://noconsensus.wordpress.com/2008/10/08/id-goes-mythbuster-on-hockey-sticks-cps/

I have seen a number of hockey stick recreations from random data, but I am not aware of any which insert a known signal and go looking for it. My conclusion was that CPS cannot recover an undistorted signal.

3. David L. Hagen says:

You might be interested in:
A. Lahellec, S. Hallegatte, J. -Y. Grandpeix, P. Dumas & S. Blanco, Feedback Characteristics of nonlinear dynamical systems, Europhysics Letters 81 (2008) 60001, March 2008, p1-6.

“We propose a method to extend the concept of feedback gain to nonlinear models. The method is designed to dynamically characterise a feedback mechanism along the system natural trajectory. The numerical efficiency of the method is proved using the Lorenz (1963) classical model. Finally, a simple climate model of water vapor feedback shows how nonlinearity impacts feedback intensity along the seasonal cycle.”
