Here is a summary of the chapters in my upcoming book Niche Modeling to be published by CRC Press. Many of the topics have been introduced as posts on the blog. My deepest thanks to everyone who has commented and so helped in the refinement of ideas, and particularly in providing motivation and focus.
Writing a book is a huge task, much of it a slog, and it's not over yet. But I hope to get it to the publishers so it will be available at the end of this year. Here is the dustjacket blurb:
Through theory, applications, and examples of inferences, this book shows how to conduct and evaluate ecological niche modeling (ENM) projects in any area of application. It features a series of theoretical and practical exercises in developing and evaluating ecological niche models using a range of software supplied on an accompanying CD. These cover geographic information systems, multivariate modeling, artificial intelligence methods, data handling, and information infrastructure. The author then features applications of predictive modeling methods with reference to valid inference from assumptions. This is a seminal reference for ecologists as well as a superb hands-on text for students.
Part 1: Informatics
Functions: This chapter summarizes the major types, operations and relationships encountered in the book and in niche modeling. This and the following two chapters could be treated as a tutorial in R. For example, the main functions for representing the inverted ‘U’ shape characteristic of a niche (step, Gaussian, quadratic and ramp functions) are illustrated in both graphical form and R code. The chapter concludes with the autocorrelation function (ACF) and lag plots, in one or two dimensions.
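As a rough illustration of the four response shapes named above, here is a minimal sketch. It is written in Python rather than the book's R and is not taken from the book. The function names, the parameterization, and the reading of "ramp" as a triangular rise and fall are all assumptions made for this example.

```python
import math

def step(x, lo, hi):
    """Step response: 1 inside the interval [lo, hi], 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0

def gaussian(x, mu, sigma):
    """Bell-shaped response scaled to peak at 1 when x equals mu."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def quadratic(x, mu, width):
    """Inverted parabola peaking at 1 at mu, clipped to 0 beyond mu +/- width."""
    return max(0.0, 1.0 - ((x - mu) / width) ** 2)

def ramp(x, lo, peak, hi):
    """Assumed triangular ramp: linear rise from lo to peak, linear fall to hi."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)

# Evaluate each response along a hypothetical temperature gradient.
for temperature in (5, 15, 20, 25, 35):
    print(temperature,
          step(temperature, 10, 30),
          round(gaussian(temperature, 20, 5), 3),
          round(quadratic(temperature, 20, 10), 3),
          ramp(temperature, 10, 20, 30))
```

All four return a suitability between 0 and 1 with a single optimum, which is what makes them interchangeable stand-ins for the inverted ‘U’.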
Data: This chapter demonstrates how to manage simple biodiversity databases using R. By using data frames as tables, it is possible to replicate the basic spreadsheet and relational database operations with R’s powerful indexing functions.
While a database is necessary for large-scale data management, R can eliminate conversion problems as data is moved between systems.
R and image processing operations can perform many of the elementary spatial operations necessary for niche modeling. While these do not replace a GIS, the chapter demonstrates that generalizing arithmetic concepts to images can implement simple spatial operations efficiently.
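To make the idea of arithmetic on images concrete, here is a small sketch in Python (the book itself uses R). The grids, the `overlay` helper, and the layer names are hypothetical and exist only for illustration. Each grid is a binary mask marking the cells where one environmental condition is suitable.

```python
def overlay(a, b, op):
    """Apply a two-argument operation cell by cell to two equally sized grids."""
    return [[op(x, y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

# Hypothetical 3x3 masks: 1 where the condition is suitable, 0 elsewhere.
temperature_ok = [[1, 1, 0],
                  [1, 1, 0],
                  [0, 0, 0]]
rainfall_ok = [[0, 1, 1],
               [0, 1, 1],
               [0, 1, 1]]

# Multiplication acts as a spatial intersection (both conditions hold).
both = overlay(temperature_ok, rainfall_ok, lambda x, y: x * y)

# Taking the maximum acts as a spatial union (either condition holds).
either = overlay(temperature_ok, rainfall_ok, max)

# Summing the cells gives the area of the intersection in cell units.
suitable_cells = sum(map(sum, both))
print(both, either, suitable_cells)
```

The same pattern extends to continuous layers, for example multiplying two probability surfaces, without any GIS-specific machinery.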
Part 2: Modeling
Theory: Set theory helps to identify the basic assumptions underlying niche modeling, and the relationships and constraints between these assumptions. The chapter shows that the standard definition of the niche as an environmental envelope is equivalent to a box topology. It is proven that, when extended to infinite dimensions of environmental variables, this definition loses the property of continuity between environmental and geographic spaces. Using the product topology for niches would retain this property.
Data: Management of data for niche modeling is poorly served by user-developed files stored in a local directory. A wide variety of data sets are currently available, and better quality niche modeling will result from using data in true archives, shared by many studies and trusted to the highest level of quality. A number of sources of data are described and access issues discussed.
Examples: The three examples of niche models here were selected to contradict three main misconceptions of niche modeling. The house price increase example shows that a niche is not limited to an ‘inverted U’ shape, and here is a bimodal distribution. The second example, of the brown treesnake, shows an asymptotic distribution. The third example, of the zebra mussel, shows how dynamic models of the spread of invasive species can be developed from probability distributions, contrary to the view that niche models are restricted to equilibrium approaches.
Part 3: Errors
Bias: Here we develop a simple theoretical formulation of the range-shift model widely used to predict the effect of climate change on species. We show how to use it to estimate the magnitude of bias in range area estimates under climate change scenarios.
Autocorrelation: This chapter shows the problem of validating models on autocorrelated data using internal or external validation. Holding back data at random is inadequate to determine the skill of a model when the data are autocorrelated, particularly when using smoothed data.
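A first diagnostic before choosing a validation scheme is the sample autocorrelation function of the series: if neighboring values are strongly correlated, points held back at random are not independent of the points used for fitting. The sketch below is in Python rather than the book's R. The `acf` helper and the choice of the standard biased estimator are assumptions for illustration, not the book's code.

```python
def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator, divides by n)."""
    n = len(series)
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    variance = sum(d * d for d in deviations) / n
    return [
        sum(deviations[t] * deviations[t + lag] for t in range(n - lag)) / n / variance
        for lag in range(max_lag + 1)
    ]

# A short, steadily rising series is positively correlated with its neighbor.
trend = [1, 2, 3, 4, 5]
print(acf(trend, 2))
```

Lag 0 is always 1 by construction. Values well above zero at lag 1 and beyond suggest that holding back contiguous blocks, rather than random points, gives a fairer estimate of skill.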
Nonlinearity: Procedures with linear assumptions are not reliable when the responses are non-linear. Here, when a linear model is used to reconstruct past temperatures, non-linear tree responses cause artifacts including signal degradation, loss of variance, temporal shifts in peaks, and
Long Term Persistence: The world is more uncertain and more indeterministic than modeled using classical statistics. Here we show evidence that temporal and spatial natural series display long term persistence (LTP), or scale-invariant distributions. These results provide no justification for models with a preferred spatial or temporal scale, which greatly underestimate confidence limits.
A major source of error is circular reasoning, in which conclusions are already encoded into the assumptions of the methodology, so allowing no other conclusion than the one obtained. Using a Monte Carlo approach, by generating a set of random variables with the same noise and autocorrelation properties as the variables of interest and developing a null model based on those variables, one can obtain both benchmarks for rejection regions and expectations arising from hidden model assumptions.
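The Monte Carlo benchmark described above can be sketched as follows. This is in Python rather than the book's R and is not the book's procedure. It assumes the noise can be approximated by a first-order autoregressive, AR(1), process, and the names `ar1`, `correlation`, and `null_threshold` are hypothetical. The null distribution is the absolute correlation between a target series and random series that have no real relationship to it.

```python
import random

def ar1(n, phi, rng):
    """Random AR(1) series of length n with lag-1 coefficient phi (assumed noise model)."""
    series = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, 1.0))
    return series

def correlation(a, b):
    """Pearson correlation between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def null_threshold(target, phi, n_sims=500, level=0.95, seed=1):
    """Empirical quantile of |r| between the target and unrelated AR(1) series."""
    rng = random.Random(seed)
    scores = sorted(
        abs(correlation(target, ar1(len(target), phi, rng)))
        for _ in range(n_sims)
    )
    return scores[min(n_sims - 1, int(level * n_sims))]

# Hypothetical target series that is itself strongly autocorrelated.
target = ar1(100, 0.9, random.Random(0))

# Rejection thresholds under an autocorrelated null and a white-noise null.
threshold_ar = null_threshold(target, phi=0.9)
threshold_white = null_threshold(target, phi=0.0)
print(threshold_ar, threshold_white)
```

Comparing the two thresholds shows why the null model matters: a correlation that clears the white-noise benchmark can still fall inside the range produced by unrelated autocorrelated series.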
Fraud: The accidental or fraudulent management of results can be detected using methods similar to the distributional methods of niche modeling. The second digit distribution postulated by Benford’s Law allows detection of fabricated data in natural time series drawn from a single distribution. The approach is applied to a range of natural data.
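For reference, the second digit distribution under Benford’s Law gives digit d (0 to 9) the probability

P(d) = sum over k = 1..9 of log10(1 + 1 / (10k + d)).

The sketch below, in Python rather than the book's R, computes that distribution and a chi-square statistic for a sample. The helper names and the use of a chi-square statistic are assumptions for illustration and may differ from the test used in the book.

```python
import math

def benford_second_digit(d):
    """Benford probability that the second significant digit equals d."""
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))

def second_digit(value):
    """Second significant digit of a non-zero number."""
    # Scientific notation looks like '1.234000000000000e+02'; index 2 is the second digit.
    return int(f"{abs(value):.15e}"[2])

def second_digit_chi2(values):
    """Chi-square statistic of observed second digits against Benford's Law."""
    counts = [0] * 10
    for value in values:
        if value != 0:  # zero has no significant digits
            counts[second_digit(value)] += 1
    n = sum(counts)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    return sum(
        (counts[d] - n * benford_second_digit(d)) ** 2 / (n * benford_second_digit(d))
        for d in range(10)
    )

# A crudely fabricated sample in which every second digit is 5.
fabricated = [15.0] * 100
print(second_digit_chi2(fabricated))
```

With nine degrees of freedom, a large statistic means the observed second digits depart from the Benford expectation. How large is large enough depends on the significance level chosen.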