There are a number of ways to answer this question.

There is a rich diversity of methods for predicting species’ distributions, and they could be listed and described.

Alternatively, the biological relationships between

species and the environment could be emphasized, and approaches from

population dynamics used as a starting point.

A more general approach to niche modeling can be based on the statistical idea of the probability distribution.

**Definition:** A niche model is a probability distribution defined on environmental

variables.

**Definition:** A probability distribution *f(E)* is an assignment of a probability

to every interval on a set of environmental variables *E*.
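
As a minimal sketch of this definition, the following assigns an empirical probability to each interval of a single environmental variable. The temperature values, the bin width and the function name `empirical_niche` are hypothetical choices for illustration only.

```python
from collections import Counter

def empirical_niche(values, bin_width):
    """Assign a probability to each interval (bin) of one environmental variable."""
    counts = Counter(int(v // bin_width) for v in values)
    total = len(values)
    return {
        (b * bin_width, (b + 1) * bin_width): n / total
        for b, n in sorted(counts.items())
    }

# Hypothetical mean annual temperatures (deg C) at sites where a species was recorded.
temps = [12.1, 13.4, 14.8, 15.2, 15.9, 16.3, 17.7, 18.0, 21.5, 22.2]
niche = empirical_niche(temps, bin_width=5.0)

for interval, probability in niche.items():
    print(interval, probability)
```

The probabilities over all intervals sum to one, which is what makes the result a distribution rather than a simple count of records.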

This definition of the niche as a probability distribution

over sets of environmental variables allows for

developing niche models in new ways over

new entities.

Based on this definition, the ‘entity’ being modeled is probabilistic, not an actual physical quantity such as the population density of animals or a group of plants.

Thus, the object of the modeling

is similar to a quantum entity — in the realm of possibility

rather than actuality. Niche models often describe

fairly vague concepts, such as habitat suitability.

Nevertheless, they are useful if one is careful not to carry the metaphor too far, partly because the fundamental constraints that govern microscopic physical systems, such as conservation of energy, do not hold.

Niche models are sometimes called equilibrium models,

as generally the niche represents a stable relationship

of a species to its environment.

Stability in this sense refers to the overall stability of a population despite non-equilibrium disturbances such as annual cycles and episodic threats. For example, the processes that lead to expansion of the range of the species balance the processes that lead to contraction, resulting in an equilibrium.

But equilibrium conditions are not a necessary assumption

to develop these models.

Any form of reasonably ‘stable’ probability distribution can be used

to make a niche model.

For example, while migrating species move in relation to their

environment, it has been shown that many are ‘niche followers’

by remaining in a fairly constant climate as the seasons change.

Invasive species are another example: they are not at ‘equilibrium’, but generally spread only to environmental niches similar to those they occupy in their native range.

## Types of niche model

### Hutchinsonian

The quantitative basis of niche modeling really lies in the Hutchinsonian definition of a niche. This was described as a region in an *n*-dimensional ‘hyperspace’ of environmental variables where a critter lives. Developing a model using the ideas of Hutchinson simply requires defining the range of the species along the axes of the set of environmental variables.

This approach was used in BIOCLIM, one of the first niche modeling tools, which was first applied in an early study of the distribution of snakes in Australia by Henry Nix.
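
The idea of defining a range along each axis can be sketched in a few lines. This is an illustration of the envelope concept only, not the BIOCLIM implementation; the variable names `temp` and `rain`, the records and the function names are hypothetical.

```python
def fit_envelope(records):
    """Return {variable: (min, max)} over a list of occurrence records."""
    variables = records[0].keys()
    return {
        v: (min(r[v] for r in records), max(r[v] for r in records))
        for v in variables
    }

def in_envelope(envelope, site):
    """A site is predicted suitable only if every variable is within its range."""
    return all(lo <= site[v] <= hi for v, (lo, hi) in envelope.items())

# Hypothetical occurrence records: temperature (deg C) and annual rainfall (mm).
occurrences = [
    {"temp": 14.0, "rain": 600.0},
    {"temp": 18.5, "rain": 850.0},
    {"temp": 16.2, "rain": 720.0},
]
envelope = fit_envelope(occurrences)

inside = in_envelope(envelope, {"temp": 15.0, "rain": 700.0})
too_dry = in_envelope(envelope, {"temp": 15.0, "rain": 300.0})
print(envelope, inside, too_dry)
```

Note that the prediction is a yes or no answer: every site inside the box is treated alike.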

### Environmental envelope

This approach is intuitively convincing and captures the sense of a niche

as understood by ecologists: that the occurrence of species should be limited by

a range of environmental factors, and that an envelope around

those ranges would have predictive utility.

However, the approach runs into some practical problems.

Firstly, there is no way to exclude irrelevant variables from being included in the model.

The range of irrelevant variables is determined more by chance than by any causal relationship. Yet those irrelevant variables still act as constraints, excluding any site that falls outside their observed range.

Secondly, in environmental envelopes the limits of the species are defined

largely by the tails of the

probability distribution. The tails of a probability distribution

usually have the least probability, the

least numbers of samples, and hence the greatest uncertainty.

One way of reducing the variability of the range limits is to use percentile limits, such as the 95th percentile, instead of the observed extremes. But this approach produces a progressive reduction in ecological area with each variable: as the number of variables increases, the potential area is reduced. To overcome this problem, models have been proposed based on the mean and standard deviation.
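
A rough illustration of this shrinkage, under the simplifying assumption (not made in the text) that the variables are independent and that each percentile envelope keeps 95% of the records: the fraction retained by the combined envelope is then 0.95 raised to the number of variables.

```python
def retained_fraction(coverage, n_variables):
    """Fraction of records kept when each of n independent variables keeps `coverage`."""
    return coverage ** n_variables

for n in (1, 5, 10, 20):
    print(n, round(retained_fraction(0.95, n), 3))
```

With ten variables only about 60% of the records remain inside the envelope, and with twenty only about 36%. Correlated variables would shrink more slowly, so this is best read as a worst-case sketch.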

Thirdly, the environmental envelopes defined by the limits of each variable in turn form squares or box-like shapes.

Alternative geometries have been proposed to correct these problems, by

allowing more flexible geometric shapes to

describe the distribution.

### Generalized linear models

While the above approaches to correcting the deficiencies of environmental envelopes led to some improvements, an essential component was still missing, and this concerned more statistically minded researchers: environmental envelopes do not explicitly estimate probability.

That is, while they define a region in space, the variation in probability

within that region is undefined.

Logistic regression was used to place niche modeling on a firmer statistical footing. Logistic regression is a form of generalized linear model in which the dependent variable, the variable to be estimated from the environmental variables, is specifically a probability and not abundance or some other quantity. Logistic regression models are a well-studied and well-understood statistical methodology.
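
For readers who want to see the form of the model, here is a minimal sketch of the standard logistic function applied to environmental variables. The coefficients are hypothetical and chosen by hand; in practice they would be estimated from presence and absence data.

```python
import math

def presence_probability(intercept, coefficients, env):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[name] * env[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical, hand-picked coefficients (not fitted to any data).
coefficients = {"temp": 0.4, "rain": 0.002}
p = presence_probability(-8.0, coefficients, {"temp": 16.0, "rain": 800.0})
print(round(p, 3))
```

Unlike an envelope, the output varies smoothly between 0 and 1 across the environmental space.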

While the introduction of statistical rigor is to be preferred over

ad hoc approaches, there were still problems identified.

One of the first was called ‘naughty naughts’.

The ‘naughty naughts’ referred to the great many areas

with essentially zero probability, such as oceans for a terrestrial species.

Most probability distributions are continuous, with nonzero (though very small) probability over the whole range. The need to eliminate the naughts led to the use of truncated and other more complex probability distributions in an attempt to fit the expected shape of the probability distribution.

Actually, the problem of finding the best shape for a distribution of a species

is not trivial and cannot be taken for granted.

Species distributions are not necessarily ‘normal’ and there can be

good ecological reasons for highly unusual distributions.

They can be skewed, bimodal, exponential or sigmoidal. They can also have long tails. Both justifying the shape of the distribution and modeling with the range of possible distributions involve difficult statistical tests under classical statistical approaches.

Secondly, the treatment of categorical variables such as vegetation types, ecological regions, and so on, is problematic. Logistic regression usually handles categorical variables by treating each category as a binary variable. For example, if a variable has 100 categories, then this would produce 100 new variables. With more categories and more variables, the number of variables that would need to be introduced becomes enormous. Logistic regression is therefore a method of choice mainly for well-behaved distributions of largely continuous variables.
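
The expansion of a categorical variable into binary variables can be sketched as follows. The vegetation classes and the helper name `one_hot` are hypothetical.

```python
def one_hot(values):
    """Expand one categorical variable into one binary column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical vegetation type recorded at four sites.
vegetation = ["forest", "grassland", "forest", "wetland"]
categories, rows = one_hot(vegetation)

print(categories)
for row in rows:
    print(row)
```

Three categories already produce three columns, so a map with 100 vegetation classes produces 100 columns from a single original variable.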

### Machine learning methods

Due in part to the problems posed by categorical variables,

and essentially arbitrary distributions, machine

learning was seen as potentially applicable to niche modeling.

Machine learning methods have been used in a variety of problems where there were no exact analytical solutions. The popular early methods, decision trees and neural nets, were tried and found useful. The GARP approach was an attempt to meld the three traditional approaches in a genetic algorithm that evolved a set of solutions consisting of environmental envelopes, logistic regression and categorical rules.

The idea of a genetic algorithm is to generate a set of rules for each type of relationship

and then iteratively test and refine them until a stable solution is achieved, letting the best rules win.

This approach was intended to capture complex heterogeneous types of

relationships of species to the environment.
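
The following is a deliberately small sketch of the test-and-refine loop of a genetic algorithm, not the GARP implementation. It assumes a single rule type (a range on one variable), a fixed number of generations instead of a convergence test, and hypothetical presence and absence data.

```python
import random

def accuracy(rule, samples):
    """Fraction of samples where 'lo <= x <= hi' matches the observed presence."""
    lo, hi = rule
    return sum((lo <= x <= hi) == present for x, present in samples) / len(samples)

def evolve(samples, generations=40, pop_size=20, seed=0):
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    xs = [x for x, _ in samples]
    x_min, x_max = min(xs), max(xs)

    def random_rule():
        return tuple(sorted((rng.uniform(x_min, x_max), rng.uniform(x_min, x_max))))

    def crossover(a, b):
        # Child takes its lower limit from one parent and its upper limit from the other.
        return tuple(sorted((a[0], b[1])))

    def mutate(rule):
        # Nudge both limits by a small random amount.
        return tuple(sorted(limit + rng.gauss(0.0, 1.0) for limit in rule))

    population = [random_rule() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda r: accuracy(r, samples), reverse=True)
        history.append(accuracy(population[0], samples))
        survivors = population[: pop_size // 2]  # the best rules win
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors)))
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    best = max(population, key=lambda r: accuracy(r, samples))
    return best, history

# Hypothetical data: the species is present where temperature is between 15 and 20.
samples = [(t, 15 <= t <= 20) for t in range(5, 31)]
best_rule, history = evolve(samples)
print(best_rule, accuracy(best_rule, samples))
```

Because the best half of the population is always carried forward, the best accuracy recorded in `history` never decreases from one generation to the next.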

Although machine learning methods met the requirement of estimating probability, all these methods were problematic to interpret in terms of a Hutchinsonian niche, though people did try.

Another drawback was that some required multiple runs and

were computationally intensive.

Nevertheless, the development of these machine learning methods has progressed, and many give very good results that clearly exceed the classical approaches. However, the difficulty in interpreting the results in terms of ecological theory remains, as do potential limitations with large sets of variables.

### Data mining

Data mining is the automated search for patterns in large amounts of data.

A couple of aspects of niche modeling make data mining potentially useful.

Firstly, as often little is known about the factors determining species’

distributions, we don’t know what factors will be most accurate at predicting the species.

Because of this uncertainty, we can’t always apply annual averages of temperature and rainfall and

expect to get a good model.

For example, species in freshwater and marine environments are not well modeled by annual climate factors, and as the popularity of niche modeling grows, more entities in exotic environments will be of interest.

Data mining makes it possible to test a large number of datasets as

potential candidates for models.

Secondly, there is a lot more data available now than there was, a factor described in a following chapter.

The philosophy behind a data mining approach to niche modeling allows for minimal assumptions to be made about the type of variables and the form of the probability distribution that can potentially be used in a model. Also, an approach that allows virtually any variable to be used opens analysis up to modeling potentially anything.

To a large extent the only difference between niche modeling

and data mining is the number of independent variables. Approaches whereby models

are developed with all variables simultaneously cannot usually be used with large

data sets due to memory limitations. Generally a sequential approach

to including variables in the model is needed. Data mining also needs

to robustly discover information about a range of types of data with

arbitrary statistical distributions.

One of the most popular approaches used in data mining is the

induction of decision trees.

Mentioned before under machine learning methods,

decision tree methods have continued to be improved

with the use of more complex algorithms for

improving robustness.

Another approach to data mining is clustering.

Often used as an exploratory method of data analysis,

methods such as *k-means* quantize variables into

a discrete number of groups, and characterize the

points in the groups by representative features, such

as the group centroids.
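
A minimal sketch of k-means (Lloyd's algorithm) shows the two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its group. The site values and starting centroids are hypothetical, and real applications would normally rescale the variables first.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with squared Euclidean distance."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; keep it in place if the group is empty.
        centroids = [
            tuple(sum(column) / len(group) for column in zip(*group)) if group else centroids[i]
            for i, group in enumerate(groups)
        ]
    return centroids, groups

# Hypothetical sites described by (temperature in deg C, annual rainfall in mm).
sites = [(10.0, 500.0), (11.0, 520.0), (25.0, 1500.0), (26.0, 1480.0)]
centroids, groups = kmeans(sites, centroids=[(10.0, 500.0), (26.0, 1480.0)])
print(centroids)
```

Each site is then represented by the centroid of its group, which is the quantization described above.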

The statistics of k-means and decision tree methods have been well researched, and in comparison to more heuristic methods are well understood. They have been used successfully in a variety of fields.

A clustering approach is used in the WhyWhere algorithm.

Here an image processing method is used to derive the groups from up to three environmental variables at once, characterized as the list of reduced colors. Efficient approximate implementations of k-means are used for the color reduction. The method used, Heckbert’s median cut, is commonly used to reduce the color palette of GIF and other indexed-color images, and gives good results for images.
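
To make the color reduction concrete, here is a simplified sketch of median cut, assuming three environmental variables have been scaled to integer values and stored as the three channels of each pixel. It is an illustration of the general technique, not the WhyWhere code, and the function names are hypothetical.

```python
def channel_range(box, c):
    """Spread of channel c across the pixels in a box."""
    values = [p[c] for p in box]
    return max(values) - min(values)

def median_cut(pixels, n_colors):
    """Repeatedly split the box with the widest channel at its median pixel."""
    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        splittable = [b for b in boxes if len(b) > 1]
        if not splittable:
            break
        box = max(splittable, key=lambda b: max(channel_range(b, c) for c in range(3)))
        channel = max(range(3), key=lambda c: channel_range(box, c))
        ordered = sorted(box, key=lambda p: p[channel])
        mid = len(ordered) // 2
        boxes = [b for b in boxes if b is not box] + [ordered[:mid], ordered[mid:]]
    # Represent each box by the (integer) mean of its pixels.
    return [tuple(sum(p[c] for p in b) // len(b) for c in range(3)) for b in boxes]

# Hypothetical pixels: three scaled environmental variables per grid cell.
pixels = [(0, 0, 0), (10, 0, 0), (200, 0, 0), (210, 0, 0)]
palette = median_cut(pixels, n_colors=2)
print(palette)
```

Each entry in the resulting palette plays the role of one environmental category.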

In clustering approaches, probabilities for prediction at a specific point are derived from a single probability at each cluster. These can simply be the probability of the cluster the point belongs to, or a weighted sum of probabilities at a number of clusters.

In WhyWhere the probabilities of presence or absence are calculated from

the proportion of occurrences of the points in a group

relative to the proportion of environmental values in that category.
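
One possible reading of this calculation is sketched below: for each category, divide the share of occurrence points falling in it by the share of all grid cells falling in it. This is an assumption about the intent rather than the WhyWhere code, and the resulting ratio is not itself a probability. The category labels and counts are hypothetical.

```python
from collections import Counter

def category_ratios(occurrence_categories, background_categories):
    """Share of occurrences in each category divided by that category's share of all cells."""
    occurrences = Counter(occurrence_categories)
    background = Counter(background_categories)
    n_occ = len(occurrence_categories)
    n_bg = len(background_categories)
    return {
        c: (occurrences[c] / n_occ) / (background[c] / n_bg)
        for c in background
    }

# Hypothetical landscape: two equally common categories, with most records in "a".
background = ["a"] * 50 + ["b"] * 50
records = ["a"] * 8 + ["b"] * 2
ratios = category_ratios(records, background)
print(ratios)
```

A ratio above one suggests the species occurs in that category more often than its availability alone would predict, and a ratio below one suggests the opposite.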

## Summary

Here we presented a brief overview of the background to niche modeling methods,

the strengths and limitations of various approaches and the status of contemporary

trends. The next chapter discusses the other most important component of

ecological niche modeling — the environmental data.

