# Three Variable Bayes Net for Species Prediction

The post Bayesian Networks introduced this useful and flexible form of modeling. Here is an example of a Bayesian belief network (BBN) model of a simple three-variable species prediction system.

In Figure 1 the top node is habitat quality for the species. The two lower nodes, average temperature (Av_Temp) and vegetation type (Veg_Type) at the site, are simple environmental variables related to the presence of the species. For Av_Temp, the probability of the species being present follows a bell-shaped curve with a peak at 15 °C. Veg_Type takes the values 0 to 4, one per vegetation type; the species may be present only in types 0 and 1.
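As a rough illustration of how such node tables might be filled in, the sketch below discretizes a bell-shaped temperature response and a vegetation mask. Only the 15 °C peak and the vegetation types 0 and 1 come from the text; the bin edges, the curve width `WIDTH_C`, and the product rule in `p_favorable` are assumptions.

```python
import math

# Hypothetical 5-degree temperature bins; the bin edges are an assumption.
TEMP_BINS = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
PEAK_C = 15.0   # peak of the bell-shaped curve, from the text
WIDTH_C = 5.0   # assumed spread of the curve, not given in the text


def temp_suitability(lo, hi):
    """Bell-shaped suitability evaluated at the midpoint of a temperature bin."""
    mid = (lo + hi) / 2.0
    return math.exp(-0.5 * ((mid - PEAK_C) / WIDTH_C) ** 2)


def veg_suitability(veg_type):
    """Vegetation types 0 and 1 may hold the species; types 2 to 4 may not."""
    return 1.0 if veg_type in (0, 1) else 0.0


def p_favorable(lo, hi, veg_type):
    """Assumed combination rule: product of the two suitabilities."""
    return temp_suitability(lo, hi) * veg_suitability(veg_type)


if __name__ == "__main__":
    for lo, hi in TEMP_BINS:
        print(f"{lo:>2} to {hi:>2} C, veg 0: {p_favorable(lo, hi, 0):.3f}")
```

With real survey data, these hand-built values could be replaced by observed frequencies for each combination of temperature bin and vegetation type.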

The Classifier node represents the ecological niche model (ENM). It predicts the probability of presence or absence of the species from the environmental nodes, using a two-dimensional matrix of probabilities with one entry for each combination of the environmental variables. The Result node compares the predictions with the actual values and has three states: correct; omission, where the species is predicted to be absent but is present; and commission, where the species is predicted to be present but is absent. The probabilities in the tables were entered by hand, but actual data could be used to calculate them.
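The three Result states can be written as a small deterministic function. This is a sketch only: the state names are hypothetical labels, and it assumes that a favorable habitat stands in for the species actually being present.

```python
def result(classifier, habitat):
    """Map a prediction and the actual habitat state to a Result state.

    classifier: "present" or "absent" (the ENM prediction)
    habitat: "favorable" or "unfavorable" (assumed proxy for actual presence)
    """
    if classifier == "absent" and habitat == "favorable":
        return "omission"      # predicted absent, but the species is present
    if classifier == "present" and habitat == "unfavorable":
        return "commission"    # predicted present, but the species is absent
    return "correct"


if __name__ == "__main__":
    for c in ("present", "absent"):
        for h in ("favorable", "unfavorable"):
            print(c, h, "->", result(c, h))
```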

Figure 1. A simple BBN used for prediction. When the values of climate and vegetation type are given, the values are propagated forward to infer the presence or absence of the species (Classifier), and the favourability of habitat (Habitat).

BBNs can perform the task of an ENM, predicting the probability of presence or absence of a species based on environmental information. In Figure 1 the values of the variables Av_Temp and Veg_Type are fixed (probability 100%). Given these environmental values, the Classifier predicts presence. The Result variable shows the expected accuracy of the classifier under these conditions: it is correct 84.9% of the time, while about 15% of the time a commission error occurs (the Classifier predicts presence, but the habitat is unfavorable).
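The forward step can be sketched in a few lines once the environment is fixed. The function below assumes that Classifier and Habitat are conditionally independent given Av_Temp and Veg_Type, which is one plausible reading of the network rather than something the figure confirms. The input probabilities are illustrative, not the values behind Figure 1.

```python
def result_distribution(p_present, p_favorable):
    """Distribution over Result for one fixed environment.

    p_present: P(Classifier = present | environment)
    p_favorable: P(Habitat = favorable | environment)
    Assumes Classifier and Habitat are independent given the environment.
    """
    omission = (1.0 - p_present) * p_favorable      # predicted absent, habitat favorable
    commission = p_present * (1.0 - p_favorable)    # predicted present, habitat unfavorable
    correct = 1.0 - omission - commission
    return {"correct": correct, "omission": omission, "commission": commission}


if __name__ == "__main__":
    # Illustrative inputs: the classifier always predicts presence here,
    # and the habitat is favorable 85% of the time.
    print(result_distribution(p_present=1.0, p_favorable=0.85))
```

With these inputs the classifier never makes an omission error, so every error is a commission error, which is the same qualitative pattern as in Figure 1.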

Figure 2. The simple BBN used for investigating the source of errors. The Result variable is set to Omission (100%) and the changes in probabilities are propagated backwards through the net, indicating the values of Av_Temp where omission errors are greatest.

To see the source of errors, the Result variable is set to Omission (100%). The network then updates the values of all other variables automatically (Figure 2). The Classifier predicts absent while the Habitat is favorable, which is the definition of an omission error. The expected value for vegetation is 0 or 1. The expected value for Av_Temp is either 5 to 10 or 20 to 25. The system thus shows that omission errors occur at the extremes of the temperature range of the species.
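The backward step is an application of Bayes' rule, and it can be reproduced by brute-force enumeration on a toy version of the net. Everything numeric below is hypothetical: uniform priors, a bell curve of assumed width 5 °C, and a hard classifier that predicts presence only between 10 and 20 °C. The sketch also assumes that Classifier and Habitat depend only on the two environmental variables.

```python
import math

TEMP_BINS = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]  # assumed bins
VEG_TYPES = [0, 1, 2, 3, 4]


def p_favorable(temp_bin, veg):
    """Assumed P(Habitat = favorable | Av_Temp, Veg_Type): bell curve peaking at 15 C."""
    mid = sum(temp_bin) / 2.0
    bell = math.exp(-0.5 * ((mid - 15.0) / 5.0) ** 2)
    return bell if veg in (0, 1) else 0.0


def p_present(temp_bin, veg):
    """Hypothetical hard classifier: predicts presence only in the core of the niche."""
    core = temp_bin in [(10, 15), (15, 20)]
    return 1.0 if core and veg in (0, 1) else 0.0


def posterior_temp_given_omission():
    """P(Av_Temp | Result = omission) by enumeration, with uniform priors."""
    prior = 1.0 / (len(TEMP_BINS) * len(VEG_TYPES))
    joint = {}
    for t in TEMP_BINS:
        # Omission: classifier says absent while the habitat is favorable.
        joint[t] = sum(
            prior * (1.0 - p_present(t, v)) * p_favorable(t, v) for v in VEG_TYPES
        )
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items()}


if __name__ == "__main__":
    for temp_bin, prob in posterior_temp_given_omission().items():
        print(temp_bin, round(prob, 3))
```

In this toy setup the posterior mass concentrates in the 5 to 10 and 20 to 25 bins, where the habitat is still moderately favorable but the classifier already predicts absence.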