Predictive models for business analytics often involve very complex data-mining and other statistical techniques. Here is a simple, efficient way of predicting using images that reduces the prediction process to its bare essentials.

All models are essentially generalizations: simplifications into patterns that enable extrapolation into the unknown. One of the simplest forms of generalization is categorization, where a large number of dissimilar items are sorted into a smaller number of bins based on their similarity. Once a set of bins or categories is established, and there is a basis for deciding which bin a new item should go into, new items can be categorized. In this way, a categorization, or clustering, can serve as a predictive model. And because categorization is the basic operation behind producing a color palette for an image, images can be used to develop models, and palette swaps can be used for prediction.

To see how clustering works, consider a basic clustering algorithm called *kmeans*, available in R and other statistical languages. In *kmeans*, the data to be clustered are partitioned into *k* groups such that the sum of squared distances from the points to their assigned cluster centers is minimized. At the minimum, each cluster center is at the mean of the data points assigned to that category.
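The two alternating steps behind this description (assign each point to its nearest center, then move each center to the mean of its points) can be sketched in a few lines of Python. This is a minimal illustration on one-dimensional data, not the implementation used by R's *kmeans*; the function name `kmeans_1d`, the evenly spread starting centers, and the sample data are all assumptions made for the example.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster a list of numbers into k groups; returns (centers, labels)."""
    # Assumed initialization: spread the starting centers evenly over the sorted data.
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to the nearest center.
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        # Update step: each center moves to the mean of the points assigned to it.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break  # nothing moved, so the assignment is stable
        centers = new_centers
    return centers, labels


def within_ss(values, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))


# Hypothetical data with three obvious groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9]
centers, labels = kmeans_1d(data, k=3)
print(centers, labels, within_ss(data, centers, labels))
```

The procedure only finds a local minimum of the sum of squares, so in practice the result can depend on the starting centers.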

The analogous operation in image processing is called color quantization or color reduction. Reducing the number of colors is very useful: it compresses the image, albeit at the expense of image quality. But if the right colors are chosen for the bins, the eye barely notices the difference. Most image processing utilities will do this. For example, the utility in *netpbm* for this purpose is called *ppmquant*, invoked as:

ppmquant number_of_colors pnmfile
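To make the binning concrete, here is a rough Python sketch of the core of color reduction: every pixel is replaced by the index of the nearest color in a small palette. It is not how *ppmquant* works internally; the `quantize` function, the tiny image, and the hand-picked three-color palette are assumptions for illustration. A real quantizer would choose the palette from the image itself.

```python
def quantize(pixels, palette):
    """Return an index image: each pixel becomes the index of its nearest palette color."""
    def nearest(color):
        # Squared Euclidean distance in RGB space decides which bin a pixel falls into.
        return min(
            range(len(palette)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(color, palette[i])),
        )
    return [[nearest(p) for p in row] for row in pixels]


# Hypothetical 2x3 RGB image with six slightly different colors.
pixels = [
    [(250, 10, 5), (240, 20, 15), (10, 10, 245)],
    [(5, 250, 10), (20, 235, 30), (0, 5, 255)],
]
# Assumed palette of three bins: red, green, blue.
palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

index_image = quantize(pixels, palette)
reduced = [[palette[i] for i in row] for row in index_image]
print(index_image)
```

The image now needs only one small index per pixel plus the palette, which is where the compression comes from.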

## Prediction

After producing a reduced set of colors or bins from the image, palette swapping can provide the prediction. Palette swapping replaces the set of colors in the image with a new set of colors. The utility in the *netpbm* package is *pamlookup*, invoked with an image as a lookup table for mapping the old colors to the new:

pamlookup -lookupfile=lookupfile -missingcolor=color [-fit] indexfile

Note that this is a very efficient operation: the data in the image does not change, only the small set of values in the image's palette.
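The reason the swap is cheap can be seen in a small Python sketch of an indexed image, where pixels store bin numbers and the palette stores what each bin means. The names `render`, `old_palette`, and `new_palette`, and all of the values, are hypothetical and chosen only for illustration.

```python
def render(index_image, palette):
    """Look up every pixel's bin number in the palette."""
    return [[palette[i] for i in row] for row in index_image]


# Hypothetical indexed image: each pixel holds a bin number, not a color.
index_image = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [0, 1, 1, 0],
]
old_palette = [10, 128, 250]      # assumed original gray levels for bins 0, 1, 2
new_palette = [0.05, 0.60, 0.10]  # assumed predicted values for the same bins

before = render(index_image, old_palette)
after = render(index_image, new_palette)
print(after)
```

Only the three palette entries differ between `before` and `after`; the index image is shared, however many pixels it has.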

For example, say we have an image that represents a pattern of environmental values. For concreteness, the donut in Figure 3A could be the vicinity of a ring road, the edges of an urban area, or any other feature. Say we predict that certain values are of interest, perhaps as indicating potential for future crimes. The frequency of those crimes across the environmental values is a niche model, shown by the peaked probability distribution over the environmental values in Figure 2.

Swapping the colors in the original image with the new colors given by the function in Figure 2 (i.e. mapping the values on the x axis to the values on the y axis) changes Figure 3A into Figure 3B, essentially producing a prediction of the probability of crime in the region. This illustrates prediction from a model using only palette swapping on images. Because images can be stored and manipulated very efficiently by most computers, predictive algorithms such as WhyWhere that use this approach can handle very large datasets (stored as images) very efficiently.
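The whole example can be approximated in Python as a lookup table applied to an image. The sketch below is not the WhyWhere algorithm and does not reproduce the figures: the synthetic distance-from-center layer, the Gaussian-shaped `niche` function, and its `optimum` and `width` parameters are all assumptions standing in for the environmental image and the peaked curve described above.

```python
import math


def niche(value, optimum=100.0, width=20.0):
    """Hypothetical peaked response: highest at `optimum`, falling off on both sides."""
    return math.exp(-((value - optimum) ** 2) / (2 * width ** 2))


def distance_image(size):
    """Synthetic environmental layer: scaled distance from the image center (0..255)."""
    c = (size - 1) / 2
    max_d = math.hypot(c, c)
    return [
        [round(255 * math.hypot(x - c, y - c) / max_d) for x in range(size)]
        for y in range(size)
    ]


env = distance_image(21)

# The model is evaluated once per possible pixel value (256 times), not once per pixel.
lookup = [niche(v) for v in range(256)]

# The "palette swap": every environmental value is replaced by its predicted probability.
prediction = [[lookup[v] for v in row] for row in env]

# Crude text rendering: '#' marks pixels with a predicted probability above 0.5.
for row in prediction:
    print("".join("#" if p > 0.5 else "." for p in row))
```

With these assumed parameters the high-probability pixels form a ring around the center, since only intermediate distances fall near the optimum of the niche curve.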
