This is a situation that causes researchers to react in different ways. Some grapple with it, others exploit it, and many avoid it. If you have experience with controversial research, your comments will be summarized and posted so others can benefit.

# Monthly Archives: August 2006

# Niche business becoming a commodity.

Niche products, niche businesses, niche markets: small businesses are often described as surviving in niches, finding their niche, or defining and redefining their niche. Most people think entrepreneurs start their own businesses, but most buy running ventures, such as franchises, that already occupy viable niches. Quantitative methods for understanding and predicting business niches are still in their early stages.

Web listings of businesses for sale have improved the tradability of all assets, including businesses themselves, to the point that they are becoming more like commodities. Here are some sources of numeric information that could be used.

Businesses for Sale “Connecting Buyers and Sellers of Businesses” is an international source of businesses for sale.

BizBuySell is “The Internet’s Largest Business For Sale Marketplace”. This site includes ‘fire sales’ of assets, and business liquidations. Includes cost, gross and net return data.

BizQuest “A business for sale marketplace” is a marketplace of 1000s of businesses for sale and includes resources for buying a business or selling your business online. Includes cost, gross and net return data.

NicheGeek.com “Showing you what others overlook” posts off-the-wall articles on a variety of topics and tracks how well they do with AdSense in terms of per-click averages.

Business Nation “Small business website: information & ideas for starting a small business or home business startup, small business start-up website links, business start-up info…” covers all bases.

Of course Yahoo Directory lists a wide range of resources.

Any more good ones?

# How are bimodal distributions created and modeled?

Demetris Koutsoyiannis responds:

I agree that a bimodal distribution is seldom seen. My experience comes mainly from hydrological rather than ecological processes, but I suspect that the behaviours would be similar.

I have seen claims of bimodality several times, but I was never convinced by them, as I did not read any argument supporting them other than empirical histograms. However, we must be aware that the uncertainty of the histogram peaks is large. A simple Monte Carlo experiment with, say, a normal distribution suffices to demonstrate that (unless the number of generated values is very high) it is very common to have a histogram with two, three or more peaks. This, however, is entirely a random effect; obviously the normal density is unimodal.
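
The Monte Carlo experiment described here can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original comment: a standard normal, ten equal-width bins on [-3, 3], and a "peak" defined as a bin higher than both of its neighbours. All function names are hypothetical.

```python
import random


def histogram(values, n_bins, lo, hi):
    """Count values falling in n_bins equal-width bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:  # values outside the range are ignored
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def count_peaks(counts):
    """Count local maxima in the bin counts; a flat top counts once."""
    # Collapse runs of equal counts so a plateau is treated as one bin.
    collapsed = [c for i, c in enumerate(counts) if i == 0 or c != counts[i - 1]]
    padded = [-1] + collapsed + [-1]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


def fraction_multimodal(n_samples, n_bins=10, n_trials=200, seed=0):
    """Fraction of normal samples whose histogram shows two or more peaks."""
    rng = random.Random(seed)
    multi = 0
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        if count_peaks(histogram(sample, n_bins, -3.0, 3.0)) >= 2:
            multi += 1
    return multi / n_trials


for n in (30, 300, 3000):
    print(n, fraction_multimodal(n))
```

With these settings, small samples should show two or more peaks far more often than large ones, even though every sample is drawn from the same unimodal density.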

So, I think that one must have theoretical reasons to accept a bimodality hypothesis. As a simple illustration, consider a system described by a random variable X, which switches between two well defined states, 1 and 2 with probabilities p and 1-p. Assume that the conditional density of X given the state is normal in each of states 1 and 2 and denote it f1(x) and f2(x), respectively. Then the unconditional density will be p f1(x) + (1-p) f2(x). It can be easily observed that if the means of the two densities are different, then certain combinations of the standard deviations and the probability p result in a bimodal unconditional density.
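
A minimal sketch of the two-state mixture, using illustrative parameter values that are not from the original comment. It evaluates p f1(x) + (1-p) f2(x) on a grid and counts interior local maxima, which is a crude numerical check rather than an analytical condition. The function names and the grid size are assumptions.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, p, mu1, s1, mu2, s2):
    """Unconditional density p*f1(x) + (1-p)*f2(x) of the two-state system."""
    return p * normal_pdf(x, mu1, s1) + (1.0 - p) * normal_pdf(x, mu2, s2)


def count_modes(p, mu1, s1, mu2, s2, n_grid=2001):
    """Count local maxima of the mixture density on a grid (numerical check)."""
    lo = min(mu1 - 4.0 * s1, mu2 - 4.0 * s2)
    hi = max(mu1 + 4.0 * s1, mu2 + 4.0 * s2)
    xs = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    ys = [mixture_pdf(x, p, mu1, s1, mu2, s2) for x in xs]
    return sum(
        1
        for i in range(1, n_grid - 1)
        if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]
    )


# Illustrative cases: (p, mean 1, sd 1, mean 2, sd 2)
for params in [(0.5, 0.0, 1.0, 3.0, 1.0),
               (0.5, 0.0, 1.0, 1.0, 1.0),
               (0.95, 0.0, 1.0, 3.0, 1.0)]:
    print(params, count_modes(*params))
```

Well-separated means with balanced probabilities should give two modes, while close means or a very lopsided p should give one. Different means are therefore necessary for bimodality, but not sufficient.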

# What is the theory of best linear predictions in tree breeding (and how does it apply to niche modeling)?

The third ecological question for today ventures into the field of evolutionary biology. Any answers will be posted.

# What are the system dynamics that lead to ecological niche models?

Here is the first of a series of ecological modeling questions. Post an answer in the comments and I will transfer it up to the post body.

# What are the conditions for valid extrapolation of statistical predictions? Answer II.

Before I attempt to describe my answer, I would like to clarify the nature of a statistical prediction and mention some points that need caution.

1. A statistical prediction should be distinguished from a deterministic prediction. In a deterministic prediction some deterministic dynamics of the form y = f(x1, …, xk) are assumed, where y is the predicted value, the output of the deterministic model f( ), and x1, …, xk are inputs, i.e. explanatory variables. The model f( ) could be either a physically based one or a black box, data driven one. The latter case is very frequent, e.g. in local linear (chaotic) models and in connectionist (artificial neural network) models.

Now in a statistical prediction we assume some stochastic dynamics of the form Y = f(X1, …, Xk, V). There are two fundamental differences from the deterministic case. The first, apparent in the notation (the upper-case convention), is that the variables are no longer algebraic variables but random variables. Random variables are not numbers, as algebraic variables are, but functions on the sample space. This is very important. The second difference is that an additional random variable V has been inserted in the dynamics. This is sometimes regarded as a prediction error that could be additive to a deterministic part, i.e. f(X1, …, Xk, V) = fd(X1, …, Xk) + V. However, I prefer to think of it as a random variable manifesting the intrinsic randomness in nature.
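
To make the distinction concrete, here is a small sketch of the additive case fd(X) + V. Everything specific in it is an assumption for illustration: the straight-line fd, the normally distributed V with standard deviation 1, and the function names. The deterministic prediction returns a single number, while the statistical prediction returns a summary of a distribution.

```python
import random


def f_d(x):
    """Hypothetical deterministic part: a straight line chosen for illustration."""
    return 2.0 + 0.5 * x


def deterministic_prediction(x):
    """y = f(x): the prediction is a single number."""
    return f_d(x)


def statistical_prediction(x, sigma_v=1.0, n_draws=10000, seed=1):
    """Y = fd(x) + V: return the mean and an approximate 95% interval of Y."""
    rng = random.Random(seed)
    draws = sorted(f_d(x) + rng.gauss(0.0, sigma_v) for _ in range(n_draws))
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    mean = sum(draws) / n_draws
    return mean, lower, upper


print(deterministic_prediction(4.0))
print(statistical_prediction(4.0))
```

The interval is the part a deterministic model cannot supply: it comes entirely from the random variable V.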

# What are the conditions for valid extrapolation of statistical predictions? Answer I.

Thanks to Martin Ringo for this answer.

Let me offer a little of the terminology of forecasting, which, I hope, will make the question clearer. When you are forecasting from some kind of structural model, say Y = f(X1, …, Xk), there is a difference in whether you have to forecast the Xs as well as the Y. If you don't, it is an unconditional forecast; if you do, it is conditional. For an unconditional forecast, the inference is a pretty straightforward exercise of the classical linear model, at least if your structural relationship is estimated that way, with a nonlinear version for nonlinear estimations. For a conditional forecast, life can be messy since you have to take into account the distribution of the exogenous variables, the Xs, as well as the error term.

I recently reviewed a forecast where the analyst treated a conditional forecast like an unconditional one, and consequently underestimated the forecast error by over a factor of two.
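
The effect of ignoring uncertainty in the Xs can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the original answer: a linear model Y = a + bX + e with known coefficients, normal errors, and illustrative values b = 2 and standard deviations of 1. It uses the terminology above, where "unconditional" means the future X is known and "conditional" means X itself is forecast with error.

```python
import math
import random


def forecast_error_sd(b, sigma_e, sigma_x, n_trials=20000, seed=2):
    """Simulated std dev of (actual Y - forecast Y) for Y = a + b*X + e.

    sigma_x is the std dev of the error in the forecast of X;
    sigma_x = 0 corresponds to a future X that is known exactly.
    """
    rng = random.Random(seed)
    a = 1.0           # illustrative intercept
    x_forecast = 5.0  # illustrative forecast of the future X
    errors = []
    for _ in range(n_trials):
        x_actual = x_forecast + rng.gauss(0.0, sigma_x)
        y_actual = a + b * x_actual + rng.gauss(0.0, sigma_e)
        y_forecast = a + b * x_forecast
        errors.append(y_actual - y_forecast)
    mean = sum(errors) / n_trials
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / (n_trials - 1))


sd_x_known = forecast_error_sd(b=2.0, sigma_e=1.0, sigma_x=0.0)
sd_x_forecast = forecast_error_sd(b=2.0, sigma_e=1.0, sigma_x=1.0)
print(sd_x_known, sd_x_forecast, sd_x_forecast / sd_x_known)
```

In this setup the forecast error variance is sigma_e² when X is known and sigma_e² + b² sigma_x² when X is forecast, so treating the second case like the first understates the error. The numbers here are chosen only for illustration and are not those of the forecast reviewed above.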

# Was there an Australian medieval warm period?

The question for today: does anybody know of any unprecedented warmth down under?

# How to regress a stationary variable on a non stationary variable? Answer II

Demetris Koutsoyiannis responds:

I think that such questions should not be treated in an algorithmic manner and that it is important to formulate them in the clearest and most consistent manner.

So, let us assume that we have a nonstationary stochastic process X(t) and a stationary process Y(t); I have interpreted "variable" here as process because the notion of stationarity/nonstationarity is related to a (stochastic) process, not a variable. Is the question how to establish a regression relationship between Y(t) and X(t)? For instance, a relationship of the form Y(t) = a(t) X(t) + V(t), where a(t) is a deterministic function of time and V(t) is a process independent of, or uncorrelated with, X(t)? Without going into detailed analysis, it seems to me that in such a relationship it is difficult to have a constant a(t) = a, i.e. independent of time. Also, V(t) should be nonstationary too. So, while we can consider a time series (observations) of the stationary Y(t) as a statistical sample, we cannot do the same for X(t) or V(t). So, I doubt that there is a statistical procedure to infer a(t) and the statistical properties of V(t) (mean, variance, etc.), which are functions of time too. In addition, I do not find such a relationship useful at all.
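
One way to see why a constant a is hard to sustain is a short variance calculation. This is a sketch under assumptions not stated in the original answer: X(t) is a random walk with step variance σ², and V(t) is uncorrelated with X(t).

```latex
\begin{aligned}
Y(t) &= a(t)\,X(t) + V(t), \qquad \operatorname{Cov}[X(t), V(t)] = 0 \\
\operatorname{Var}[Y(t)] &= a(t)^2 \operatorname{Var}[X(t)] + \operatorname{Var}[V(t)] \\
\operatorname{Var}[X(t)] &= t\,\sigma^2
\;\Longrightarrow\;
a(t)^2 \le \frac{\operatorname{Var}[Y(t)]}{t\,\sigma^2}
\end{aligned}
```

Because Var[Y(t)] is constant for a stationary Y(t), |a(t)| must shrink at least as fast as 1/√t in this example, which rules out a constant nonzero a. The remaining term, Var[V(t)] = Var[Y(t)] − a(t)² t σ², then generally depends on t as well.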

# How to regress a stationary variable on a non stationary variable? Answer I

Martin Ringo responds:

This is the wrong question. The analyst shouldn't be worried about whether the dependent or independent variable is stationary or non-stationary. The issue is the error term.

In the Box-Jenkins procedure(s), or maybe I should call it a paradigm, the non-stationary stuff is removed. To me that removal is what is interesting, and the rest of what Messrs. Box and Jenkins do is the treatment of serial correlation. But be you structural econometrician or time-series statistician, you can merrily regress a stationary variable on a non-stationary variable. You merely have to recognize that there is no impunity in regression. So you still have to check the residuals to see if they behave in a roughly white noise manner.
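
A minimal sketch of such a residual check, using simulated data. The choices here are assumptions for illustration and not part of the original answer: a random walk as the non-stationary regressor, white noise as the stationary variable, ordinary least squares with one regressor, and the Durbin-Watson statistic as the white-noise diagnostic (values near 2 suggest no first-order serial correlation, values near 0 suggest strong positive correlation). A second regression of one random walk on another is included only as a contrast.

```python
import random


def ols_fit(x, y):
    """Simple least squares of y on x with an intercept; returns residuals too."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared differences over sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(r * r for r in residuals)
    return num / den


def random_walk(n, rng):
    """Cumulative sum of standard normal steps (a non-stationary series)."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


rng = random.Random(3)
n = 500
x = random_walk(n, rng)                            # non-stationary regressor
y_stationary = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_walk = random_walk(n, rng)                       # contrast case only

_, _, res_ok = ols_fit(x, y_stationary)
_, _, res_bad = ols_fit(x, y_walk)
dw_ok = durbin_watson(res_ok)
dw_bad = durbin_watson(res_bad)
print(dw_ok, dw_bad)
```

The first regression should leave residuals that look roughly like white noise, while the contrast case should not, which is the kind of warning the residual check is meant to give.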