How are bimodal distributions created and modeled?

Demetris Koutsoyiannis responds:

I agree that a bimodal distribution is seldom seen. My experience comes mainly from hydrological rather than ecological processes, but I suspect the behaviour is similar.

I have seen claims of bimodality several times, but I was never convinced by them, because the only supporting argument offered was an empirical histogram. We must be aware, however, that the uncertainty in histogram peaks is large. A simple Monte Carlo experiment with, say, a normal distribution suffices to demonstrate that, unless the number of generated values is very high, it is very common to get a histogram with two, three or more peaks. This is purely a random effect; the normal density is of course unimodal.
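The Monte Carlo experiment mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the equal-width binning, the bin count and the sample sizes are all choices made here, not part of the original argument. It draws repeated samples from a standard normal distribution and reports how often the histogram shows more than one peak.

```python
import random

def histogram_counts(values, n_bins):
    """Bin values into n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes the values are not all identical
    counts = [0] * n_bins
    for v in values:
        i = min(int((v - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        counts[i] += 1
    return counts

def count_peaks(counts):
    """Count local maxima, treating a flat run of equal counts as one bar."""
    merged = [c for i, c in enumerate(counts) if i == 0 or c != counts[i - 1]]
    padded = [-1] + merged + [-1]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i - 1] < padded[i] > padded[i + 1])

def fraction_multipeaked(n_values, n_bins, n_trials, seed=0):
    """Fraction of standard-normal samples whose histogram has more than one peak."""
    rng = random.Random(seed)
    multi = 0
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_values)]
        if count_peaks(histogram_counts(sample, n_bins)) > 1:
            multi += 1
    return multi / n_trials

if __name__ == "__main__":
    # Illustrative settings only: 15 bins, 200 repetitions per sample size.
    for n in (50, 500, 5000):
        print(n, fraction_multipeaked(n, n_bins=15, n_trials=200))
```

The reported fraction depends heavily on the number of bins and on how a "peak" is defined, so the output is best read as a qualitative check rather than a calibrated probability.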

So, I think that one must have theoretical reasons to accept a bimodality hypothesis. As a simple illustration, consider a system that switches between two well-defined states, 1 and 2, with probabilities p and 1-p, and is described by a random variable X. Assume that the conditional density of X given the state is normal in each of states 1 and 2, and denote it f1(x) and f2(x), respectively. Then the unconditional density will be p f1(x) + (1-p) f2(x). It is easy to see that if the means of the two densities are different, then certain combinations of the standard deviations and the probability p result in a bimodal unconditional density.
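A quick numerical check of the two-state mixture p f1(x) + (1-p) f2(x) is sketched below. The function names, the grid resolution and the parameter values are assumptions made for illustration; the code simply evaluates the mixture density on a grid and counts its local maxima.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, p, mu1, sigma1, mu2, sigma2):
    """Unconditional density p*f1(x) + (1-p)*f2(x) of the two-state system."""
    return p * normal_pdf(x, mu1, sigma1) + (1.0 - p) * normal_pdf(x, mu2, sigma2)

def count_modes(p, mu1, sigma1, mu2, sigma2, n_grid=2001):
    """Count local maxima of the mixture density on an evenly spaced grid."""
    lo = min(mu1 - 4.0 * sigma1, mu2 - 4.0 * sigma2)
    hi = max(mu1 + 4.0 * sigma1, mu2 + 4.0 * sigma2)
    step = (hi - lo) / (n_grid - 1)
    ys = [mixture_pdf(lo + i * step, p, mu1, sigma1, mu2, sigma2)
          for i in range(n_grid)]
    # "<" on the left and ">=" on the right counts a two-point tie only once.
    return sum(1 for i in range(1, n_grid - 1) if ys[i - 1] < ys[i] >= ys[i + 1])

if __name__ == "__main__":
    print(count_modes(0.5, 0.0, 1.0, 3.0, 1.0))  # means 3 standard deviations apart
    print(count_modes(0.5, 0.0, 1.0, 1.0, 1.0))  # means 1 standard deviation apart
```

For equal weights and equal standard deviations, the mixture is bimodal only when the means differ by more than two standard deviations, which is why different means alone are not enough. The grid search is a coarse check and could miss very shallow modes.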

Thoughts on “How are bimodal distributions created and modeled?”

  1. There are a number of existing tests for the modality of a density underlying an observed distribution, including Silverman’s test, the Dip test, the Excess Mass test, and the MAP and RUNT tests.

    I agree with Demetris that the likely source of bimodality will be switching between distinct states, with “switching between states” broadly defined. Humans are probably bimodal in feature space (men and women being distinct enough to classify as one or the other with a high degree of accuracy) – the relevant state in this case being sex or its genetic proxy.

  3. Ann,

    I don’t know of any packages that include the option, but code for Hartigan’s Dip test can be found at http://lib.stat.cmu.edu/apstat/217. It is written in Fortran, but if you have programming experience (or know someone who does), it won’t be too difficult to port it to another language, if need be.

    When I used the code several years ago there was an error somewhere – something very simple like a parenthesis that wasn’t closed, or was closed in the wrong place. Unfortunately, I don’t remember where it was. It may have been fixed since then.

    You should also know that the p-tables provided by Hartigan & Hartigan in their (1985?) paper are very conservative. You can improve the power of the test by constructing a “best unimodal” estimate of the density underlying your observations, creating simulated samples (each the same size as the sample you wish to test) from that estimate, and running the Dip test on each sample to build a distribution of Dip values that occur under the best unimodal assumption. If the Dip for your observed values exceeds the (1 - alpha) quantile of that constructed distribution (the 95th percentile for alpha = 0.05), you can say that you have rejected unimodality. Computationally heavy, but possible.

    Well. I never thought I’d use that bit of knowledge. Hope it helps.
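The simulation-based calibration described in this comment can be sketched as a generic loop. Two substitutions are made here and should be read as assumptions, not as the commenter's method: a fitted normal distribution stands in for the “best unimodal” density estimate, and a simple moment-based bimodality coefficient stands in for the Dip statistic, which is not implemented here. All function names are hypothetical; any statistic that grows with departure from unimodality can be passed in instead.

```python
import random
import statistics

def bimodality_coefficient(xs):
    """Stand-in statistic (NOT the Dip): (skewness^2 + 1) / kurtosis."""
    n = len(xs)
    m = statistics.fmean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return (skew ** 2 + 1.0) / kurt

def monte_carlo_p_value(sample, statistic, n_sim=500, seed=0):
    """Estimate how often a unimodal null produces a statistic >= the observed one."""
    rng = random.Random(seed)
    observed = statistic(sample)
    # Assumption: a fitted normal stands in for the "best unimodal" estimate.
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample)
    n = len(sample)
    exceed = 0
    for _ in range(n_sim):
        simulated = [rng.gauss(mu, sigma) for _ in range(n)]  # same size as the data
        if statistic(simulated) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)  # add-one form keeps the estimate above zero

if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative data: two well-separated groups of 100 values each.
    data = ([rng.gauss(-3.0, 1.0) for _ in range(100)]
            + [rng.gauss(3.0, 1.0) for _ in range(100)])
    print(monte_carlo_p_value(data, bimodality_coefficient))
```

Unimodality is rejected at level alpha when the returned p-value falls below alpha. The cost is n_sim evaluations of the statistic, which is the computational burden mentioned above.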

  5. Thanks for the information, Morgan. I am surprised to hear of the scarcity of packages.

    I was thinking, there is a bimodal distribution for every unimodal distribution: the negation. For example, if the distribution of a species is unimodal (most are modeled that way with respect to temperature or rainfall), then the distribution of points where the species does not occur should be bimodal, shouldn’t it? That said, if you were to reverse the values of the dependent variable in a logistic regression, 0 for 1 and 1 for 0, how would that change the result?
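On the closing question about swapping 0 and 1 in a logistic regression, a short worked identity may help. It assumes a single predictor x and coefficients β0 and β1 (notation introduced here for illustration) and a model fitted by maximum likelihood.

```latex
\Pr(Y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\quad\Longrightarrow\quad
\Pr(1 - Y = 1 \mid x)
  = 1 - \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
  = \frac{1}{1 + e^{-(-\beta_0 - \beta_1 x)}}
```

Under these assumptions, reversing the labels flips the sign of every coefficient and replaces each fitted probability by its complement, while the quality of the fit is unchanged.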

  7. David,

    I think you’re right. In one dimension (which I should have noted above is the only case in which the Dip test can be used) the existence of a unimodal distribution does imply a bimodal negation. In two dimensions (if I’m visualizing things correctly) it probably doesn’t, because the negation is likely to be a unimodal “donut” around a bivariate normal distribution, for example.

    I’m not sure I’m correctly understanding your use of the word “modeled” in the title of the thread. In my mind, a bimodal distribution would likely be modeled by a mixture of distributions chosen to reflect the observed data (or nonparametrically estimated by a kernel estimate, perhaps). I suspect that’s too static an interpretation for your purposes.

    Based on your reference to the distribution of species, it seems that you might be interested in a test that works for two dimensions (or more, if the species is aquatic, I guess). The MAP and RUNT tests (or the Excess Mass test, if progress has been made in the years since I last looked) may be appropriate there, though I don’t know of an implementation of any of them. Silverman’s test was described for one dimension but can be extended to any number of dimensions, in Silverman’s words, “mutatis mutandis” (“presto change-o”?). I believe that all three of these can also be used to test for more than two modes (one versus more than one; that settled, two versus more than two, etc.).

  9. Thanks for the test information, Morgan. The model I was thinking of was a third-degree (cubic) polynomial, and whether such polynomials suffice to model a response surface.

  11. Hi,

    Sorry, I am not an expert in statistics, but I would like to compare at least two bimodal histograms. Can you suggest a particular test for this? Thanks in advance.
