The following is a new version of the About Page. Gradually getting this website organized the way I want it.
I have always been fascinated with prediction.
As an undergraduate I made stock predictors on the first PCs and lost money in 1987.
Studied maths, statistics and started a PhD in ecological prediction.
Developed betting systems and lost money.
Studied algorithms for predicting species distributions and developed GARP which other people used for cool things like finding new species of chameleon in Madagascar.
Developed automated trading systems for FOREX in 2002 and lost money.
So I know a few things about prediction, and more about how not to do prediction. In this blog I hope to pass on a few of them and help people to predict better. Like predicting the risk to poultry from Bird Flu using GIS spatial analysis. Or monitoring the health of different types of hydrocoral polyps on reefs. The possibilities are endless.
The paper by S.J. Phillips, R.P. Anderson, and R.E. Schapire, “A maximum entropy approach to species distribution modeling”, introduces niche modelers for the first time to the Maximum Entropy approach well known in machine learning. They also provide the Maxent software for predicting species distributions, and evaluate it against a well-known method called DesktopGARP in predicting the distribution of two Neotropical mammals, a sloth, Bradypus variegatus, and a rodent, Microryzomys minutus.
The Maxent principle is to estimate the probability distribution, such as the spatial distribution of a species, that is most spread out subject to constraints such as the known observations of the species. Maxent uses entropy as the means to generalize specific observations of presence of a species, and does not require or even incorporate absence points within the theoretical framework. Presence-only points are observations of the presence of a species. For a variety of reasons, absence of a species is not usually recorded.
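The principle can be sketched in a few lines. This is a toy illustration only, not the algorithm inside the Maxent software: the ten-cell study area, the single environmental feature and the presence cells are all made up. It relies on the standard result that the most spread-out distribution subject to a constraint on a feature's expected value has the form p_i proportional to exp(lambda * f_i), and it finds lambda by simple gradient ascent so that the model's expected feature matches the average over the presence points.

```python
import math

def gibbs(features, lam):
    """Distribution over cells with p_i proportional to exp(lam * f_i)."""
    weights = [math.exp(lam * f) for f in features]
    total = sum(weights)
    return [w / total for w in weights]

def expected_feature(probs, features):
    """Mean of the feature under the distribution."""
    return sum(p * f for p, f in zip(probs, features))

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def fit_maxent(features, presence_cells, steps=2000, rate=1.0):
    """Find lam so the model's expected feature matches the presence mean."""
    target = sum(features[i] for i in presence_cells) / len(presence_cells)
    lam = 0.0
    for _ in range(steps):
        gap = target - expected_feature(gibbs(features, lam), features)
        lam += rate * gap  # gradient ascent on the presence log-likelihood
    return lam, gibbs(features, lam)

# Hypothetical study area: ten cells, each with a scaled environmental value.
features = [i / 10 for i in range(10)]
presence_cells = [6, 7, 8]  # made-up presence records; no absences are used

lam, probs = fit_maxent(features, presence_cells)
print("lambda:", round(lam, 3))
print("expected feature:", round(expected_feature(probs, features), 3))
print("entropy:", round(entropy(probs), 3), "uniform:", round(math.log(10), 3))
```

With a single linear feature the fitted distribution is monotonic in that feature, so the highest probability falls on the cell with the largest value rather than on the presence cells themselves. Real applications use many features and regularization, which this sketch leaves out.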
Finally, one journalist has the message right: Duane Freese, in his article “Hockey Stick Shortened?” at TechCentralStation, reports on the National Academy of Sciences report “Surface Temperature Reconstructions for the Last 2,000 Years”. Repetition of the consensus view of strong evidence of recent global warming is not newsworthy. Increase in the uncertainty of the Millennial temperature record is. He says:
The most gratifying thing about the National Academy of Science panel report last week into the science behind Michael Mann’s past temperature reconstructions – the iconic “hockey stick” – isn’t what the mainstream media have been reporting – the panel’s declaration that the last 25 years of the 20th Century were the warmest in 400 years.
The hockey stick, in short, is 600 years shorter than it was before and the uncertainties for previous centuries are larger than Mann gave credence. And when the uncertainty of the paleoclimatogical record increases with time, the uncertainty about human contribution is likewise increased. Why? For a reason noted on page 103 of the report: climate model simulations for future climates are tuned to the paleoclimatogical proxy evidence of past climate change.
The trial of Hwang Woo Suk has begun, and it would be a good time to clarify some of the issues to be adjudicated.
Hwang was indicted on charges of fraud, embezzlement and bioethics violations. Hwang has admitted ethical lapses in human egg procurement for his research.
As to embezzlement charges, prosecutors say Hwang used bank accounts held by relatives and subordinates in 2002 and 2003 to receive contributions from private organizations, laundered the money by withdrawing it all in cash, breaking it up into smaller amounts and putting it back in various bank accounts. They claim he bought gifts for his sponsors, politicians and other prominent social figures, and bought a car for his wife.
But it is the fraud charges that are interesting. Prosecutors say they will take no separate action over the fabrication of data in Hwang’s published research. Prosecutors say:
“There is no precedent in the world where someone has been punished for fabricating research results, and the matter should be left to academic mechanisms.”
What is ‘results management’?
Accountants and auditors are often concerned with various kinds of alteration of figures, a practice euphemistically called ‘earnings management’. For example, Mark Nigrini writes in “An Assessment of the Change in the Incidence of Earnings Management around the Enron-Andersen Episode”:
In 2001 Enron filed amended financial statements setting off a chain of events starting with its bankruptcy filing and including the conviction of Arthur Andersen for obstruction of justice. Earnings reports released in 2001 and 2002 were analyzed. The results showed that revenue numbers were subject to upwards management. Enron’s reported numbers are reviewed and these show a strong tendency towards making financial thresholds.
Benford’s law, an empirical rule about the expected frequency of digits in unmanaged data, is useful for detecting fraud and other forms of results management. I have posted on some results applied to time series data here, and here. An R module used for analysing these time-series data is available from this site here.
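For readers who want to try the idea, here is a minimal sketch in Python. It is not the R module mentioned above, and the function names are my own. It uses the first-digit form of Benford's law, where the expected share of leading digit d is log10(1 + 1/d), and compares observed leading digits against it with a chi-square statistic. The two test data sets are illustrative only.

```python
import math
from collections import Counter

# Benford's expected share of leading digit d is log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5PCT = 15.507

def leading_digit(x):
    """First significant digit of a nonzero number."""
    # Scientific notation always puts the first significant digit first.
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values):
    """Chi-square distance between observed leading digits and Benford."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )

powers_of_two = [2 ** k for k in range(100)]  # known to follow Benford closely
three_digit = list(range(100, 1000))          # leading digits are uniform

for name, data in [("powers of two", powers_of_two), ("100..999", three_digit)]:
    stat = benford_chi_square(data)
    verdict = "consistent with" if stat < CRITICAL_5PCT else "departs from"
    print(f"{name}: chi-square = {stat:.2f}, {verdict} Benford")
```

The chi-square threshold assumes independent observations, which time series usually are not, so a large statistic is better read as a flag for closer inspection than as proof of management.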
The NAS report has been chastised here and here for concluding that it is “plausible” that the “Northern Hemisphere was warmer during the last few decades of the 20th century than during any comparable period over the preceding millennium” while at the same time conceding that every statistical criticism of MBH is correct, disowning MBH claims to statistical skill for individual decades and years, and finding little confidence in reconstructions of surface temperatures from 1600 back to A.D. 900, and very little confidence in findings on average temperatures before then.
One of the main justifications for this plausibility was the “general consistency” of other studies.
The committee noted that scientists’ reconstructions of Northern Hemisphere surface temperatures for the past thousand years are generally consistent. The reconstructions show relatively warm conditions centered around the year 1000, and a relatively cold period, or “Little Ice Age,” from roughly 1500 to 1850. (NAS press release)
I want to show what a vacuous motherhood statement this is. Previously I have shown (here, here, and here) that virtually any data, including random data, will produce a graph similar to the existing studies if you ‘cherry pick’ proxies for correlations with temperature and then squint your eyes a bit. Today I thought I would run a simple statistical test to see how consistent each of the reconstructions really is.
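The cherry-picking effect can be sketched with nothing but random numbers. The code below is an illustration under my own assumptions, not the analysis from the earlier posts: the 'proxies' are AR(1) red noise with a coefficient of 0.9, the 'instrumental record' is a made-up straight-line rise over the last 100 steps, and the series sizes and the number kept are arbitrary. Series are ranked by correlation with the rise, the best tenth are kept, and their average is taken as the 'reconstruction'.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def red_noise(rng, length, phi=0.9):
    """AR(1) series with no temperature signal in it at all."""
    x, out = 0.0, []
    for _ in range(length):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out

rng = random.Random(1)  # fixed seed so the sketch is repeatable
length, calib, n_series, keep = 500, 100, 400, 40

# Hypothetical 'instrumental record': a steady rise over the final calib steps.
target = [t / calib for t in range(calib)]

proxies = [red_noise(rng, length) for _ in range(n_series)]

# Cherry-pick: rank by correlation with the target in the calibration window.
ranked = sorted(proxies, key=lambda p: pearson(p[-calib:], target), reverse=True)
picked = ranked[:keep]

# Average the chosen series into a 'reconstruction'.
composite = [sum(vals) / keep for vals in zip(*picked)]

shaft, blade = composite[:-calib], composite[-calib:]
print("correlation with target:", round(pearson(blade, target), 2))
print("blade rise:", round(blade[-1] - blade[0], 2))
print("shaft range:", round(max(shaft) - min(shaft), 2))
```

Outside the calibration window the picked series are unrelated to one another, so averaging tends to flatten them. Inside it they were chosen to rise, so the average rises too.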
Below I have plotted up each of the reconstructions and CRU temperatures using a lag plot. A lag plot shows the value of a time series against its successive values (lag 1). A lag plot allows easy discrimination of three main types of series:
- Random – shown as a cloud
- Autocorrelated – shown as a diagonal, and
- Periodic – shown as circles.
Millennial temperature reconstructions on a lag plot.
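For anyone wanting to reproduce the three patterns, the points of a lag plot are easy to compute. The sketch below uses synthetic series of my own choosing, not the reconstructions or CRU data: white noise for the random case, an AR(1) series with coefficient 0.9 for the autocorrelated case, and a sine wave with period 12 for the periodic case. It prints the lag-1 correlation of each.

```python
import math
import random

def lag_pairs(series, lag=1):
    """Points of a lag plot: (x[t], x[t + lag])."""
    return list(zip(series[:-lag], series[lag:]))

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def lag_correlation(series, lag=1):
    """Correlation between a series and itself shifted by lag."""
    xs, ys = zip(*lag_pairs(series, lag))
    return pearson(xs, ys)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 500

# Random: the lag plot is a shapeless cloud.
white = [rng.gauss(0, 1) for _ in range(n)]

# Autocorrelated: the points line up along the diagonal.
red = [0.0]
for _ in range(n - 1):
    red.append(0.9 * red[-1] + rng.gauss(0, 1))

# Periodic: the points trace a closed loop.
period = 12
wave = [math.sin(2 * math.pi * t / period) for t in range(n)]

for name, series in [("random", white), ("autocorrelated", red), ("periodic", wave)]:
    print(name, "lag-1 correlation:", round(lag_correlation(series), 2))
```

The pairs returned by `lag_pairs` can be passed to any scatter-plot routine to draw the plot itself. A smooth periodic series can have a lag-1 correlation as high as an autocorrelated one, so it is the shape of the plot, a closed loop rather than a filled diagonal band, that separates the two.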
The play by Rolin Jones “The Intelligent Design of Jenny Chow” is a tale of science fueled by post-adolescent angst, with a brilliant young woman who excels at rocket science but can’t leave her bedroom. Driven by a real-life quest to find her biological mother, she pilfers parts from her government rocket project and builds a replica of herself, named Jenny Chow, to meet her real birth mother in China by proxy.
The publicity photo for “The Intelligent Design of Jenny Chow,” an award-winning play by recent School of Drama graduate Rolin Jones.
Jenny’s robot is a contemporary version of the Simulacrum, a device created for a purpose in a parody of reality. Such appears to be the case with the ‘hockey stick graph’, a reconstruction of Millennial global temperatures based on tree-ring and other proxies by MBH98. The graph, showing temperatures relatively stable throughout the middle ages and recent times, and shooting upward suddenly last century, has been an iconic feature of the IPCC 2001 report on climate change, and of countless government reports, slide shows, powerpoints and papers since its invention.