

The British Journal for the Philosophy of Science Advance Access originally published online on September 17, 2007
The British Journal for the Philosophy of Science 2007 58(4):709-754; doi:10.1093/bjps/axm033

Copyright © The Author 2007. Published by Oxford University Press on behalf of British Society for the Philosophy of Science.

Bayes not Bust! Why Simplicity is no Problem for Bayesians1

David L. Dowe

Clayton School of Information Technology, Monash University, Clayton VIC. Australia 3800, http://www.csse.monash.edu.au/~dld

Steve Gardner

School of Philosophy and Bioethics, Monash University, Clayton VIC. Australia 3800

Graham Oppy

School of Philosophy and Bioethics, Monash University, Clayton VIC. Australia 3800 www.arts.monash.edu.au/phil/staff/oppy/index.html

Steven.Gardner{at}arts.monash.edu.au


   Abstract

The advent of formal definitions of the simplicity of a theory has important implications for model selection. But what is the best way to define simplicity? Forster and Sober ([1994]) advocate the use of Akaike's Information Criterion (AIC), a non-Bayesian formalisation of the notion of simplicity. This forms an important part of their wider attack on Bayesianism in the philosophy of science. We defend a Bayesian alternative: the simplicity of a theory is to be characterised in terms of Wallace's Minimum Message Length (MML). We show that AIC is inadequate for many statistical problems where MML performs well. Whereas MML is always defined, AIC can be undefined. Whereas MML is not known ever to be statistically inconsistent, AIC can be. Even when defined and consistent, AIC performs worse than MML on small sample sizes. MML is statistically invariant under 1-to-1 re-parametrisation, thus avoiding a common criticism of Bayesian approaches. We also show that MML provides answers to many of Forster's objections to Bayesianism. Hence an important part of the attack on Bayesianism fails.
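To make the kind of criterion under discussion concrete: in its standard form, AIC scores a model as 2k - 2 ln L, where k is the number of free parameters and L is the maximised likelihood, and the model with the lowest score is preferred. The sketch below is an illustration only and is not taken from the article. The coin-tossing setup and the function names (`log_likelihood`, `aic`, `compare_coin_models`) are assumptions chosen for brevity. It does not implement MML.

```python
import math


def _xlogy(x, y):
    """Return x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def log_likelihood(successes, trials, p):
    """Log-likelihood of one particular toss sequence under bias p."""
    failures = trials - successes
    return _xlogy(successes, p) + _xlogy(failures, 1.0 - p)


def aic(max_log_likelihood, k):
    """Standard AIC: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * max_log_likelihood


def compare_coin_models(successes, trials):
    """Score a fixed fair coin (k = 0) against a free-bias coin (k = 1)."""
    # Fair-coin model: nothing is estimated from the data.
    ll_fair = log_likelihood(successes, trials, 0.5)
    # Free-bias model: the bias is set to its Maximum Likelihood estimate.
    p_hat = successes / trials
    ll_free = log_likelihood(successes, trials, p_hat)
    return {"fair": aic(ll_fair, 0), "free": aic(ll_free, 1)}


if __name__ == "__main__":
    for heads in (5, 10):
        print(heads, compare_coin_models(heads, 10))
```

With 5 heads in 10 tosses both models fit equally well, so the penalty term decides in favour of the simpler fair-coin model. With 10 heads in 10 tosses the improvement in fit outweighs the penalty and the free-bias model scores lower.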

  1. Introduction
  2. The Curve Fitting Problem
    2.1 Curves and families of curves
    2.2 Noise
    2.3 The method of Maximum Likelihood
    2.4 ML and over-fitting

  3. Akaike's Information Criterion (AIC)
  4. The Predictive Accuracy Framework
  5. The Minimum Message Length (MML) Principle
    5.1 The Strict MML estimator
    5.2 An example: The binomial distribution
    5.3 Properties of the SMML estimator
    5.3.1  Bayesianism
    5.3.2  Language invariance
    5.3.3  Generality
    5.3.4  Consistency and efficiency

    5.4 Similarity to false oracles
    5.5 Approximations to SMML

  6. Criticisms of AIC
    6.1 Problems with ML
    6.1.1  Small sample bias in a Gaussian distribution
    6.1.2  The von Mises circular and von Mises–Fisher spherical distributions
    6.1.3  The Neyman–Scott problem
    6.1.4  Neyman–Scott, predictive accuracy and minimum expected KL distance

    6.2 Other problems with AIC
    6.2.1  Univariate polynomial regression
    6.2.2  Autoregressive econometric time series
    6.2.3  Multivariate second-order polynomial model selection
    6.2.4  Gap or no gap: a clustering-like problem for AIC

    6.3 Conclusions from the comparison of MML and AIC

  7. Meeting Forster's objections to Bayesianism
    7.1 The sub-family problem
    7.2 The problem of approximation, or, which framework for statistics?

  8. Conclusion
  A. Details of the derivation of the Strict MML estimator
  B. MML, AIC and the Gap vs. No Gap Problem
    B.1 Expected size of the largest gap
    B.2 Performance of AIC on the gap vs. no gap problem
    B.3 Performance of MML in the gap vs. no gap problem


1 The title is the third in a sequence: the title of (Earman [1992]) asked ‘Bayes or Bust?’; in the title of his review of Earman's book in the pages of this journal, Forster ([1995]) affirmed ‘Bayes and Bust’. Now comes our dissent: ‘Bayes not Bust!’




This article has been cited by other articles:


D. L. Dowe
Foreword re C. S. Wallace
The Computer Journal, September 1, 2008; 51(5): 523-560.


