Bayesian Experimental Design

Oct 12, 2018

This semester I've been taking an experimental design class. Experimental design is the specification of all aspects of an experiment:

which treatments to study
choosing blocking factors
choosing how to randomize
specifying experimental units
choosing treatment allocation

But how do we select those values? This is important because you can improve estimation of particular quantities by carefully designing aspects of the experiment.

As one of the final tasks of the course we were to give a talk about a relevant paper. As a fervent Bayesian, I decided to look at Bayesian Experimental Design: A Review by Chaloner and Verdinelli (1995).

The main idea of the Bayesian approach to experimental design (and to Bayesian inference in general) is that we should take advantage of prior information when avaiable.

Consider the following setup:

We have some set of potential designs \(\eta\)
For each design \(\eta\) we would observe data \(y\) based upon the design as well as some unknown parameters \(\theta\).
We then make some decision \(d\) based upon \(y\) observations.
Finally we'll have some measure of utility \(U(d, \theta, \eta, y)\) which depends upon all of these quantities.

Now that we have a utility the most natural thing to do from a decision-theory perspective is to optimize the expected utility.

\[ \int_{y} \int_{\theta} U(d, \theta, \eta, y) p(\theta \mid y, \eta) p(y \mid \eta) d\theta dy \]

So our best decision holding everything else fixed is whatever \(d\) maximizes this expression. Likewise, the best design is the whatever \(\eta\) maximizes this expression (when paired with the optimal \(d\) for that particular \(\eta\)).

An important note is that the utility function should be chosen to suit the goal of the experiment. For instance, there's specific utility functions for experiments designed to simply obtain information about the unknown parameters vs experiments designed to perform a particular test vs fit a model which predicts well. This is more satisfying than some of the classical designs taught in the class which did not seem tailored towards specific purposes so much as satisfying generally desireable criterion.

You'll note that I mention nothing about actually calculating the expected utility. This seems to be a major limitation in using Bayesian experimental design more broadly. It would be interesting if there were good software for flexibly designing Bayesian experiments; as it is, there seems to only be focus on linear models where you can calculate the expected utility in closed form.

And that's basically all you need to know. One thing that I love about Bayesian methods is that they're conceptually concise. Once you know a few basic ideas you can apply them a wide array of problems. This is in contrast to the traditional experimental design literature which gave the impression of lacking cohesion. Although before I chalk up another victory for the Bayesian school I should probably consider that the vast majority of current scientific designs use the traditional methods. So perhaps there's a beauty/practicality tradeoff to consider.

Filed under: statistics

« Notes on "The Superiority of Simple Alternatives to Regression for Social Science Prediction"