R snippets: Possible error in Bayesian bootstrap

Thursday, November 8, 2012


After my last post on the Bayesian bootstrap I was asked why the sample from the Dirichlet distribution is used as weights for calculating the mean, and not as probabilities for sampling from the original data set. The mistake of using them as sampling probabilities is subtle and occurs even in textbooks; see, for example, Chernick (2008), page 122. In this post I want to clarify the issue.

In the example I give correct standard and Bayesian bootstrap procedures alongside wrong ones. The wrong Bayesian bootstrap follows the description in Chernick (2008), page 122, which is equivalent to the procedure suggested in the comment on my last post.

Here is the code that I used:

library(gtools)

# Correct Bayesian bootstrap: Dirichlet(1, ..., 1) draws are used directly
# as weights in a weighted mean of x
ok.mean.bb <- function(x, n) {
  apply(rdirichlet(n, rep(1, length(x))), 1, weighted.mean, x = x)
}

# Correct standard (frequentist) bootstrap
ok.mean.fb <- function(x, n) {
  replicate(n, mean(sample(x, length(x), TRUE)))
}

# Wrong Bayesian bootstrap: Dirichlet weights (gaps between sorted uniforms)
# are used as sampling probabilities
wrong.mean.bb <- function(x, n) {
  replicate(n, mean(sample(x, length(x), TRUE,
                           diff(c(0, sort(runif(length(x) - 1)), 1)))))
}

# Wrong standard bootstrap: resampling from a resample
wrong.mean.fb <- function(x, n) {
  replicate(n, mean(sample(sample(x, length(x), TRUE),
                           length(x), TRUE)))
}

set.seed(1)
reps <- 10000
x <- cars$dist

par(mar = c(5, 4, 1, 2))
plot(density(ok.mean.fb(x, reps)), main = "", xlab = "Bootstrap mean")
lines(density(ok.mean.bb(x, reps)), col = "red")
lines(density(wrong.mean.fb(x, reps)), col = "blue")
lines(density(wrong.mean.bb(x, reps)), col = "green")

The figure it produces is:

The black curve is the standard bootstrap density and the red one is the Bayesian bootstrap; the blue and green curves are generated by the wrong procedures (frequentist and Bayesian, respectively).

We can see that the wrong Bayesian bootstrap has a counterpart in the standard bootstrap approach, obtained by resampling twice (sampling from a sample), and that both clearly increase the dispersion of the results.
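
The size of the extra dispersion can also be worked out by hand. The sketch below is not part of the original analysis: it assumes the law of total variance applied to each procedure, writes s2 for the population variance of x (divisor n), and uses the hypothetical helper names theory.sd, gaps and sim.sd. Under those assumptions the variances of the four bootstrap means are s2/n (standard), s2/(n+1) (Bayesian), s2(2n-1)/n^2 (double resampling) and 2 s2/(n+1) (Dirichlet weights used as sampling probabilities), so both wrong procedures roughly double the variance.

```r
# Theoretical standard deviations of the bootstrap mean under the four
# procedures (a sketch; theory.sd is a hypothetical helper, not from the post).
theory.sd <- function(x) {
  n  <- length(x)
  s2 <- mean((x - mean(x))^2)          # population variance, divisor n
  sqrt(c(ok.fb    = s2 / n,
         ok.bb    = s2 / (n + 1),
         wrong.fb = s2 * (2 * n - 1) / n^2,
         wrong.bb = 2 * s2 / (n + 1)))
}

# Simulated counterparts, base R only. Dirichlet(1, ..., 1) weights are
# generated as gaps between sorted uniforms, as in wrong.mean.bb.
gaps <- function(k) diff(c(0, sort(runif(k - 1)), 1))

sim.sd <- function(x, reps) {
  n <- length(x)
  c(ok.fb    = sd(replicate(reps, mean(sample(x, n, TRUE)))),
    ok.bb    = sd(replicate(reps, sum(gaps(n) * x))),
    wrong.fb = sd(replicate(reps, mean(sample(sample(x, n, TRUE), n, TRUE)))),
    wrong.bb = sd(replicate(reps, mean(sample(x, n, TRUE, gaps(n))))))
}

set.seed(1)
x <- cars$dist
round(rbind(theory = theory.sd(x), simulated = sim.sd(x, 10000)), 2)
```

If the derivation holds, the two rows of the printed table should agree up to simulation noise.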

10 comments:

Jan Galkowski, November 11, 2012 at 5:34 PM
FYI, there is an implementation of Rubin's Bayesian bootstrap called "BayesianBootstrap" in the LaplacesDemon package of R, available on CRAN. I do not know which side they come down on the question of how the first bootstrap should be sampled, as I have not dug into the problem. Perhaps the LaplacesDemon implementation will add a third voice to the discussion.

I don't think Rubin's "Bayesian bootstrap" is used as much because it seems to simply confuse matters with stylistic distractions. Apparently, it has use in cases of censoring, e.g., for imputation. Efron discusses it in passing in his "Second Thoughts on the Bootstrap" from 2003.
Bogumił Kamiński, November 12, 2012 at 12:28 AM
BayesianBootstrap in LaplacesDemon simply replicates the standard bootstrap procedure in disguise, not the Bayesian bootstrap.

You can see it in the following part of the source code:

for (s in 1:S) {
  if (s%%Status == 0)
    cat("\nBootstrapped Samples:", s)
  u <- c(0, sort(runif(N - 1)), 1)
  g <- diff(u)
  X.B[s, ] <- X[sample(1:N, 1, prob = g, replace = TRUE), ]
}

In each step of the loop only one observation is sampled (the size parameter is equal to 1). Notice also that u and g are redrawn on every pass through the loop.

Therefore, in each step of the loop an observation is sampled with probability 1/N: every gap in g has expected value 1/N, and the steps are independent. This is a standard bootstrap procedure.

To see this notice for example that running:

BayesianBootstrap(1:10,1)

10 times is equivalent to running:

BayesianBootstrap(1:10,10)
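
A small simulation can illustrate the argument. It is only a sketch: draw.one is a hypothetical stand-in for one pass through the loop above (fresh gaps, then a single draw), not the package code itself. Because the expected value of every gap is 1/N, the observed frequencies should all be close to 1/N.

```r
# One iteration of the loop: new gaps g, then a single index drawn with
# probabilities g. draw.one is a hypothetical helper for illustration.
draw.one <- function(N) {
  g <- diff(c(0, sort(runif(N - 1)), 1))
  sample(1:N, 1, prob = g)
}

set.seed(1)
N <- 10
idx <- replicate(100000, draw.one(N))
freq <- table(factor(idx, levels = 1:N)) / length(idx)
round(freq, 3)   # each entry should be close to 1 / N
```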
Byron, November 12, 2012 at 6:06 AM
Hi Bogumil,

Thanks for showing this to me.

I think the two approaches here, rdirichlet and differenced sampling weights, are equivalent. The difference in the results, such as between ok.mean.bb and wrong.mean.bb, is due to repeated sampling in wrong.mean.bb with the same sampling weights. According to Chernick (2008), page 122:

"A second Bayesian bootstrap replication is generated in the same way, but
with a new set of n - 1 uniform random numbers and hence a new set of g[i]'s."

In wrong.mean.bb, the size argument of length(x) returns multiple draws, but each of these draws has the same sampling probabilities. I believe Chernick is using the word replication to refer to each draw, which could be confusing to readers who observe that the replicate function is used here for each collection of length(x) draws. My interpretation of Rubin (1981) and Chernick (2008) is that the sampling probabilities should be refreshed for each draw.

The BayesianBootstrap function in the LaplacesDemon package seems to be working correctly (whew!). For example, I see the same approximate results by substituting:

ok.mean.bb <- function(x, n) {
  replicate(n, mean(BayesianBootstrap(x, length(x))), TRUE)
}

As a side note, the rdirichlet function in LaplacesDemon is much faster than that currently implemented in gtools. The BayesianBootstrap function is not using rdirichlet at the moment, and calculates it exactly as presented by Rubin (1981). The rdirichlet approach presented here is much faster.
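
The equivalence of the two weight generators mentioned in this comment can be checked without either package. The sketch below assumes two standard facts: gaps between sorted uniforms follow a Dirichlet(1, ..., 1) distribution, and so do independent Exp(1) draws divided by their sum. The names rgaps and rdirich1 are hypothetical helpers. Each weight should then have mean 1/k and variance (k - 1) / (k^2 (k + 1)).

```r
# Two base-R generators of Dirichlet(1, ..., 1) weights (hypothetical helpers).
rgaps <- function(k) diff(c(0, sort(runif(k - 1)), 1))   # gaps between sorted uniforms
rdirich1 <- function(k) { e <- rexp(k); e / sum(e) }      # normalized Exp(1) draws

set.seed(1)
k <- 5
w.gaps <- t(replicate(50000, rgaps(k)))
w.dir  <- t(replicate(50000, rdirich1(k)))

# Compare moments of the first weight with the Dirichlet(1, ..., 1) values
rbind(gaps   = c(mean = mean(w.gaps[, 1]), var = var(w.gaps[, 1])),
      dir    = c(mean = mean(w.dir[, 1]),  var = var(w.dir[, 1])),
      theory = c(mean = 1 / k, var = (k - 1) / (k^2 * (k + 1))))
```

Matching moments do not prove equality in distribution, but they are a quick sanity check that the two generators can be swapped.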

One thing I discovered by working through this is that the current BayesianBootstrap function returns "The Bayesian Bootstrap has finished." numerous times.

Before you run this example (and fill your screen with these messages!), I suggest replacing this line of source code in BayesianBootstrap:

cat("\n\nThe Bayesian Bootstrap has finished.\n\n")

with

if(Status < S) cat("\n\nThe Bayesian Bootstrap has finished.\n\n")

I will update this in the package for the next version.

Thanks again for pointing this discussion out to me.
Byron, November 12, 2012 at 10:33 AM
If the BayesianBootstrap function is not working correctly in LaplacesDemon, I will gladly correct it, and have recently opened a forum for the public discussion of development, including bugs.

In this case, however, I think it is correct.

On the following page, Rubin shows in figure 1 a comparison between the BB and the bootstrap for correlation, and the results are very similar.

When I run the code in the previous comment, the BayesianBootstrap function estimates a distribution of the mean that is remarkably similar to the frequentist bootstrap. The ok.mean.bb function returns a distribution with stunning uncertainty, especially given 10,000 replications.

I see an article called "A Large Sample Study of the Bayesian Bootstrap" (1987) by Albert Lo. Throughout the article, Lo notes the large sample properties are very similar between the bootstrap and the BB, and Lo is referring to large samples of replicates.

I think the BayesianBootstrap function is doing what Lo shows on the first page of his article, in steps 2 and 3, and that a test statistic such as the mean is calculated as such on the bootstrapped samples. Please let me know if this is incorrect. Thanks.
Bogumił Kamiński, November 12, 2012 at 12:40 PM
Additionally, you can look at the original example data from Rubin (1981). Here is the code:

library(LaplacesDemon)
dye <- c(1.15, 1.7, 1.42, 1.38, 2.8, 4.7, 4.8, 1.41, 3.9)
efp <- c(1.38, 1.72, 1.59, 1.47, 1.66, 3.45, 3.87, 1.31, 3.75)
data.set <- data.frame(dye, efp)

# Standard bootstrap: resample rows with replacement
sboot <- function() {
  cor(data.set[sample(1:9, replace = TRUE), ])[1, 2]
}

# Bayesian bootstrap: gaps between sorted uniforms used as weights
bboot <- function() {
  cov.wt(data.set, diff(c(0, sort(runif(8)), 1)), cor = TRUE)$cor[1, 2]
}

# BayesianBootstrap from LaplacesDemon
dboot <- function() {
  cor(BayesianBootstrap(data.set, 9))[1, 2]
}

set.seed(1)
par(mfrow = c(1, 3))
hist(replicate(10000, sboot()), breaks = seq(-1, 1, len = 101), xlim = c(0.4, 1))
hist(replicate(10000, bboot()), breaks = seq(-1, 1, len = 101), xlim = c(0.4, 1))
hist(replicate(10000, dboot()), breaks = seq(-1, 1, len = 101), xlim = c(0.4, 1))

You can see that dboot() looks like sboot() but not like bboot(). In Rubin (1981) we can see that the Bayesian bootstrap should have fewer extreme results (near 1 and below 0.8). This can be seen in bboot() only.
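
For readers unfamiliar with cov.wt, the weighted correlation that bboot() relies on can also be written out directly. The sketch below is an illustration, assuming the Bayesian bootstrap statistic is the correlation of the data under probability weights w; wcor is a hypothetical helper name. It should agree with cov.wt(..., cor = TRUE), because the normalizing constant of the weighted covariance cancels in the correlation.

```r
# Correlation of two variables under probability weights w (sum(w) == 1).
# wcor is a hypothetical helper, written out for illustration.
wcor <- function(a, b, w) {
  ma <- sum(w * a); mb <- sum(w * b)             # weighted means
  cab <- sum(w * (a - ma) * (b - mb))            # weighted covariance
  cab / sqrt(sum(w * (a - ma)^2) * sum(w * (b - mb)^2))
}

dye <- c(1.15, 1.7, 1.42, 1.38, 2.8, 4.7, 4.8, 1.41, 3.9)
efp <- c(1.38, 1.72, 1.59, 1.47, 1.66, 3.45, 3.87, 1.31, 3.75)

set.seed(1)
w <- diff(c(0, sort(runif(8)), 1))               # one Bayesian bootstrap weight vector
c(manual = wcor(dye, efp, w),
  cov.wt = cov.wt(cbind(dye, efp), w, cor = TRUE)$cor[1, 2])
```

With equal weights of 1/9 the same helper reduces to the ordinary sample correlation, which is one way to see the standard estimate as a special case of the weighted one.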
Byron, November 12, 2012 at 6:54 PM
Good stuff.

We may need a tie-breaker on the Lo article, because the way I read it:

Step 1 involves drawing the sampling probabilities from gaps from uniforms, and then

Step 2 involves drawing the bootstrap sample, with the sampling probabilities from step 1 redrawn for each bootstrap sample in step 2.

This agrees again with Chernick (2008), page 122:

"A second Bayesian bootstrap replication is generated in the same way, but with a new set of n - 1 uniform random numbers and hence a new set of g[i]'s."

The bboot in the above example does not do that. But I do like the look of your static-g bboot better for that last example though!

It's ok if we disagree over whether the main thrust of the Lo article is on data or bootstrap asymptotics (I'd suggest the last sentence of the first page, p. 360, supports bootstrap though), but the first thing to nail down is that the gaps are drawn for each bootstrap replicate.

I am wondering why you'd say that the BayesianBootstrap function doesn't approximate, but is exactly the same, when the probabilities g are not 1/n like in the bootstrap. I'm scratching my head on that one.

There may be other interesting things to explore here, but let's start with whether or not the gaps, calculated from uniform sampling, should differ for each bootstrap replication.
Byron, November 14, 2012 at 7:07 AM
After carefully re-reading these articles from your perspective, I'm convinced you're right. I will correct the BayesianBootstrap function, and will also extend it to produce statistics rather than strictly samples. If it sounds good, I will email it to you for your feedback. Many thanks again for pointing this out to me.