The problem

Given a sample of size n of counts for an observed process, find an interval estimate of the population rate of \(\lambda\).

The frequentist approach

The commonly used frequentist confidence interval for a rate assumes normality of the estimator and is (under asymptotic theory) found using:

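\[ \hat{\lambda} \pm z_{\alpha/2} \sqrt{\frac{\hat{\lambda}}{n}}\] where \(\hat{\lambda} = \bar{x} = \frac{1}{n}\sum_{i=1}^n {x_i}\) is the observed sample rate, and \(z_{\alpha/2}\) is the quantile from the standard normal distribution needed for the \(100(1-\alpha)\%\) confidence interval. The standard error is \(\sqrt{\hat{\lambda}/n}\) because, for Poisson counts, the variance of a single count equals \(\lambda\), so the variance of the sample mean is \(\lambda/n\).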

The Bayesian approach

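The Bayesian estimator of the rate, and its credible interval, are found using the gamma distribution with parameters \(\alpha=\sum_{i=1}^n {x_i}\) and \(\beta=n\). (Here \(\alpha\) is the shape parameter of the gamma distribution, not the significance level used in the frequentist interval.)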

The gamma distribution is used because it is the conjugate prior for the Poisson rate parameter. The term “conjugate” means that use of a gamma prior distribution leads to a gamma posterior distribution. In the form given above the prior is non-informative, so the parameters are determined entirely by the data.
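
The conjugacy can be checked directly. As a sketch, assuming independent Poisson counts and a gamma prior with shape \(\alpha_0\) and rate \(\beta_0\), and dropping every factor that does not involve \(\lambda\):

\[ p(\lambda \mid x_1,\dots,x_n) \propto \lambda^{\alpha_0-1} e^{-\beta_0\lambda} \times \prod_{i=1}^n \lambda^{x_i} e^{-\lambda} = \lambda^{\alpha_0+\sum_{i=1}^n x_i-1}\, e^{-(\beta_0+n)\lambda}\]

The right-hand side is the kernel of a gamma distribution with shape \(\alpha_0+\sum_{i=1}^n x_i\) and rate \(\beta_0+n\). Letting \(\alpha_0\) and \(\beta_0\) tend to zero gives the data-only parameters \(\sum_{i=1}^n x_i\) and \(n\).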

The Bayesian point estimate for the population rate is the mean of this gamma distribution, being \(\frac{\alpha}{\beta}\), and the associated credible interval uses the quantiles of this gamma distribution. Finding these quantiles is easily done using the qgamma() function in R.

N.B. This quantile-based interval has the desired coverage, but it is not the narrowest interval with this coverage.
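
One way to look for a narrower interval is to let the lower tail probability vary and minimise the interval width numerically. The sketch below is an illustration, not part of the original working: it reuses the data from the example in this document, and the names Level, Width, Best, HPD and ETI are introduced here for convenience.

```r
# Data from the example in this document (repeated so the block runs alone)
x <- c(5, 23, 9, 15, 11, 7, 12, 8, 8, 19)
Alpha <- sum(x)
Beta <- length(x)
Level <- 0.95

# Width of the interval that leaves probability p in the lower tail
# and 1 - Level - p in the upper tail
Width <- function(p) {
  qgamma(p + Level, shape = Alpha, rate = Beta) -
    qgamma(p, shape = Alpha, rate = Beta)
}

# Search over the admissible lower tail probabilities for the narrowest interval
Best <- optimize(Width, interval = c(0, 1 - Level))

# Narrowest 95% interval found by the search
HPD <- qgamma(c(Best$minimum, Best$minimum + Level), shape = Alpha, rate = Beta)

# Equal-tailed (quantile-based) interval for comparison
ETI <- qgamma(c(0.025, 0.975), shape = Alpha, rate = Beta)

diff(HPD)
diff(ETI)
```

Because the gamma density is unimodal, the narrowest interval found this way should correspond to the highest posterior density interval. Both intervals contain 95% of the posterior probability; only the split between the two tails differs.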

If a gamma prior with parameters \(\alpha_0, \beta_0\) is believed suitable, then the posterior distribution is also gamma, with parameters \(\alpha=\alpha_0 + \sum_{i=1}^n {x_i}\) and \(\beta=\beta_0 + n\).

An example

Find the 95% interval estimate for \(\lambda\) when we have

x=c(5,23,9,15,11,7,12,8,8,19)
n=length(x)

The frequentist working would show:

LambdaHat = mean(x)
LambdaHat
[1] 11.7
FCI = LambdaHat + qnorm(c(0.025,0.975)) * sqrt(LambdaHat/n)
FCI
[1]  9.579975 13.820025

whereas the Bayesian working would be:

Alpha = sum(x)
Beta = n
LambdaHat = Alpha/Beta
LambdaHat
[1] 11.7
BCI = qgamma(c(0.025,0.975), Alpha, Beta)
BCI
[1]  9.676219 13.913095

Similarities and differences

The point estimates for the frequentist and the non-informative Bayesian approaches are the same. Note, though, that the frequentist interval is symmetric around the point estimate, while the Bayesian interval is not, because it is built from the quantiles of an asymmetric gamma distribution.
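
The difference in symmetry can be seen by measuring how far each limit lies from the point estimate. This is a small sketch that repeats the example's calculations so that it runs on its own; the names FDist and BDist are introduced here for illustration.

```r
# Data and intervals from the example in this document
x <- c(5, 23, 9, 15, 11, 7, 12, 8, 8, 19)
n <- length(x)
LambdaHat <- mean(x)
FCI <- LambdaHat + qnorm(c(0.025, 0.975)) * sqrt(LambdaHat / n)
BCI <- qgamma(c(0.025, 0.975), shape = sum(x), rate = n)

# Distance from the point estimate to the lower and upper limits
FDist <- abs(FCI - LambdaHat)   # the two distances are equal
BDist <- abs(BCI - LambdaHat)   # the upper limit lies further away
FDist
BDist
```

With these data the Bayesian limits sit roughly 2.02 below and 2.21 above the point estimate, reflecting the right skew of the gamma distribution.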

If an informative prior were available, the values of \(\alpha\) and \(\beta\) in the Bayesian working would be different.
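
As an illustration of this, the sketch below repeats the Bayesian working with a made-up gamma prior. The values Alpha0 = 20 and Beta0 = 2 are purely hypothetical (a prior mean of 10, carrying about as much weight as two observations) and are not taken from the original analysis.

```r
# Data from the example in this document
x <- c(5, 23, 9, 15, 11, 7, 12, 8, 8, 19)
n <- length(x)

# Hypothetical prior parameters, assumed for illustration only
Alpha0 <- 20
Beta0 <- 2

# Posterior parameters: prior parameters updated by the data
Alpha <- Alpha0 + sum(x)
Beta <- Beta0 + n

# Posterior mean and 95% equal-tailed credible interval
LambdaHat <- Alpha / Beta
LambdaHat
BCI <- qgamma(c(0.025, 0.975), shape = Alpha, rate = Beta)
BCI
```

The posterior mean now falls between the prior mean (Alpha0/Beta0) and the sample mean, and moves closer to the sample mean as n grows relative to Beta0.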