Journal of Probability and Statistics
Volume 2013 (2013), Article ID 602940, 9 pages
http://dx.doi.org/10.1155/2013/602940
Research Article

Growth Estimators and Confidence Intervals for the Mean of Negative Binomial Random Variables with Unknown Dispersion

1Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA
2Department of Statistics, University of California, Berkeley, CA 94720, USA

Received 8 April 2013; Accepted 1 July 2013

Academic Editor: Peter van der Heijden

Copyright © 2013 David Shilane and Derek Bean. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The negative binomial distribution becomes highly skewed under extreme dispersion. Even at moderately large sample sizes, the sample mean exhibits a heavy right tail. The standard normal approximation often does not provide adequate inferences about the data's expected value in this setting. In previous work, we have examined alternative methods of generating confidence intervals for the expected value. These methods were based upon Gamma and Chi Square approximations or tail probability bounds such as Bernstein's inequality. We now propose growth estimators of the negative binomial mean. Under high dispersion, zero values are likely to be overrepresented in the data. A growth estimator constructs a normal-style confidence interval by effectively removing a small, predetermined number of zeros from the data. We propose growth estimators based upon multiplicative adjustments of the sample mean and direct removal of zeros from the sample. These methods do not require estimating the nuisance dispersion parameter. We will demonstrate that the growth estimators' confidence intervals provide improved coverage over a wide range of parameter values and asymptotically converge to the sample mean. Interestingly, the proposed methods succeed despite adding both bias and variance to the normal approximation.