Journal of Probability and Statistics
Volume 2012 (2012), Article ID 524724, 19 pages
http://dx.doi.org/10.1155/2012/524724
Research Article

Design and Statistical Analysis of Pooled Next Generation Sequencing for Rare Variants

¹Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, NY 10461, USA
²Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taichung 402, Taiwan
³Department of Applied Mathematics and Statistics, Stony Brook University, New York, NY 11794, USA

Received 23 March 2012; Accepted 6 June 2012

Academic Editor: Wei T. Pan

Copyright © 2012 Tao Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One is the high sequencing error rate and its high variability across genomic positions and experimental runs, which, if not properly accounted for in the experimental design and analysis, could lead to either inflated false positive rates or a loss of statistical power. Another important issue is how to test the association of a group of rare variants. To address the first issue, we propose a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we propose a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrate that our approach provides good control of the type I error rate and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled NGS study.