Mathematical Problems in Engineering — An Open Access Journal

Mathematical Problems in Engineering
Volume 2012 (2012), Article ID 793490, 24 pages
http://dx.doi.org/10.1155/2012/793490

Research Article

Multiclass Boosting with Adaptive Group-Based kNN and Its Application in Text Categorization

Lei La, Qiao Guo, Dequan Yang, and Qimin Cao

School of Automation, Beijing Institute of Technology, Beijing 100081, China

Received 31 December 2011; Revised 30 March 2012; Accepted 26 April 2012

Academic Editor: Serge Prudhomme

Copyright © 2012 Lei La et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

AdaBoost is an excellent committee-based tool for classification. However, its effectiveness and efficiency in multiclass categorization are challenged by methods based on support vector machines (SVM), neural networks (NN), naïve Bayes, and k-nearest neighbor (kNN). This paper uses a novel multiclass AdaBoost algorithm that avoids reducing the multiclass classification problem to multiple two-class classification problems. The method is more effective while keeping the accuracy advantage of existing AdaBoost. An adaptive group-based kNN method is proposed to build more accurate weak classifiers and thereby keep the number of basis classifiers within an acceptable range. To further enhance performance, the weak classifiers are combined into a strong classifier through a double iterative weighting scheme, yielding an adaptive group-based kNN boosting algorithm (AGkNN-AdaBoost). We implement AGkNN-AdaBoost in a Chinese text categorization system. Experimental results show that the proposed classification algorithm achieves better precision and recall than many other text categorization methods, including traditional AdaBoost. In addition, it runs significantly faster than the original AdaBoost and many other classic categorization algorithms.
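To make the idea of boosting kNN weak classifiers directly on a multiclass problem more concrete, the following is a minimal sketch. It is not the AGkNN-AdaBoost algorithm of this paper: it swaps in the well-known SAMME formulation of multiclass AdaBoost and a plain weighted-vote kNN, and it omits the adaptive grouping and the double iterative weighting described above. The function names (`knn_predict`, `samme_fit`, `samme_predict`), the parameters `n_rounds` and `k`, and the use of numeric tuples as feature vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def knn_predict(train_x, train_y, weights, query, k):
    """Weighted kNN vote: each of the k nearest training points votes
    for its label with its current boosting weight (illustrative only)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = defaultdict(float)
    for i in order[:k]:
        votes[train_y[i]] += weights[i]
    # Iterate labels in sorted order so that ties are broken deterministically.
    return max(sorted(votes), key=votes.get)


def samme_fit(xs, ys, n_rounds=5, k=3):
    """Fit a SAMME-style ensemble; returns a list of (weights, alpha) pairs."""
    n_classes = len(set(ys))
    if n_classes < 2:
        raise ValueError("need at least two classes")
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        snapshot = list(w)
        preds = [knn_predict(xs, ys, snapshot, x, k) for x in xs]
        misses = [p != y for p, y in zip(preds, ys)]
        err = sum(wi for wi, m in zip(w, misses) if m)
        # SAMME only requires the weak learner to beat random guessing
        # over n_classes labels, i.e. err < 1 - 1/n_classes.
        if err >= 1.0 - 1.0 / n_classes:
            break
        perfect = err <= 1e-10
        err = max(err, 1e-10)  # avoid division by zero
        alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
        ensemble.append((snapshot, alpha))
        if perfect:
            break  # nothing left to reweight
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, misses)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def samme_predict(xs, ys, ensemble, query, k=3):
    """Combine the weak kNN votes, each weighted by its alpha."""
    if not ensemble:
        raise ValueError("empty ensemble")
    scores = defaultdict(float)
    for weights, alpha in ensemble:
        scores[knn_predict(xs, ys, weights, query, k)] += alpha
    return max(sorted(scores), key=scores.get)
```

In this sketch every boosting round produces a single multiclass weak classifier, so no one-vs-rest or one-vs-one decomposition into two-class problems is needed. For text categorization the feature vectors would typically be term-weight vectors rather than the low-dimensional tuples assumed here.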