Performance Evaluation of Machine Learning Algorithms in Biomedical Document Classification
Abstract
Document classification is a common task in Natural Language Processing (NLP) with a broad range of applications in the biomedical domain. In biomedical engineering, manually sorting the literature into predefined categories is cumbersome, so building automatic document classifiers for biomedical databases with Machine Learning (ML) algorithms has become an important task for the scientific community. Empirical evaluation of state-of-the-art classifiers for biomedical document categorization has likewise become an active area of research. This paper therefore examines the use of several leading ML algorithms for the automatic classification of benchmark biomedical datasets: the BioCreative III corpus, Farm-Ads, and the TREC 2006 Genomics Track. The performance of the classifiers is evaluated using standard classification metrics: accuracy, precision, recall, and F1-measure.
Keywords: machine learning, deep learning, text mining, document classification.