Novel Heuristic Rules for Candidate Answer Sentence Generation in a Machine Reading Comprehension System Using a Linguistic Knowledge Document
Machine Reading (MR) is an active research field concerned with understanding text in order to serve various natural language applications. Reading comprehension with Multiple Choice Questions (MCQ) is one of the best tools for evaluating a Machine Reading Comprehension (MRC) system. The comprehension text is converted into a knowledge base called a Linguistic Knowledge Document (LKD). An MCQ test contains a set of questions and a set of candidate answers for each question. In this paper, we focus on the different question types found in the RACE dataset, a large-scale dataset for machine reading comprehension that contains almost 100,000 MCQs and about 28,000 comprehension passages. We classify the questions based on POS tags generated by the Stanford parser. Based on this classification, we derive heuristic rules for generating Candidate Answer Sentences (CAS) for MCQs. These CAS are then used to find the correct answer in a Machine Reading Comprehension system using the LKD.
Keywords: Machine Reading, Multiple Choice Question, Candidate Answer Sentence, Linguistic Knowledge Document.
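To make the idea of rule-based CAS generation concrete, the following is a minimal sketch only. It does not reproduce the rules derived in this paper: the question categories (`cloze`, `completion`, `wh-*`, `other`), the function names, and the surface cues used in place of Stanford parser POS tags are all assumptions chosen for illustration.

```python
import re
from typing import List

# Hypothetical set of question words; a stand-in for POS-tag-based
# classification, which this sketch does not perform.
WH_WORDS = {"what", "which", "who", "whom", "whose",
            "when", "where", "why", "how"}


def classify_question(question: str) -> str:
    """Return a coarse, illustrative question type from surface cues."""
    text = question.strip()
    if "_" in text:
        return "cloze"  # stem with a blank to be filled by an option
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    if tokens and tokens[0] in WH_WORDS:
        return "wh-" + tokens[0]
    if not text.endswith("?"):
        return "completion"  # unfinished statement completed by an option
    return "other"


def candidate_answer_sentences(question: str, options: List[str]) -> List[str]:
    """Combine the question stem with each option into one sentence."""
    stem = question.strip()
    qtype = classify_question(stem)
    sentences = []
    for option in options:
        opt = option.strip().rstrip(".")
        if qtype == "cloze":
            # Assumed rule: substitute the option for the first blank.
            sentences.append(re.sub(r"_+", lambda m: opt, stem, count=1))
        elif qtype == "completion":
            # Assumed rule: append the option to the unfinished statement.
            sentences.append(f"{stem} {opt}.")
        else:
            # No rewrite rule in this sketch: pair question and option as-is.
            sentences.append(f"{stem} {opt}")
    return sentences


if __name__ == "__main__":
    for cas in candidate_answer_sentences(
        "The passage is mainly about", ["a trip to London.", "a new school"]
    ):
        print(cas)
```

Each generated sentence could then be compared against the knowledge base built from the passage; how that comparison is performed against the LKD is outside the scope of this sketch.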