NLPbased Stemming and Lemmatization approaches for Multilingual Search Indexing

  • Chayapathi A R, G Sunilkumar, Manjunathswamy B E, Thriveni J, Venugopal KR

Abstract

A multilingual search engine returns result pages in several languages when the user supplies a keyword in a single language. Such a search tool gives more detailed information to people who communicate in more than one language. An advanced semantic search engine must offer more than relevance: it must provide insight beyond fixing the semantics of basic categories. A fully functional semantic search engine returns probable semantically related results for multilingual queries. This paper examines the document retrieval process based on stemming and lemmatization. Stemming reduces all words with the same stem to a common form, whereas lemmatization removes inflectional endings and returns the base form of each word. Various stemming and lemmatization approaches are discussed for three major languages used in India: English, Hindi and Kannada. The accuracy of lemmatization is analyzed for all three languages by processing prepared data sets. The accuracy of the computed results supports the development of better indexing in multilingual search engines.
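
The distinction between stemming and lemmatization drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the suffix list, the lemma table, and the function names `stem` and `lemmatize` are hypothetical and chosen only to show that a stemmer strips endings by rule, while a lemmatizer maps a word to its base form through a lookup.

```python
# Toy illustration only: the suffix list and lemma table below are
# hypothetical and are not taken from the paper.
SUFFIXES = ("ing", "ies", "ed", "es", "s")  # checked in this order

LEMMA_TABLE = {
    "studies": "study",
    "studying": "study",
    "went": "go",
    "better": "good",
}


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def lemmatize(word: str) -> str:
    """Look the word up in the lemma table; fall back to the word itself."""
    w = word.lower()
    return LEMMA_TABLE.get(w, w)


if __name__ == "__main__":
    for token in ["studies", "studying", "went", "indexes"]:
        print(token, stem(token), lemmatize(token))
```

In this sketch the rule-based stemmer maps "studies" and "studying" to different strings ("stud" and "study") and leaves the irregular form "went" unchanged, while the table lookup returns "study" and "go". A real lemmatizer for Hindi or Kannada would need a far larger lexicon and morphological rules than this table suggests.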

Published
2020-04-01
How to Cite
Chayapathi A R, G Sunilkumar, Manjunathswamy B E, Thriveni J, Venugopal KR. (2020). NLPbased Stemming and Lemmatization approaches for Multilingual Search Indexing. International Journal of Advanced Science and Technology, 29(06), 9121 - 9134. Retrieved from https://sersc.org/journals/index.php/IJAST/article/view/32454