Text Mining Based on Tax Comments as Big Data Analysis Using XGBOOST and Feature Selection
With the rapid development of the Internet, big data has been applied in many fields. However, high-dimensional data often contain redundant or irrelevant features, so feature selection is particularly important. In this work, feature subsets are built and machine learning algorithms, including XGBoost, are applied to them. Obtaining highly reliable, real-time early-warning information by applying big data theory, mechanisms, models, and methods together with machine learning techniques is an inevitable future trend. This study proposes that rapid feature selection with the XGBoost model in a distributed setting can improve model training efficiency in that setting. The GBTs model, based on gradient-optimized decision trees, outperformed the other two models in both accuracy and real-time performance, which meets the requirements of a big data setting. XGBoost runs on a single machine as well as on the distributed processing frameworks Apache Hadoop and Apache Spark. Gradient descent can be used to fit the gradient boosting model. In the case of a regression tree, each leaf node produces the average gradient among samples with similar features. Feature selection is a critical step in data preprocessing and an important research topic in data mining and machine learning tasks such as classification.
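The two mechanisms named above, regression-tree leaves that output an average gradient and feature selection driven by a boosted model, can be illustrated with a minimal sketch. This is not the XGBoost implementation (XGBoost additionally uses second-order gradient information and a regularized objective); it is plain first-order gradient boosting with squared loss and depth-1 trees, written in pure Python. The function names (`make_data`, `fit_stump`, `boost`, `select_top_k`), the synthetic data, and the use of accumulated split gain as the importance score are all assumptions made for illustration.

```python
import random

def make_data(n=200, seed=0):
    """Hypothetical synthetic data: only features 0 and 2 influence y."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(n)]
    y = [3 * x[0] - 2 * x[2] + rng.gauss(0, 0.1) for x in X]
    return X, y

def fit_stump(X, g):
    """Fit a depth-1 regression tree to the targets g (negative gradients)."""
    n, d = len(X), len(X[0])
    total = sum(g)
    best = None  # (gain, feature, threshold, left_value, right_value)
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += g[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # no valid split between equal feature values
            right_sum = total - left_sum
            # Reduction in squared error obtained by splitting at x[j] <= lo.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                # Each leaf predicts the average gradient of its samples.
                best = (gain, j, lo, left_sum / k, right_sum / (n - k))
    return best

def boost(X, y, rounds=50, lr=0.1):
    """Gradient boosting with squared loss; returns model and importances."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(rounds):
        # For squared loss, the negative gradient is the residual y - pred.
        g = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, g)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        importance[j] += gain  # accumulate split gain per feature
        stumps.append((j, thr, left, right))
        # One gradient descent step in function space, scaled by lr.
        pred = [p + lr * (left if x[j] <= thr else right)
                for p, x in zip(pred, X)]
    return base, stumps, importance

def predict(model, x, lr=0.1):
    """Predict for one sample; lr must match the value used in boost."""
    base, stumps, _ = model
    return base + sum(lr * (l if x[j] <= t else r) for j, t, l, r in stumps)

def select_top_k(importance, k):
    """Indices of the k features with the largest accumulated gain."""
    ranked = sorted(range(len(importance)), key=lambda j: -importance[j])
    return sorted(ranked[:k])

X, y = make_data()
model = boost(X, y)
print("selected features:", select_top_k(model[2], 2))
```

Under these assumptions, the selected indices define the reduced feature subset on which a final model would be retrained. In a distributed setting the same idea would be applied through the importance scores exposed by the chosen framework rather than through hand-written trees.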