Abstract: The enormous amount of information available on the World Wide Web makes it an important topic for data mining research. The application of data mining techniques to the World Wide Web is known as Web mining, a term that has been used in three different ways: Web Content Mining, Web Structure Mining and Web Usage Mining. Web mining uses data mining techniques to automatically discover and extract information from web documents and services. It is used to discover useful information from the World Wide Web and its usage patterns. A Web crawler, also known as a Web spider, an automated indexer, an ant or a Web scutter, is a program that browses the World Wide Web in a systematic, automated manner to index the content of web pages, and that keeps a copy of every page it has visited for later processing. Web crawling is important for collecting data for business intelligence, for market research about the services offered and for determining and assessing trends in a given market, and for collecting user behaviour information so that a product can perform better and be developed with more relevant content. In my thesis work, I have explored the data mining tool RAPIDMINER and shown how its Web crawling operator can be used within RAPIDMINER and how the results of Web crawling can be generated accordingly. I have also designed an effective Web search engine that accepts lengthy queries, which saves the user time when searching. The Web is expanding day by day, and people generally rely on search engines to explore it. In such a scenario it is both the duty of the service provider and a challenge to provide proper, relevant and quality information to Internet users in response to the queries they submit to the search engine, using the contents of web pages and the hyperlinks between them.
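The crawling behaviour described in the abstract (browse systematically, follow hyperlinks, keep a copy of every visited page) can be illustrated with a minimal sketch. This is not the RAPIDMINER operator used in the thesis; it is a plain breadth-first crawler written only for illustration. The names `crawl`, `LinkExtractor`, `fetch` and `PAGES`, and the `example.test` URLs, are assumptions introduced here, and the small in-memory dictionary stands in for real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags found in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from `seed`; returns {url: html} for visited pages.

    `fetch(url)` is a hypothetical callable that returns the page's HTML
    as a string, or None when the page cannot be retrieved.
    """
    store = {}                      # copy of every page already visited
    queue = deque([seed])           # frontier of URLs still to visit
    while queue and len(store) < max_pages:
        url = queue.popleft()
        if url in store:            # never process the same page twice
            continue
        html = fetch(url)
        if html is None:            # unreachable page: skip it
            continue
        store[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)   # resolve relative links
            if link not in store:
                queue.append(link)
    return store


# Tiny in-memory "web" (illustrative data, not from the paper).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a>',
    "http://example.test/b": '<a href="/">home</a> <a href="/missing">x</a>',
}

visited = crawl("http://example.test/", PAGES.get)
```

Here `visited` holds the stored copy of each reachable page, keyed by URL. A real crawler would replace `PAGES.get` with an HTTP client and would also need politeness rules such as rate limits and robots.txt handling, which this sketch omits.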
Keywords: Optimization, RAPIDMINER.
Title: A Prioritized Method of Searching a Keyword (Access over Internet) and Optimization
Author: Amit Singh, Sonia Arora
International Journal of Computer Science and Information Technology Research
ISSN 2348-120X (online), ISSN 2348-1196 (print)
Research Publish Journals