Sunday, June 23, 2019
Data mining Essay Example | Topics and Well Written Essays - 3000 words
Automated prospective analysis, as provided by the data mining techniques discussed below, goes beyond the simple analysis of past records offered by the retrospective tools used in decision support systems (DSS). These techniques are fundamentally the result of a long process of research and product development, driven first by the pressing need to help businesses collect, retain, and retrieve data. Across all aspects of data mining, the commonly used techniques are artificial neural networks, biclustering, PageRank, genetic algorithms, nearest neighbor methods, and rule induction.

A) Data Mining Classification over large database

1. The kNN (k-nearest neighbor) classification

The simplest memory-based classifier works by memorizing the entire training data and classifying a test object only if its attributes exactly match one of the training samples. kNN relaxes this condition: it seeks the k objects in the training set that are closest to the test object and assigns a label based on the predominance of a particular class in this neighborhood. The key elements of the algorithm are a set of labeled objects, a distance or similarity metric for computing the distance between objects, and the number of nearest neighbors (the value of k).

Advantages
It is simple and easy to understand.
Its classification technique is easy to implement.
It performs well in varied situations, which makes it widely usable.
It is well suited to multi-modal classes and to applications in which an object can have several class labels.

Disadvantages
The selection of k is a limiting factor. If k is too small, the result is very sensitive to noise points, while if k is too large, the neighborhood is likely to include a large number of points from other classes.
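The kNN procedure and its sensitivity to k can be illustrated with a minimal sketch. This is only an illustration under stated assumptions: the function names, the Euclidean distance metric, the simple majority vote, and the toy data set (including its single noise point) are hypothetical choices made for this example and do not come from the essay.

```python
from collections import Counter
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k, distance=euclidean):
    """Label `query` by majority vote among its k nearest training objects.

    `train` is a list of (vector, label) pairs (the labeled objects).
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the training-set size")
    # Order the labeled objects by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    # The predominant class in the neighborhood decides the label.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: class "a" near (1, 1), class "b" near (5, 5),
# plus one noisy "b" point sitting inside the "a" cluster.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.8, 1.1), "a"),
    ((1.1, 1.1), "b"),  # noise point
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
query = (1.1, 1.05)

label_k1 = knn_classify(train, query, k=1)  # follows the single noise point
label_k3 = knn_classify(train, query, k=3)  # two "a" neighbors outvote the noise
label_k7 = knn_classify(train, query, k=7)  # neighborhood now spans both classes
```

In this toy set the query lies inside the "a" cluster. With k=1 the noise point decides the label, with k=3 the surrounding "a" objects outvote it, and with k=7 the neighborhood covers the whole training set, so the more numerous "b" class wins again.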
Requiring an exact attribute match, as a purely memorizing classifier does, limits the number of test records that can be classified, since in most instances a test record will not match any training record to the letter.
The approach to combining the class labels of the neighbors is also considered complicated.

2. PageRank

PageRank is a search ranking algorithm that uses the hyperlinks of the World Wide Web. It produces a static ranking of Web pages, in the sense that a PageRank value is computed off-line for each page and does not depend on search queries. Instead, it relies on the democratic nature of the World Wide Web, using its vast link structure as an indicator of the quality of an individual page. It is worth noting that these features have contributed to the success of the Google search engine.

Advantages
It is quite dependable, as its outputs are generally accurate and precise.
It is simple and efficient to use once one has the knowledge and skills to apply its principle.

Disadvantages
Database search outcomes are based on literal items (keywords, metadata, and tags) rather than on their actual meanings.
Web pages are ranked poorly in different topological Web structures, for example in Google's ranking algorithm.
New pages start with low ranks and take too much time to be listed and to gain high ranks.
Subsequent quotation of inaccurate information on different web pages may lead to the indexing of such inaccurate pages, resulting in a mess of fiction.

3. Naive Bayes

Advantages
It is
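Returning to the PageRank description in section 2, the off-line, query-independent computation can be sketched with a simple power iteration. This is a hedged illustration and not Google's production algorithm: the function name, the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the four-page link graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute a static rank for every page from the link structure alone.

    `links` maps each page to the list of pages it links to.
    No search query is involved at any point.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform ranking

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly over its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no out-links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "D" is a new page that nobody links to yet.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
best_page = max(ranks, key=ranks.get)
worst_page = min(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links, "C", ends up with the highest rank, while the new page "D" has no incoming links and stays at the baseline value, which mirrors the disadvantage noted above for new pages.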