Exploring Through Data Mining
The term “data mining” is an accurate description of the process. It draws a good analogy between mining for coal and searching through data for key information. Of the two methods discussed in Cohen’s article “From Babel to Knowledge: Data Mining Large Digital Collections,” I personally feel that the QA H-bot method is a more effective way of searching than document classification through the Syllabus Finder. While the QA H-bot method is more challenging, I feel that it provides a more straightforward way of searching for information.
According to Cohen, QA is “a far greater challenge than document classification because it exercises almost all of the computational muscles.” This makes sense, because asking a question requires a more direct and specific approach to data mining. However, I feel that if executed correctly it can be more useful. The results will be focused on answering the question, rather than on information that is merely related to the topic. It may be more difficult to find the data, but once it is found it will be more specific.
I feel one of the biggest benefits of this approach is that the entire web can be used to answer questions through text analysis and algorithms. The feature also has access to “trusted sources,” which enhances its credibility. While it may be easier to use a document classification approach, the H-bot method provides valuable sources in addition to a large variety of information.