Publication details
Rapid prototyping of a web categorization tool
Authors | |
---|---|
Year of publication | 2014 |
Type | Article in Proceedings |
Conference | IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium |
MU Faculty or unit | |
Citation | |
Web | http://dl.acm.org/citation.cfm?id=2628216 |
Doi | http://dx.doi.org/10.1145/2628194.2628216 |
Field | Informatics |
Keywords | web mining;categorization of web pages;machine learning;landmarking |
Description | This paper introduces a new method for fast prototyping of a web page categorization tool based on Random Forests. The contribution of this work is threefold. First, we describe a fast feature extraction method. Second, we introduce a system that enables a user to perform experiments manually and visualize the results via a visual analytics module. Third, we address how to perform experiments efficiently, using an approach partially inspired by landmarking, which limits the number of experiments required. This method has been used to build a new commercial system for web categorization that significantly outperforms the system already in use. |