Publication details
Rapid prototyping of a web categorization tool
Authors | |
---|---|
Year of publication | 2014 |
Type | Article in proceedings |
Conference | IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium |
MU Faculty or unit | |
Citation | |
Web | http://dl.acm.org/citation.cfm?id=2628216 |
DOI | http://dx.doi.org/10.1145/2628194.2628216 |
Field | Informatics |
Keywords | web mining; categorization of web pages; machine learning; landmarking |
Description | This paper introduces a new method for rapid prototyping of a web page categorization tool based on Random Forests. The contribution is threefold. First, we describe a fast feature extraction method. Second, we introduce a system that lets a user run experiments manually and inspect the results through a visual analytics module. Third, we address how to run experiments efficiently, using an approach partially inspired by landmarking, which limits the number of experiments required. The method has been used to build a new commercial web categorization system that significantly outperforms the system previously in use. |