Classification of Primary Medical Records with RUBRYX-2: First Experience

Publication details

Authors	KRAUROVA Olga, ALEXANDROV Mikhail, BOUREK Aleš
Year of publication	2012
Type	Chapter of a book
MU Faculty or unit	Faculty of Medicine
Description	RUBRYX is a document classifier developed in the 2000s for processing large volumes of Web information. It uses a weighted sum of n-grams (n = 1, 2, 3) extracted from a very small number of samples (about 5-10) and takes into account their mutual position in a given text. This sophisticated algorithm proves to be very effective in classifying primary medical records presented in free-text form. In this paper we study the capabilities of RUBRYX (version 2.2) on a limited set of documents in Spanish. These documents are medical histories related to stomach diseases, an area that should be considered a narrow subset of medical records. The high quality of the achieved results (accuracy of 80%-90%) allows us to recommend RUBRYX for similar applications.
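
The weighted n-gram idea mentioned in the description can be illustrated with a minimal sketch. This is not the RUBRYX algorithm, whose details are not given here: the weights, the function names (`ngrams`, `build_profile`, `score`, `classify`), and the toy sample texts are all assumptions made for illustration, and the sketch ignores the mutual position of n-grams beyond word adjacency.

```python
from collections import Counter

# Hypothetical weights per n-gram length (longer matches count more).
# These values are illustrative and are not taken from RUBRYX.
NGRAM_WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0}


def ngrams(text, n):
    """Return the list of word n-grams of length n in text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_profile(samples):
    """Collect n-grams (n = 1, 2, 3) from a handful of sample documents."""
    profile = Counter()
    for sample in samples:
        for n in NGRAM_WEIGHTS:
            profile.update(ngrams(sample, n))
    return profile


def score(text, profile):
    """Weighted sum over the text's n-grams that also occur in the profile."""
    total = 0.0
    for n, weight in NGRAM_WEIGHTS.items():
        for gram in ngrams(text, n):
            if gram in profile:
                total += weight
    return total


def classify(text, profiles):
    """Return the label whose profile gives the highest score."""
    return max(profiles, key=lambda label: score(text, profiles[label]))


# Toy, invented sample texts; real use would take about 5-10 records per class.
profiles = {
    "a": build_profile(["stomach pain after meals"]),
    "b": build_profile(["persistent dry cough"]),
}
label = classify("pain after meals", profiles)
```

In this sketch each class is represented by the n-grams of its few sample documents, and a new record is assigned to the class with the highest weighted overlap.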
