Publication information

Creating an Annotated Health Record Dataset in a Limited-Resource Environment.

Authors

ANETTA Krištof

Year of publication 2023
Type Article in proceedings
Conference Proceedings of the Seventeenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2023
MU Faculty or unit

Faculty of Informatics

Citation
www https://nlp.fi.muni.cz/raslan/2023/paper11.pdf
Keywords Electronic health records; EHR; annotation; named entity recognition; NER; medical concept mining
Description This paper demonstrates a workflow for creating a dataset of annotated electronic health records in an environment that is limited in terms of both language resources and expert availability. It covers preannotation using rule-based methods, the redundancy of multiple annotators per document with the resulting degree of confidence for each annotation, and possible avenues of data augmentation for training large language models. Throughout, it discusses the practical considerations of how to make the best of the resource-strapped situation shared by so many researchers who analyze health records.