Publication details
A New Czech Pipeline in Sketch Engine
Authors | |
---|---|
Year of publication | 2024 |
Type | Article in proceedings |
Conference | Recent Advances in Slavonic Natural Language Processing, RASLAN 2024 |
MU Faculty or unit | |
Citation | |
Web | Conference proceedings |
Keywords | Morphological analysis, corpora annotation |
Attached files | |
Description | This paper introduces a new Czech pipeline that is now available in Sketch Engine. It describes the tools used in the pipeline and, for some of them, details how they have been altered in recent years. The most complex part discusses adjustments to the training data used for Czech, the DESAM corpus, and their effect on the accuracy of POS tagging performed by RFTagger. |