Publication details

A New Czech Pipeline in Sketch Engine

Authors

OHLÍDALOVÁ Vlasta, JAKUBÍČEK Miloš

Year of publication 2024
Type Article in Proceedings
Conference Recent Advances in Slavonic Natural Language Processing, RASLAN 2024
MU Faculty or unit

Faculty of Informatics

Citation
web Conference proceedings
Keywords Morphological analysis, corpora annotation
Description This paper introduces a new Czech pipeline that is now available in Sketch Engine. It describes the tools used in the pipeline and, for some of them, details how they have been altered in recent years. The most extensive part discusses adjustments to the training data used for Czech, the DESAM corpus, and their effect on the accuracy of the POS tagging performed by RFTagger.
