Publication details
A New Czech Pipeline in Sketch Engine
Authors | |
---|---|
Year of publication | 2024 |
Type | Article in Proceedings |
Conference | Recent Advances in Slavonic Natural Language Processing, RASLAN 2024 |
Web | Conference proceedings |
Keywords | Morphological analysis, corpora annotation |
Description | This paper introduces a new Czech pipeline that is now available in Sketch Engine. It describes the tools used in the pipeline and, for some of them, details how they have been altered in recent years. The most complex part discusses adjustments to the training data for Czech (the DESAM corpus) and their effect on the accuracy of the POS tagging performed by RFTagger. |