Data Gathered with Automatic Tools from European Parliamentary Chambers
Authors | |
---|---|
Year of publication | 2023 |
Type | Article in Proceedings |
Conference | Recent Advances in Slavonic Natural Language Processing, RASLAN 2023 |
MU Faculty or unit | |
Citation | |
Web | Article in proceedings |
Keywords | parliamentary protocols, continuous downloading, corpus processing, automatic tools, corpus development, automatic maintenance of tools |
Attached files | |
Description | This paper reflects on the set of tools developed in my bachelor’s thesis, titled “Continuous Automatic Development of European Parliamentary Corpora.” Although numerous corpora already offer speeches from the parliaments of the European Union, the toolset is designed to gather and build such corpora with minimal human intervention. Drawing on nine months of practical use, this paper presents insights into the challenges faced and their respective solutions, providing an overview of developments since the initial release of the toolset. |