Publication information
Topic Modelling of the Czech Supreme Court Decisions
Authors | |
---|---|
Year of publication | 2020 |
Type | Other conference presentations |
MU Faculty / Unit | |
Citation | |
Attached files | |
Description | The Czech Supreme Court produces several thousand decisions per year. These decisions are published in full text, almost unprocessed, with only minimal basic metadata (date of the decision, docket number). This makes case law research very time-consuming. New automatic methods of processing court decisions are therefore needed to retrieve relevant case law more efficiently. Topic modelling methods have the potential to cluster large numbers of documents automatically or to provide new categories of relevant metadata for them. In this paper, two topic modelling methods, latent Dirichlet allocation and non-negative matrix factorization, are applied to a corpus of Czech Supreme Court decisions. Several models are trained for each method and compared by their coherence scores in order to find the best number of topics. The authors then perform a manual qualitative analysis of the most coherent models. |
Related projects: |
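
The model-selection step in the description (train several models, score each by coherence, keep the best number of topics) can be illustrated with a minimal sketch. The description does not say which coherence measure the authors used; the code below assumes UMass coherence, computed from document co-occurrence counts. Training of the LDA or NMF models is not shown. The tokenized documents and the candidate topic word lists are hypothetical toy data, and the function names are illustrative only.

```python
import math


def umass_coherence(top_words, doc_sets):
    """UMass coherence of one topic (assumed measure, not confirmed by the source).

    Sums log((D(w_i, w_j) + 1) / D(w_j)) over word pairs with j < i,
    where D counts the documents containing the given word(s).
    """
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            w_i, w_j = top_words[i], top_words[j]
            d_j = sum(1 for d in doc_sets if w_j in d)
            if d_j == 0:
                raise ValueError(f"word {w_j!r} does not occur in the corpus")
            d_ij = sum(1 for d in doc_sets if w_i in d and w_j in d)
            score += math.log((d_ij + 1) / d_j)
    return score


def mean_coherence(topics, documents):
    """Average UMass coherence over all topics of one model."""
    doc_sets = [set(doc) for doc in documents]
    return sum(umass_coherence(t, doc_sets) for t in topics) / len(topics)


def best_number_of_topics(candidates, documents):
    """Return the topic count with the highest mean coherence, plus all scores."""
    scores = {k: mean_coherence(topics, documents) for k, topics in candidates.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus: each document is a list of tokens.
documents = [
    ["appeal", "damages", "contract"],
    ["appeal", "damages", "liability"],
    ["custody", "child", "parent"],
    ["custody", "child", "maintenance"],
]

# Hypothetical top words per topic for two candidate models (2 and 3 topics).
candidates = {
    2: [["appeal", "damages"], ["custody", "child"]],
    3: [["appeal", "child"], ["damages", "custody"], ["contract", "parent"]],
}

best_k, scores = best_number_of_topics(candidates, documents)
print(best_k, scores)
```

In practice the candidate word lists would come from the trained models themselves, one list of top-ranked words per topic, and a library implementation of coherence could replace the hand-written function.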