Unlocking the Complexity of Legal Language: Legal Language Corpus Construction

Glogar, Ondřej

Publication details

Authors	GLOGAR Ondřej
Year of publication	2024
Type	Appeared in Conference without Proceedings
MU Faculty or unit	Faculty of Law
Description	While the significance of legal language is well recognized, research on it often lacks empirical data and comprehensive coverage. Legal theorists predominantly rely on their personal linguistic experience, which underscores the need for more robust methodologies (as noted, for instance, by Mouritsen, 2017). In this context, corpus linguistics proves invaluable for the analysis of legal language. Language corpora comprise large bodies of linguistic data that can be searched by software, which makes it easy to test linguistic hypotheses. If we want to investigate actual linguistic reality, i.e., how (legal) language is used in practice and in everyday life, such a tool is essential. With it, we can examine how law relates to language and, in turn, reach a deeper understanding of law itself. This paper responds to the need for more empirically grounded research on legal language (and, by extension, legal research in general) and shows, through a practical example, how corpus linguistics can be used in legal scholarship. I introduce my own corpus of legal Czech and, in particular, the process of its creation. Among other things, this corpus is intended to fill a gap in legal corpus linguistics: existing legal corpora tend to focus narrowly on specific genres, such as case law or statutes, rather than providing a comprehensive picture. The main goal of the corpus I am developing is therefore to be balanced and comprehensive, including representatives of all genres of legal language (cf. Tiersma, 2000). Drawing on insights from the applied linguistics literature (e.g., Meyer, 2002), I carefully analyse the criteria for sample collection and segmentation and adapt them to the specific demands of legal language. The resulting corpus aims to encompass a broad spectrum of Czech legal texts, ensuring representation across different legal branches, speakers, and genres.
In describing the corpus creation process in detail, I explain how the methodology contributes to a more robust understanding of legal language. By presenting the rationale behind the corpus design and the adaptations made to account for the specifics of legal language, this paper contributes to the ongoing debate on the role of language in law.
Related projects:	Current Issues in Theory and History of Law
