How formulaic are inquisition records? Some formal corpus-based measurements

Publication details

Authors	ZBÍRAL David, KOTZÉ Gideon, SHAW Robert Laurence John
Year of publication	2024
Type	Other conference presentation
MU Faculty or unit	Faculty of Arts
Citation
Description	In this paper, we begin to fill the gap through the analysis of a corpus of Latin-language medieval inquisition material comprising 15 different inquisition registers and amounting to ca. 1.4M tokens. We examine the repetitiveness and formulaicity of language in this corpus from two perspectives: (1) lexical diversity, and (2) the degree of text similarity identified by text reuse detection algorithms. From each perspective, we compare the 15 registers with one another, using lexical diversity measures, n-gram frequency distributions, and text reuse patterns extracted with the Passim text reuse detection tool. We conclude that the registers vary widely in text repetition, and, based on our results for specific registers, we challenge the widespread notion that lower repetitiveness corresponds to higher historical reliability and vice versa.
Related projects:	Networks of Dissent: Computational Modelling of Dissident and Inquisitorial Cultures in Medieval Europe
