
How formulaic are inquisition records? Some formal corpus-based measurements
Authors | |
---|---|
Year of publication | 2024 |
Type | Other conference presentations |
MU Faculty or unit | |
Citation | |
Description | In this paper, we begin to fill the gap through the analysis of a corpus of Latin-language medieval inquisition material comprising 15 different inquisition registers and amounting to ca. 1.4M tokens. We look at the repetitiveness and formulaicity of language in this corpus from two perspectives: (1) lexical diversity, and (2) the degree of text similarity detected through text reuse detection algorithms. From each perspective, we compare the 15 registers with one another, using lexical diversity measures, n-gram frequency distributions, and text reuse patterns extracted with the Passim text reuse detection tool. We conclude that text repetition varies widely among the registers, and based on our results concerning specific registers, we challenge the widespread notion that lower repetitiveness corresponds to higher historical reliability and vice versa. |
Related projects: |
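
The description mentions lexical diversity measures and n-gram frequency distributions without spelling them out. As a rough illustration only, the sketch below computes two simple stand-ins: a type-token ratio and the share of trigram occurrences that repeat. The function names, the whitespace tokenizer, and the toy Latin-like sentence are all assumptions made for this example; they are not the authors' measures or data, and the sketch does not involve Passim.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation (naive, hypothetical tokenizer)."""
    stripped = (word.strip(".,;:!?()[]\"'") for word in text.lower().split())
    return [token for token in stripped if token]


def type_token_ratio(tokens):
    """Number of distinct tokens divided by total tokens; lower values mean more repetition."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def repeated_ngram_share(tokens, n=3):
    """Share of n-gram occurrences whose n-gram appears more than once in the text."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(count for count in counts.values() if count > 1)
    return repeated / len(ngrams)


# Invented toy sentence, not taken from any inquisition register.
sample = "Item dixit quod vidit eos. Item dixit quod audivit eos."
tokens = tokenize(sample)
print(type_token_ratio(tokens))
print(repeated_ngram_share(tokens, n=3))
```

The plain type-token ratio falls as texts get longer, so comparing registers of unequal size would need a length-normalized variant, for example averaging the ratio over fixed-size windows.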