Quantitative Linguistic Study of Frequency Words in Kirill of Turov’s Words (based on the NLR manuscript F.п.I.39)

Victor A. Baranov, Oleg F. Zholobov


The authors study the quantitative and statistical properties of the most frequent words in the sermons of Kirill of Turov contained in the 13th-century Tolstoy Collection (NLR, F.п.I.39).

Three experiments were carried out. First, formal distinctions were found between the list and the corresponding copies from eight contrasting sub-corpora: 11th–14th century copies of the May Menaea, other months’ Menaea, Sticheraria, Gospels, the Book of Psalms, chronicles, the Apostolos, and the Parenesis of Ephrem the Syrian; the last two appear to be the most similar to the list. Second, using the Log-Likelihood, TF*ICTF, and Weirdness statistical measures, statistically significant words were identified, and a partial overlap in the forms under study was found between the texts of Kirill and several of the sub-corpora. Third, by comparing the ranks of each form, the closeness of the Tolstoy Collection texts to sub-corpora of different genres was estimated; the original sermons of Kirill of Turov and the translations of the teaching sermons of Ephrem the Syrian and of the Apostolos proved closest to each other in terms of the statistical significance of the 15 most frequent forms.
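As an illustration of two of the measures named above, the following minimal Python sketch computes a Log-Likelihood score and a Weirdness ratio for a single word form in a target sub-corpus against a reference sub-corpus. It assumes the commonly used two-corpus formulations (Log-Likelihood from observed versus expected frequencies, Weirdness as a ratio of relative frequencies); the abstract does not state which exact variants were applied. The function names and the counts in the usage lines are hypothetical, and TF*ICTF is omitted because its formulation is not given here.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-corpus log-likelihood (G2) for one word form.

    Assumed formulation: expected frequencies are derived from the
    pooled relative frequency of the form across both corpora.
    """
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    g2 = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        # A zero observed count contributes nothing to the sum.
        if observed > 0:
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


def weirdness(freq_target, size_target, freq_ref, size_ref):
    """Ratio of the form's relative frequency in the target corpus
    to its relative frequency in the reference corpus."""
    if freq_ref == 0:
        # The form is absent from the reference corpus.
        return math.inf
    return (freq_target / size_target) / (freq_ref / size_ref)


# Hypothetical counts, for illustration only (not data from the study).
score = log_likelihood(120, 10_000, 300, 100_000)
ratio = weirdness(120, 10_000, 300, 100_000)
print(f"log-likelihood = {score:.2f}, weirdness = {ratio:.2f}")
```

A Weirdness ratio above 1 means the form is relatively more frequent in the target sub-corpus than in the reference; a larger Log-Likelihood score indicates a stronger departure from the frequency expected if the two sub-corpora used the form at the same rate.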

For the first time, the configurations of the most significant lexemes in the sub-corpora were identified. Also for the first time, the lists of these lexemes were found to be similar in the sub-corpora of Kirill of Turov’s sermons and of the Apostolos, and partially similar in those of the Parenesis, the Book of Psalms, and the chronicles. High-rank units in the sermons of Kirill of Turov (нъ, о, бо, съ) were described from the standpoint of linguistics, genre and style, and discursive pragmatics.

DOI: 10.31168/2305-6754.2020.9.1.2


quantitative linguistics; historical text corpus; medieval Slavic texts; sermon; Kirill of Turov


