Reading Tools

Indexing metadata

Digitizing Cyrillic Manuscripts for the Historical Dictionary of the Serbian Language Using Handwritten Text Recognition Technology


Dublin Core		PKP Metadata Items	Metadata for this Document

1.	Title	Title of document	Digitizing Cyrillic Manuscripts for the Historical Dictionary of the Serbian Language Using Handwritten Text Recognition Technology

2.	Creator	Author's name, affiliation, country	Vladimir Polomac; University of Kragujevac, Jovana Cvijića bb, 34000 Kragujevac; Serbia

2.	Creator	Author's name, affiliation, country	Marina Kurešević; University of Novi Sad, Zorana Đinđića 2, 21 000 Novi Sad; Serbia

2.	Creator	Author's name, affiliation, country	Isidora Bjelaković; University of Novi Sad, Zorana Đinđića 2, 21 000 Novi Sad; Serbia

2.	Creator	Author's name, affiliation, country	Aleksandra Colić Jovanović; University of Novi Sad, Zorana Đinđića 2, 21 000 Novi Sad; Serbia

2.	Creator	Author's name, affiliation, country	Sanja Petrović; University of Novi Sad, Zorana Đinđića 2, 21 000 Novi Sad; Serbia

3.	Subject	Discipline(s)	linguistics; lexicography; digital humanities

3.	Subject	Keyword(s)	Transkribus; automatic text recognition; artificial intelligence; machine learning; historical lexicography; Serbian language; Gavril Stefanović Venclović

4.	Description	Abstract	The paper explores the use of information technologies based on machine learning and artificial intelligence in digitizing Cyrillic manuscripts for the purpose of creating a historical dictionary of the Serbian language. The empirical research uses the Transkribus software platform to create a model for automatic text recognition of the manuscripts of Gavril Stefanović Venclović, the most significant and prolific Serbian cultural enthusiast of the 18th century, whose extensive manuscript legacy in the Serbian vernacular is the most significant primary source for the historical dictionary of the Serbian language of this period. The results show that digitizing Cyrillic manuscripts for this purpose can be significantly accelerated by using Transkribus to create specific and generic models for automatic text recognition. The advantage of automatic text recognition over traditional methods lies particularly in the possibility of continuously improving the performance of specific and generic models as transcription progresses and as the amount of digitized text available for training a new version of the model grows. DOI: 10.31168/2305-6754.2023.1.08

5.	Publisher	Organizing agency, location

6.	Contributor	Sponsor(s)	The paper was financed by the Ministry of Education, Science and Technological Development of the Republic of Serbia and German Academic Exchange Service (DAAD)

7.	Date	(YYYY-MM-DD)	2023-10-19

8.	Type	Status & genre	Peer-reviewed Article

8.	Type	Type

9.	Format	File format	PDF

10.	Identifier	Uniform Resource Identifier	https://slovene.ru/ojs/index.php/slovene/article/view/607

11.	Source	Title; vol., no. (year)	Slověne = Словѣне. International Journal of Slavic Studies; Vol 12, No 1 (2023)

12.	Language	English=en	en

13.	Relation	Supp. Files

14.	Coverage	Geo-spatial location, chronological period, research sample (gender, age, etc.)	Serbia

15.	Rights	Copyright and permissions	Copyright (c) 2023 Vladimir Polomac, Marina Kurešević, Isidora Bjelaković, Aleksandra Colić Jovanović, Sanja Petrović. This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
