Extension of a standard balanced linguistic corpus built according to spaCy rules by connotative characteristics
Gorozhanov Alexey Ivanovich
Moscow State Linguistic University
Submitted: 11.10.2023
Abstract. The aim of the research is to develop a technology for automatically determining the sentiment of a text on the basis of the author’s existing software package. The scientific novelty lies in the fact that the work proposes a structural and functional model of a fully automated process for assessing the sentiment of a text in conjunction with an analysis of its morphological characteristics; the technical terms “connotative amplitude” and “connotative density” are also introduced for the first time. In the course of the study, a database model that accommodates connotative numeric parameters was built; program code was then written for an “add-on” to the database generator, which makes it possible to supplement the standard database with these parameters; finally, the technology was tested on three novels by F. Kafka (“The Castle”, “The Trial” and “America”) and two novels by E. M. Remarque (“All Quiet on the Western Front” and “Flotsam”) in German. As a result, it is shown that the “add-on” is a high-quality software product that does not cause technical failures and can provide researchers with a full set of connotative data for subsequent comprehensive interpretation of the text, provided that the input tone dictionary is of high quality.
Key words and phrases: corpus linguistics, balanced corpus, sentiment of a text, connotation, German language