Geoinformation Web Resource “The Dialect Corpus of the Buryat Language”
Rinchinov Oleg Sergeevich, Abaeva Iuliia Dogorzhapovna
Federal State Budgetary Institution of Science "Institute for Mongolian, Buddhist and Tibetan Studies of the Siberian Branch of the Russian Academy of Sciences"
Submitted: 12.12.2022
Abstract. The study aims to provide open access to structured, annotated audio data from the dialect corpus of the Buryat language. Because territory is one of the leading criteria in the classification of Buryat dialects, the corpus is presented on the Web as a geoinformation system in which the data are bound to a digital map. First, the programme of the speech corpus was compiled, and recordings were obtained from informants who are speakers of the dialects. The recorded material was segmented and annotated in ELAN. Next, a program was developed to transfer data from ELAN files to a relational database. To present the data on the Internet, a web application was developed as an interactive digital map based on Google Maps Platform. The resulting web resource gives users access to annotated, structured audio dialect data arranged by geographic location. The scientific novelty lies in bringing into scientific and public use materials of a fundamentally new type, which make it possible to learn how Buryat dialects sound today and to conduct research on modern Buryat dialect speech.
Key words and phrases: sound corpus, Buryat language, dialect, annotation, geoinformation system
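The step of transferring ELAN annotations into a relational database can be illustrated with a minimal sketch. It is not the authors' program: the abstract does not describe the database schema or the implementation language, so the `annotation` table, the `load_eaf` function, the use of SQLite, and the sample tier and segment values are all assumptions made for illustration. The sketch relies only on the general structure of ELAN's XML-based .eaf files, in which TIME_SLOT elements carry millisecond offsets and TIER elements contain ALIGNABLE_ANNOTATION elements that refer to those slots.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced .eaf content (placeholder values, not corpus data).
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1200"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="2500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>segment 1</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>segment 2</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def load_eaf(eaf_xml: str, conn: sqlite3.Connection, recording: str) -> int:
    """Store the time-aligned annotations of one .eaf document; return the row count."""
    root = ET.fromstring(eaf_xml)

    # Map each time slot id to its offset in milliseconds.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }

    # Assumed schema: one row per annotated segment.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotation ("
        "recording TEXT, tier TEXT, start_ms INTEGER, end_ms INTEGER, value TEXT)"
    )

    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue  # skip segments without resolved time alignment
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((recording, tier.get("TIER_ID"), start, end, value))

    conn.executemany("INSERT INTO annotation VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load_eaf(SAMPLE_EAF, connection, "demo.wav"), "annotations stored")
```

In a geoinformation setting, each recording would presumably also be linked to a table of settlements with coordinates, so that segments can be retrieved by map location; that link is omitted here because the abstract gives no details about it.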