AUTOMATIC SPEECH RECOGNITION OF SPANISH: SOCIOLINGUISTIC FACTORS OF ACCURACY AND ERROR TYPOLOGY

Authors

  • Yuliia Tarasenko, Master's student, Department of Romanic Philology, Taras Shevchenko National University of Kyiv (Ministry of Education and Science of Ukraine), 14 Taras Shevchenko Blvd., Kyiv, Ukraine, 01601

DOI:

https://doi.org/10.17721/2663-6530.2025.48.11

Keywords:

automatic speech recognition, Spanish, ASR, WER, CER, accent, sociolinguistics

Abstract

The article evaluates Automatic Speech Recognition (ASR) for Spanish on a corpus of 304 audio recordings from speakers of different ages, genders, and accents. The study aims to measure the accuracy of Google Speech-to-Text, identify its common errors, and determine how sociolinguistic factors affect transcription quality. The analysis uses word error rate (WER) and character error rate (CER), together with counts of substitutions, deletions, and insertions. The average accuracy was 94.7%, and substitutions were the predominant error type. Accuracy was highest for speakers with a northern peninsular accent and lowest for teenagers and speakers of Argentinian Spanish. The practical value of the study lies in the potential to improve ASR models by accounting for speakers' dialectal and social characteristics.
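
For readers unfamiliar with the metrics named in the abstract, the sketch below shows how WER, CER, and the substitution, deletion, and insertion counts are conventionally derived from a Levenshtein alignment between a reference transcript and an ASR hypothesis. It is an illustration only and not the author's evaluation code: the function names (`edit_ops`, `error_rate`, `wer`, `cer`), the whitespace tokenization, the choice to count spaces as characters in CER, and the sample sentences are all assumptions made for this example.

```python
def edit_ops(ref, hyp):
    """Count (substitutions, deletions, insertions) along one minimal
    edit path that turns the sequence `ref` into the sequence `hyp`."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = Levenshtein distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (item missing from hyp)
                dist[i][j - 1] + 1,         # insertion (extra item in hyp)
            )

    # Walk back from the bottom-right corner to classify each edit.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = int(i > 0 and j > 0 and ref[i - 1] != hyp[j - 1])
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + mismatch:
            subs += mismatch
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins


def error_rate(ref, hyp):
    """(S + D + I) / N, where N is the length of the reference."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    subs, dels, ins = edit_ops(ref, hyp)
    return (subs + dels + ins) / len(ref)


def wer(reference, hypothesis):
    """Word error rate; tokens are split on whitespace (an assumption)."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate; spaces count as characters (an assumption)."""
    return error_rate(list(reference), list(hypothesis))


if __name__ == "__main__":
    # Hypothetical sentence pair, not taken from the study's corpus.
    reference = "el perro come carne"
    hypothesis = "el pero come la carne"
    print("S, D, I:", edit_ops(reference.split(), hypothesis.split()))
    print("WER:", wer(reference, hypothesis))
    print("CER:", round(cer(reference, hypothesis), 3))
```

In practice, transcripts are usually normalized (case, punctuation, numerals) before scoring, and normalization choices can change the reported figures noticeably. The abstract does not state how its accuracy figure relates to WER or CER, so the sketch makes no assumption about that.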

Published

2025-10-07

Section

Articles

How to Cite

Tarasenko, Y. (2025). Automatic speech recognition of Spanish: Sociolinguistic factors of accuracy and error typology. Problems of Semantics, Pragmatics and Cognitive Linguistics, 1(48), 142–150. https://doi.org/10.17721/2663-6530.2025.48.11