On Certain Cases in the History of Native American Languages as an Example for Developing Automatic Recognition of Agglutinative Languages
Keywords: agglutinative languages, automatic speech recognition, morphological analysis, Native American languages, natural language processing, low-resource, neural networks

Abstract
This paper examines characteristics of agglutinative languages using Native American languages (Na-Dene, Uto-Aztecan, and Quechuan families) as examples, and analyzes the challenges they pose for natural language processing (NLP) tasks, particularly automatic speech recognition (ASR) and morphological analysis. The historical development and structural diversity of these languages make them a valuable testbed for developing and evaluating algorithms robust to high morphological complexity. The study summarizes a methodology that combines statistical and neural approaches under limited language resources, discusses the obtained results, and outlines directions for future research.
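The abstract mentions combining statistical and neural approaches under limited language resources. One standard statistical building block for morphologically complex languages is unsupervised subword segmentation, which keeps the vocabulary tractable when a single stem yields many inflected word forms. The sketch below is a minimal byte-pair-encoding (BPE) learner applied to toy Quechua-style forms built on the stem *wasi* "house" (illustrative examples only, not data or methodology from the paper):

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge operations from a list of word tokens."""
    # Represent each word as a sequence of characters plus an end marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for seq, freq in vocab.items():
            for a, b in zip(seq, seq[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with a merged symbol.
        new_vocab = Counter()
        for seq, freq in vocab.items():
            out, i = [], 0
            while i < len(seq):
                if i < len(seq) - 1 and (seq[i], seq[i + 1]) == best:
                    out.append(seq[i] + seq[i + 1])
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab

# Toy Quechua-inspired paradigm: wasi "house", -kuna (plural),
# -pi (locative), -y (1sg possessive).
words = ["wasi", "wasikuna", "wasikunapi", "wasipi", "wasiy"]
merges, vocab = learn_bpe(words, 4)
# After a few merges the shared stem "wasi" emerges as a single subword
# unit, so each surface form is stem + suffix pieces rather than an
# unanalyzed whole-word vocabulary entry.
print(merges)
print(list(vocab))
```

Production systems would use an established toolkit rather than this sketch, but the mechanism is the same: frequent character sequences (which in agglutinative languages tend to align with stems and productive affixes) are promoted to vocabulary units, bounding the open-ended word-form inventory that defeats whole-word language models.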
License
Copyright (c) 2026 О.С. Атыкенов, А.Б. Бакасова

This work is licensed under a Creative Commons Attribution 4.0 International License.
