Desarrollo de un modelo de traducción automática neuronal optimizado con BERT para la traducción del idioma quechua al español

Translated title of the contribution: Development of a neural machine translation model optimized with BERT for translation from Quechua to Spanish

Beatrice Cueva Medina, Gabriel Fabrizio Tuco Casquino, José Sulla-Torres

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Quechua, a Native American language spoken by over 3 million people in Peru, plays a significant cultural role but is at risk of decline due to limited resources and the dominance of Spanish. This paper proposes a Quechua-to-Spanish neural machine translation (NMT) model using a Transformer-based architecture and a semi-supervised approach known as LMfusion. The model is trained on parallel datasets, and PRPE morphological segmentation is employed during preprocessing. Initial results show promise, and integrating the QuBERT language model is expected to enhance translation quality. Additionally, a user-friendly web interface has been developed to facilitate Quechua-Spanish translation. This research aims to address the challenges of translating a low-resource language like Quechua and contribute to improved communication between Quechua and Spanish speakers, preserving cultural heritage and facilitating equitable access to information and services.

Original language: Spanish
Title of host publication: Proceedings of the 22nd LACCEI International Multi-Conference for Engineering, Education and Technology
Subtitle of host publication: Sustainable Engineering for a Diverse, Equitable, and Inclusive Future at the Service of Education, Research, and Industry for a Society 5.0., LACCEI 2024
Publisher: Latin American and Caribbean Consortium of Engineering Institutions
ISBN (Electronic): 9786289520781
State: Published - 2024
Event: 22nd LACCEI International Multi-Conference for Engineering, Education and Technology, LACCEI 2024 - Hybrid, San Jose, Costa Rica
Duration: 17 Jul 2024 to 19 Jul 2024

Publication series

Name: Proceedings of the LACCEI international Multi-conference for Engineering, Education and Technology
ISSN (Electronic): 2414-6390

Conference

Conference: 22nd LACCEI International Multi-Conference for Engineering, Education and Technology, LACCEI 2024
Country/Territory: Costa Rica
City: Hybrid, San Jose
Period: 17/07/24 to 19/07/24
