TY - GEN
T1 - Desarrollo de un modelo de traducción automática neuronal optimizado con BERT para la traducción del idioma quechua al español
AU - Cueva Medina, Beatrice
AU - Fabrizio Tuco Casquino, Gabriel
AU - Sulla-Torres, José
N1 - Publisher Copyright: © 2024 Latin American and Caribbean Consortium of Engineering Institutions. All rights reserved.
PY - 2024
Y1 - 2024
AB - Quechua, a Native American language spoken by over 3 million people in Peru, plays a significant cultural role but is at risk of decline due to limited resources and the dominance of Spanish. This paper proposes a Quechua-to-Spanish neural machine translation (NMT) model using a Transformer-based architecture and a semi-supervised approach known as LMfusion. The model is trained on parallel datasets, and PRPE morphological segmentation is employed during preprocessing. Initial results show promise, and integrating the QuBERT language model is expected to enhance translation quality. Additionally, a user-friendly web interface has been developed to facilitate Quechua-Spanish translation. This research aims to address the challenges of translating a low-resource language like Quechua and contribute to improved communication between Quechua and Spanish speakers, preserving cultural heritage and facilitating equitable access to information and services.
KW - BERT
KW - Low-resource Language
KW - Neural Machine Translation
KW - PRPE
KW - Quechua
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85203815433&partnerID=8YFLogxK
U2 - 10.18687/LACCEI2024.1.1.1636
DO - 10.18687/LACCEI2024.1.1.1636
M3 - Conference contribution
AN - SCOPUS:85203815433
T3 - Proceedings of the LACCEI international Multi-conference for Engineering, Education and Technology
BT - Proceedings of the 22nd LACCEI International Multi-Conference for Engineering, Education and Technology
PB - Latin American and Caribbean Consortium of Engineering Institutions
T2 - 22nd LACCEI International Multi-Conference for Engineering, Education and Technology, LACCEI 2024
Y2 - 17 July 2024 through 19 July 2024
ER -