Desarrollo de un modelo de traducción automática neuronal optimizado con BERT para la traducción del idioma quechua al español

Beatrice Cueva Medina, Gabriel Fabrizio Tuco Casquino, José Sulla-Torres

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

Abstract

Quechua, a Native American language spoken by over 3 million people in Peru, plays a significant cultural role but is at risk of decline due to limited resources and the dominance of Spanish. This paper proposes a Quechua-to-Spanish neural machine translation (NMT) model using a Transformer-based architecture and a semi-supervised approach known as LMfusion. The model is trained on parallel datasets, and PRPE morphological segmentation is employed during preprocessing. Initial results show promise, and integrating the QuBERT language model is expected to enhance translation quality. Additionally, a user-friendly web interface has been developed to facilitate Quechua-Spanish translation. This research aims to address the challenges of translating a low-resource language like Quechua and contribute to improved communication between Quechua and Spanish speakers, preserving cultural heritage and facilitating equitable access to information and services.

Translated title of the contribution: Development of a neural machine translation model optimized with BERT for translation from Quechua to Spanish
Original language: Spanish
Title of host publication: Proceedings of the 22nd LACCEI International Multi-Conference for Engineering, Education and Technology
Subtitle of host publication: Sustainable Engineering for a Diverse, Equitable, and Inclusive Future at the Service of Education, Research, and Industry for a Society 5.0., LACCEI 2024
Publisher: Latin American and Caribbean Consortium of Engineering Institutions
ISBN (electronic): 9786289520781
DOI
Status: Published - 2024
Event: 22nd LACCEI International Multi-Conference for Engineering, Education and Technology, LACCEI 2024 - Hybrid, San Jose, Costa Rica
Duration: 17 Jul 2024 to 19 Jul 2024

Publication series

Name: Proceedings of the LACCEI international Multi-conference for Engineering, Education and Technology
ISSN (electronic): 2414-6390

Conference

Conference: 22nd LACCEI International Multi-Conference for Engineering, Education and Technology, LACCEI 2024
Country/Territory: Costa Rica
City: Hybrid, San Jose
Period: 17/07/24 to 19/07/24

Keywords

  • BERT
  • Low-resource Language
  • Neural Machine Translation
  • PRPE
  • Quechua
  • Transformer
