Minería de Patrones Secuenciales aplicada a la Predicción del Plegamiento de Proteínas

Translated title of the contribution: Mining of sequential patterns applied to the prediction of protein folding

J. Quintana-Zaez, Héctor R. Velarde-Bedregal, Guillermo Calderón-Ruiz, Cosme E. Santiesteban-Toca

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Sequence mining consists of finding statistically relevant patterns in collections of data that are represented sequentially. Sequences are an important type of data in which the order of the elements matters, and they have a wide range of applications in bioinformatics and computational biology. Protein structure prediction is one such application: a protein is a sequence of amino acids that forms patterns known as alpha helices, beta sheets, and turns. For the purposes of this work, these secondary structures are treated as the itemsets, and the amino acids that make up the sequence as the items. Despite multiple attempts to predict protein folding, the algorithms developed to date reach an effectiveness of only 35%. We therefore propose SPMCcm, an algorithm based on frequent-sequence prediction and a scheme of classifiers that uses the information provided by the amino acid sequence in two stages. The first stage learns the interactions between the secondary structures of proteins, which it extracts as frequent sequences (itemsets). The second stage learns the interactions between the amino acids (items) present in the interacting structures. The experimental evaluation showed that SPMCcm behaves similarly regardless of the base classifier used, reaching prediction accuracies of up to 48%, higher than the 35% reported in the literature, without requiring large computational resources and while retaining explanatory capacity.
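To make the itemset framing concrete, the following is a minimal sketch of frequent-pattern counting over ordered secondary-structure labels. It is not the SPMCcm algorithm, whose details the abstract does not give. The function name `frequent_patterns`, the one-letter labels (H for helix, E for beta strand, T for turn), the toy data, and the restriction to contiguous patterns of fixed length are all assumptions made for illustration.

```python
from collections import Counter


def frequent_patterns(sequences, length, min_support):
    """Return contiguous patterns of the given length that occur in at
    least `min_support` input sequences (counted once per sequence)."""
    counts = Counter()
    for seq in sequences:
        # Use a set so each sequence contributes at most once per pattern.
        windows = {
            tuple(seq[i:i + length])
            for i in range(len(seq) - length + 1)
        }
        counts.update(windows)
    return {p: c for p, c in counts.items() if c >= min_support}


# Hypothetical toy data: each protein reduced to its ordered
# secondary-structure elements (H = helix, E = strand, T = turn).
proteins = [
    ["H", "T", "E", "T", "E"],
    ["E", "T", "E", "H"],
    ["H", "T", "E", "T", "H"],
]

patterns = frequent_patterns(proteins, length=3, min_support=2)
for pattern, support in sorted(patterns.items()):
    print("-".join(pattern), support)
```

In a two-stage scheme like the one described, patterns of this kind could serve as candidate structure interactions that are then passed to a classifier trained on the amino acids involved. That connection is only a reading of the abstract, not a description of the authors' implementation.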

Original language: Spanish
Title of host publication: 17th LACCEI International Multi-Conference for Engineering, Education, and Technology
Subtitle of host publication: "Industry, Innovation, and Infrastructure for Sustainable Cities and Communities", LACCEI 2019
Publisher: Latin American and Caribbean Consortium of Engineering Institutions
ISBN (Electronic): 9780999344361
State: Published - 2019
Event: 17th LACCEI International Multi-Conference for Engineering, Education, and Technology, LACCEI 2019 - Montego Bay, Jamaica
Duration: 24 Jul 2019 to 26 Jul 2019

Publication series

Name: Proceedings of the LACCEI international Multi-conference for Engineering, Education and Technology
Volume: 2019-July
ISSN (Electronic): 2414-6390

Conference

Conference: 17th LACCEI International Multi-Conference for Engineering, Education, and Technology, LACCEI 2019
Country/Territory: Jamaica
City: Montego Bay
Period: 24/07/19 to 26/07/19
