Osong Public Health Res Perspect. 2015 Apr;6(2):112-120. doi: 10.1016/j.phrp.2015.01.006.

Application of Gap-Constraints Given Sequential Frequent Pattern Mining for Protein Function Prediction

Affiliations
  • 1Database/Bioinformatics Laboratory, College of Electrical and Computer Engineering Chungbuk National University, Cheongju, Korea
  • 2Syntekabio Incorporated, Korea Institute of Science and Technology, Seoul, Korea
  • 3Graduate School of Health Science Business Convergence, Chungbuk National University, Cheongju, Korea
  • 4Medical Informatics∙Engineering, Korea National University of Transportation, Cheongju, Korea

Abstract


Objectives
Predicting protein function from the protein–protein interaction network is challenging because of the complexity and huge scale of the protein interaction process and its inconsistent patterns. Previously proposed methods such as neighbor counting, network analysis, and graph pattern mining have predicted functions by calculating the rules and probabilities of patterns inside the network. Although these methods have shown good prediction performance, it remains difficult to find functions that are exceptions to simple rules and patterns, because these methods do not consider the inconsistent aspect of the interaction network.
Methods
In this article, we propose a novel approach using the sequential pattern mining method with gap-constraints. To overcome the inconsistency problem, we suggest frequent functional patterns that include every possible functional sequence, including patterns whose search is otherwise limited by the structure of connection or the level of the neighborhood layer. We also constructed a tree-graph with the most crucial interaction information of the target protein, and generated candidate sets to assign by sequential pattern mining allowing gaps.
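To make the idea of mining with gap-constraints concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each path through the tree-graph has already been flattened into a tuple of function labels, and it treats a "gap" as the number of items skipped between two consecutive matched items. The function names (`gapped_subsequences`, `mine_frequent_patterns`) and the toy labels are hypothetical; the three parameters mirror the pattern length, maximum gap, and minimum support mentioned in the abstract.

```python
from collections import Counter


def gapped_subsequences(sequence, max_length, max_gap):
    """Collect every distinct pattern of length 1..max_length that occurs in
    `sequence` with at most `max_gap` items skipped between consecutive
    matched items."""
    found = set()

    def grow(prefix, last_pos):
        found.add(prefix)
        if len(prefix) == max_length:
            return
        # The next item may sit anywhere from directly after last_pos
        # (gap 0) up to max_gap positions further along.
        stop = min(len(sequence), last_pos + max_gap + 2)
        for nxt in range(last_pos + 1, stop):
            grow(prefix + (sequence[nxt],), nxt)

    for start in range(len(sequence)):
        grow((sequence[start],), start)
    return found


def mine_frequent_patterns(sequences, max_length, max_gap, min_support):
    """Return {pattern: support} for patterns whose support (the fraction of
    sequences containing the pattern) is at least `min_support`."""
    counts = Counter()
    for seq in sequences:
        # Each sequence contributes at most once per pattern.
        counts.update(gapped_subsequences(seq, max_length, max_gap))
    total = len(sequences)
    return {p: c / total for p, c in counts.items() if c >= min_support * total}


# Toy function-label sequences (hypothetical, for illustration only).
paths = [("A", "B", "C"), ("A", "X", "C"), ("A", "C")]
with_gap = mine_frequent_patterns(paths, max_length=2, max_gap=1, min_support=1.0)
no_gap = mine_frequent_patterns(paths, max_length=2, max_gap=0, min_support=1.0)
```

In this toy input, the pattern `("A", "C")` is found in all three sequences only when one skipped item is allowed; with `max_gap=0` it occurs in just one sequence and falls below the support threshold. This brute-force enumeration is exponential in the pattern length and is meant only to illustrate the constraint, not to scale to a real interaction network.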
Results
The parameters of pattern length, maximum gaps, and minimum support were varied to find the best setting for the most accurate prediction. The highest accuracy rate was 0.972, which was better than the results of the simple neighbor counting approach and the link-based approach.
Conclusion
Comparison of the results with those of other approaches confirmed that the proposed approach could reach more function candidates that previous methods could not obtain.

Keywords

frequent pattern mining with gap-constraint; graph pattern mining; protein function prediction; protein–protein interaction network; sequential pattern mining