Pattern Recognition in Computational Molecular Biology : Techniques and Approaches
Pattern Recognition in Computational Molecular Biology : Techniques and Approaches
Click to enlarge
Author(s): Elloumi, Mourad
Iliopoulos, Costas
Wang, Jason T. L.
Zomaya, Albert Y.
ISBN No.: 9781119078845
Pages: 656
Year: 201512
Format: E-Book
Price: $ 201.11
Status: Out Of Print

LIST OF CONTRIBUTORS xxi PREFACE xxvii I PATTERN RECOGNITION IN SEQUENCES 1 1 COMBINATORIAL HAPLOTYPING PROBLEMS 3 Giuseppe Lancia 1.1 Introduction / 3 1.2 Single Individual Haplotyping / 5 1.2.1 The Minimum Error Correction Model / 8 1.2.2 Probabilistic Approaches and Alternative Models / 10 1.3 Population Haplotyping / 12 1.


3.1 Clark''s Rule / 14 1.3.2 Pure Parsimony / 15 1.3.3 Perfect Phylogeny / 19 1.3.4 Disease Association / 21 1.


3.5 Other Models / 22 References / 23 2 ALGORITHMIC PERSPECTIVES OF THE STRING BARCODING PROBLEMS 28 Sima Behpour and Bhaskar DasGupta 2.1 Introduction / 28 2.2 Summary of Algorithmic Complexity Results for Barcoding Problems / 32 2.2.1 Average Length of Optimal Barcodes / 33 2.3 Entropy-Based Information Content Technique for Designing Approximation Algorithms for String Barcoding Problems / 34 2.4 Techniques for Proving Inapproximability Results for String Barcoding Problems / 36 2.


4.1 Reductions from Set Covering Problem / 36 2.4.2 Reduction from Graph-Coloring Problem / 38 2.5 Heuristic Algorithms for String Barcoding Problems / 39 2.5.1 Entropy-Based Method with a Different Measure for Information Content / 39 2.5.


2 Balanced Partitioning Approach / 40 2.6 Conclusion / 40 Acknowledgments / 41 References / 41 3 ALIGNMENT-FREE MEASURES FOR WHOLE-GENOME COMPARISON 43 Matteo Comin and Davide Verzotto 3.1 Introduction / 43 3.2 Whole-Genome Sequence Analysis / 44 3.2.1 Background on Whole-Genome Comparison / 44 3.2.2 Alignment-Free Methods / 45 3.


2.3 Average Common Subword / 46 3.2.4 Kullback-Leibler Information Divergence / 47 3.3 Underlying Approach / 47 3.3.1 Irredundant Common Subwords / 48 3.3.


2 Underlying Subwords / 49 3.3.3 Efficient Computation of Underlying Subwords / 50 3.3.4 Extension to Inversions and Complements / 53 3.3.5 A Distance-Like Measure Based on Underlying Subwords / 53 3.4 Experimental Results / 54 3.


4.1 Genome Data sets and Reference Taxonomies / 54 3.4.2 Whole-Genome Phylogeny Reconstruction / 56 3.5 Conclusion / 61 Author''s Contributions / 62 Acknowledgments / 62 References / 62 4 A MAXIMUM LIKELIHOOD FRAMEWORK FOR MULTIPLE SEQUENCE LOCAL ALIGNMENT 65 Chengpeng Bi 4.1 Introduction / 65 4.2 Multiple Sequence Local Alignment / 67 4.2.


1 Overall Objective Function / 67 4.2.2 Maximum Likelihood Model / 68 4.3 Motif Finding Algorithms / 70 4.3.1 DEM Motif Algorithm / 70 4.3.2 WEM Motif Finding Algorithm / 70 4.


3.3 Metropolis Motif Finding Algorithm / 72 4.3.4 Gibbs Motif Finding Algorithm / 73 4.3.5 Pseudo-Gibbs Motif Finding Algorithm / 74 4.4 Time Complexity / 75 4.5 Case Studies / 75 4.


5.1 Performance Evaluation / 76 4.5.2 CRP Binding Sites / 76 4.5.3 Multiple Motifs in Helix-Turn-Helix Protein Structure / 78 4.6 Conclusion / 80 References / 81 5 GLOBAL SEQUENCE ALIGNMENT WITH A BOUNDED NUMBER OF GAPS 83 Carl Barton, Tomás Flouri, Costas S. Iliopoulos, and Solon P.


Pissis 5.1 Introduction / 83 5.2 Definitions and Notation / 85 5.3 Problem Definition / 87 5.4 Algorithms / 88 5.5 Conclusion / 94 References / 95 II PATTERN RECOGNITION IN SECONDARY STRUCTURES 97 6 A SHORT REVIEW ON PROTEIN SECONDARY STRUCTURE PREDICTION METHODS 99 Renxiang Yan, Jiangning Song, Weiwen Cai, and Ziding Zhang 6.1 Introduction / 99 6.2 Representative Protein Secondary Structure Prediction Methods / 102 6.


2.1 Chou-Fasman / 103 6.2.2 GOR / 104 6.2.3 PHD / 104 6.2.4 PSIPRED / 104 6.


2.5 SPINE-X / 105 6.2.6 PSSpred / 105 6.2.7 Meta Methods / 105 6.3 Evaluation of Protein Secondary Structure Prediction Methods / 106 6.3.


1 Measures / 106 6.3.2 Benchmark / 106 6.3.3 Performances / 107 6.4 Conclusion / 110 Acknowledgments / 110 References / 111 7 A GENERIC APPROACH TO BIOLOGICAL SEQUENCE SEGMENTATION PROBLEMS: APPLICATION TO PROTEIN SECONDARY STRUCTURE PREDICTION 114 Yann Guermeur and Fabien Lauer 7.1 Introduction / 114 7.2 Biological Sequence Segmentation / 115 7.


3 MSVMpred / 117 7.3.1 Base Classifiers / 117 7.3.2 Ensemble Methods / 118 7.3.3 Convex Combination / 119 7.4 Postprocessing with A Generative Model / 119 7.


5 Dedication to Protein Secondary Structure Prediction / 120 7.5.1 Biological Problem / 121 7.5.2 MSVMpred2 / 121 7.5.3 Hidden Semi-Markov Model / 122 7.5.


4 Experimental Results / 122 7.6 Conclusions and Ongoing Research / 125 Acknowledgments / 126 References / 126 8 STRUCTURAL MOTIF IDENTIFICATION AND RETRIEVAL: A GEOMETRICAL APPROACH 129 Virginio Cantoni, Marco Ferretti, Mirto Musci, and Nahumi Nugrahaningsih 8.1 Introduction / 129 8.2 A Few Basic Concepts / 130 8.2.1 Hierarchy of Protein Structures / 130 8.2.2 Secondary Structure Elements / 131 8.


2.3 Structural Motifs / 132 8.2.4 Available Sources for Protein Data / 134 8.3 State of the Art / 135 8.3.1 Protein Structure Motif Search / 135 8.3.


2 Promotif / 136 8.3.3 Secondary-Structure Matching / 137 8.3.4 Multiple Structural Alignment by Secondary Structures / 138 8.4 A Novel Geometrical Approach to Motif Retrieval / 138 8.4.1 Secondary Structures Cooccurrences / 138 8.


4.2 Cross Motif Search / 143 8.4.3 Complete Cross Motif Search / 146 8.5 Implementation Notes / 149 8.5.1 Optimizations / 149 8.5.


2 Parallel Approaches / 150 8.6 Conclusions and Future Work / 151 Acknowledgment / 152 References / 152 9 GENOME-WIDE SEARCH FOR PSEUDOKNOTTED NONCODING RNAs: A COMPARATIVE STUDY 155 Meghana Vasavada, Kevin Byron, Yang Song, and Jason T.L. Wang 9.1 Introduction / 155 9.2 Background / 156 9.2.1 Noncoding RNAs and Their Secondary Structures / 156 9.


2.2 Pseudoknotted ncRNA Search Tools / 157 9.3 Methodology / 157 9.4 Results and Interpretation / 161 9.5 Conclusion / 162 References / 163 III PATTERN RECOGNITION IN TERTIARY STRUCTURES 165 10 MOTIF DISCOVERY IN PROTEIN 3D-STRUCTURES USING GRAPH MINING TECHNIQUES 167 Wajdi Dhifli and Engelbert Mephu Nguifo 10.1 Introduction / 167 10.2 From Protein 3D-Structures to Protein Graphs / 169 10.2.


1 Parsing Protein 3D-Structures into Graphs / 169 10.3 Graph Mining / 172 10.4 Subgraph Mining / 173 10.5 Frequent Subgraph Discovery / 173 10.5.1 Problem Definition / 174 10.5.2 Candidates Generation / 176 10.


5.3 Frequent Subgraph Discovery Approaches / 177 10.5.4 Variants of Frequent Subgraph Mining: Closed and Maximal Subgraphs / 178 10.6 Feature Selection / 179 10.6.1 Relevance of a Feature / 179 10.7 Feature Selection for Subgraphs / 180 10.


7.1 Problem Statement / 180 10.7.2 Mining Top-k Subgraphs / 180 10.7.3 Clustering-Based Subgraph Selection / 181 10.7.4 Sampling-Based Approaches / 181 10.


7.5 Approximate Subgraph Mining / 181 10.7.6 Discriminative Subgraph Selection / 182 10.7.7 Other Significant Subgraph Selection Approaches / 182 10.8 Discussion / 183 10.9 Conclusion / 185 Acknowledgments / 185 References / 186 11 FUZZY AND UNCERTAIN LEARNING TECHNIQUES FOR THE ANALYSIS AND PREDICTION OF PROTEIN TERTIARY STRUCTURES 190 Chinua Umoja, Xiaxia Yu, and Robert Harrison 11.


1 Introduction / 190 11.2 Genetic Algorithms / 192 11.2.1 GA Model Selection in Protein Structure Prediction / 196 11.2.2 Common Methodology / 198 11.3 Supervised Machine Learning Algorithm / 201 11.3.


1 Artificial Neural Networks / 201 11.3.2 ANNs in Protein Structure Prediction / 202 11.3.3 Support Vector Machines / 203 11.4 Fuzzy Application / 204 11.4.1 Fuzzy Logic / 204 11.


4.2 Fuzzy SVMs / 204 11.4.3 Adaptive-Network-Based Fuzzy Inference Systems / 205 11.4.4 Fuzzy Decision Trees / 206 11.5 Conclusion / 207 References / 208 12 PROTEIN INTER-DOMAIN LINKER PREDICTION 212 Maad Shatnawi, Paul D. Yoo, and Sami Muhaidat 12.


1 Introduction / 212 12.2 Protein Structure Overview / 213 12.3 Technical Challenges and Open Issues / 214 12.4 Prediction Assessment / 215 12.5 Current Approaches / 216 12.5.1 DomCut / 216 12.5.


2 Scooby-Domain / 217 12.5.3 FIEFDom / 218 12.5.4 Chatterjee et al. (2009) / 219 12.5.5 Drop / 219 12.


6 Domain Boundary Prediction Using Enhanced General Regression Network / 220 12.6.1 Multi-Domain Benchmark Data Set / 220 12.6.2 Compact Domain Profile / 221 12.6.3 The Enhanced Semi-Parametric Model / 222 12.6.


4 Training, Testing, and Validation / 225 12.6.5 Experimental Results / 226 12.7 Inter-Domain Linkers Prediction Using Compositional Index and Simulated Annealing / 227 12.7.1 Compositional Index / 228 12.7.2 Detecting the Optimal Set of Threshold Values Using Simulated Annealing / 229 12.


7.3 Experimental Results / 230 12.8 Conclusion / 232 References / 23.


To be able to view the table of contents for this publication then please subscribe by clicking the button below...
To be able to view the full description for this publication then please subscribe by clicking the button below...