Editor Biographies
List of Contributors

1 Software Fault Localization: an Overview of Research, Techniques, and Tools
W. Eric Wong, Ruizhi Gao, Yihao Li, Rui Abreu, Franz Wotawa, and Dongcheng Li
1.1 Introduction
1.2 Traditional Fault Localization Techniques
1.2.1 Program Logging
1.2.2 Assertions
1.2.3 Breakpoints
1.2.4 Profiling
1.3 Advanced Fault Localization Techniques
1.3.1 Slicing-Based Techniques
1.3.2 Program Spectrum-Based Techniques
1.3.2.1 Notation
1.3.2.2 Techniques
1.3.2.3 Issues and Concerns
1.3.3 Statistics-Based Techniques
1.3.4 Program State-Based Techniques
1.3.5 Machine Learning-Based Techniques
1.3.6 Data Mining-Based Techniques
1.3.7 Model-Based Techniques
1.3.8 Additional Techniques
1.3.9 Distribution of Papers in Our Repository
1.4 Subject Programs
1.5 Evaluation Metrics
1.6 Software Fault Localization Tools
1.7 Critical Aspects
1.7.1 Fault Localization with Multiple Bugs
1.7.2 Inputs, Outputs, and Impact of Test Cases
1.7.3 Coincidental Correctness
1.7.4 Faults Introduced by Missing Code
1.7.5 Combination of Multiple Fault Localization Techniques
1.7.6 Ties Within Fault Localization Rankings
1.7.7 Fault Localization for Concurrency Bugs
1.7.8 Spreadsheet Fault Localization
1.7.9 Theoretical Studies
1.8 Conclusion
Notes
References

2 Traditional Techniques for Software Fault Localization
Yihao Li, Linghuan Hu, W. Eric Wong, Vidroha Debroy, and Dongcheng Li
2.1 Program Logging
2.2 Assertions
2.3 Breakpoints
2.4 Profiling
2.5 Discussion
2.6 Conclusion
References

3 Slicing-Based Techniques for Software Fault Localization
W. Eric Wong, Hira Agrawal, and Xiangyu Zhang
3.1 Introduction
3.2 Static Slicing-Based Fault Localization
3.2.1 Introduction
3.2.2 Program Slicing Combined with Equivalence Analysis
3.2.3 Further Application
3.3 Dynamic Slicing-Based Fault Localization
3.3.1 Dynamic Slicing and Backtracking Techniques
3.3.2 Dynamic Slicing and Model-Based Techniques
3.3.3 Critical Slicing
3.3.3.1 Relationships Between Critical Slices (CS) and Exact Dynamic Program Slices (DPS)
3.3.3.2 Relationship Between Critical Slices and Executed Static Program Slices
3.3.3.3 Construction Cost
3.3.4 Multiple-Points Dynamic Slicing
3.3.4.1 BwS of an Erroneous Computed Value
3.3.4.2 FwS of Failure-Inducing Input Difference
3.3.4.3 BiS of a Critical Predicate
3.3.4.4 MPSs: Dynamic Chops
3.3.5 Execution Indexing
3.3.5.1 Concepts
3.3.5.2 Structural Indexing
3.3.6 Dual Slicing to Locate Concurrency Bugs
3.3.6.1 Trace Comparison
3.3.6.2 Dual Slicing
3.3.7 Comparative Causality: a Causal Inference Model Based on Dual Slicing
3.3.7.1 Property One: Relevance
3.3.7.2 Property Two: Sufficiency
3.3.8 Implicit Dependences to Locate Execution Omission Errors
3.3.9 Other Dynamic Slicing-Based Techniques
3.4 Execution Slicing-Based Fault Localization
3.4.1 Fault Localization Using Execution Dice
3.4.2 A Family of Fault Localization Heuristics Based on Execution Slicing
3.4.2.1 Heuristic I
3.4.2.2 Heuristic II
3.4.2.3 Heuristic III
3.4.3 Effective Fault Localization Based on Execution Slices and Inter-block Data Dependence
3.4.3.1 Augmenting a Bad D(1)
3.4.3.2 Refining a Good D(1)
3.4.3.3 An Incremental Debugging Strategy
3.4.4 Other Execution Slicing-Based Techniques in Software Fault Localization
3.5 Discussions
3.6 Conclusion
Notes
References

4 Spectrum-Based Techniques for Software Fault Localization
W. Eric Wong, Hua Jie Lee, Ruizhi Gao, and Lee Naish
4.1 Introduction
4.2 Background and Notation
4.2.1 Similarity Coefficient-Based Fault Localization
4.2.2 An Example of Using Similarity Coefficient to Compute Suspiciousness
4.3 Insights of Some Spectra-Based Metrics
4.4 Equivalence Metrics
4.4.1 Applicability of the Equivalence Relation to Other Fault Localization Techniques
4.4.2 Applicability Beyond Fault Localization
4.5 Selecting a Good Suspiciousness Function (Metric)
4.5.1 Cost of Using a Metric
4.5.2 Optimality for Programs with a Single Bug
4.5.3 Optimality for Programs with Deterministic Bugs
4.6 Using Spectrum-Based Metrics for Fault Localization
4.6.1 Spectrum-Based Metrics for Fault Localization
4.6.2 Refinement of Spectra-Based Metrics
4.7 Empirical Evaluation Studies of SBFL Metrics
4.7.1 The Construction of D*
4.7.2 An Illustrative Example
4.7.3 A Case Study Using D*
4.7.3.1 Subject Programs
4.7.3.2 Fault Localization Techniques Used in Comparisons
4.7.3.3 Evaluation Metrics and Criteria
4.7.3.4 Statement with Same Suspiciousness Values
4.7.3.5 Results
4.7.3.6 Effectiveness of D* with Different Values of *
4.7.3.7 D* Versus Other Fault Localization Techniques
4.7.3.8 Programs with Multiple Bugs
4.7.3.9 Discussion
4.8 Conclusion
Notes
References

5 Statistics-Based Techniques for Software Fault Localization
Zhenyu Zhang and W. Eric Wong
5.1 Introduction
5.1.1 Tarantula
5.1.2 How It Works
5.2 Working with Statements
5.2.1 Techniques Under the Same Problem Settings
5.2.2 Statistical Variances
5.3 Working with Non-statements
5.3.1 Predicate: a Popular Trend
5.3.2 BPEL: a Sample Application
5.4 Purifying the Input
5.4.1 Coincidental Correctness Issue
5.4.2 Class Balance Consideration
5.5 Reinterpreting the Output
5.5.1 Revealing Fault Number
5.5.2 Noise Reduction
Notes
References

6 Machine Learning-Based Techniques for Software Fault Localization
W. Eric Wong
6.1 Introduction
6.2 BP Neural Network-Based Fault Localization
6.2.1 Fault Localization with a BP Neural Network
6.2.2 Reduce the Number of Candidate Suspicious Statements
6.3 RBF Neural Network-Based Fault Localization
6.3.1 RBF Neural Networks
6.3.2 Methodology
6.3.2.1 Fault Localization Using an RBF Neural Network
6.3.2.2 Training of the RBF Neural Network
6.3.2.3 Definition of a Weighted Bit-Comparison-Based Dissimilarity
6.4 C4.5 Decision Tree-Based Fault Localization
6.4.1 Category-Partition for Rule Induction
6.4.2 Rule Induction Algorithms
6.4.3 Statement Ranking Strategies
6.4.3.1 Revisiting Tarantula
6.4.3.2 Ranking Statements Based on C4.5 Rules
6.5 Applying Simulated Annealing with Statement Pruning for an SBFL Formula
6.6 Conclusion
Notes
References

7 Data Mining-Based Techniques for Software Fault Localization
Peggy Cellier, Mireille Ducassé, Sébastien Ferré, Olivier Ridoux, and W. Eric Wong
7.1 Introduction
7.2 Formal Concept Analysis and Association Rules
7.2.1 Formal Concept Analysis
7.2.2 Association Rules
7.3 Data Mining for Fault Localization
7.3.1 Failure Rules
7.3.2 Failure Lattice
7.4 The Failure Lattice for Multiple Faults
7.4.1 Dependencies Between Faults
7.4.2 Example
7.5 Discussion
7.5.1 The Structure of the Execution Traces
7.5.2 Union Model
7.5.3 Intersection Model
7.5.4 Nearest Neighbor
7.5.5 Delta Debugging
7.5.6 From the Trace Context to the Failure Context
7.5.7 The Structure of Association Rules
7.5.8 Multiple Faults
7.6 Fault Localization Using N-gram Analysis
7.6.1 Background
7.6.1.1 Execution Sequence
7.6.1.2 N-gram Analysis
7.6.1.3 Linear Execution Blocks
7.6.1.4 Association Rule Mining
7.6.2 Methodology
7.6.3 Conclusion
7.7 Fault Localization for GUI Software Using N-gram Analysis
7.7.1 Background
7.7.1.1 Representation of the GUI and Its Operations
7.7.1.2 Event Handler
7.7.1.3 N-gram
7.7.2 Association Rule Mining
7.7.3 Methodology
7.7.3.1 General Approach
7.7.3.2 N-gram Fault Localization Algorithm
7.8 Conclusion
Notes
References

8 Information Retrieval-Based Techniques for Software Fault Localization
Xin Xia and David Lo
8.1 Introduction
8.2 General IR-Based Fault Localization Process
8.3 Fundamental Information Retrieval Techniques for Software Fault Localization
8.3.1 Vector Space Model
8.3.2 Topic Modeling