Epigraph
Preface
    Preface to the Second Edition
    Preface to the First Edition
    Acknowledgements from First Edition
Notation, abbreviations and key ideas

1 Introduction
    1.1 Objects and Variables
    1.2 Some Multivariate Problems and Techniques
        1.2.1 Generalizations of univariate techniques
        1.2.2 Dependence and regression
        1.2.3 Linear combinations
        1.2.4 Assignment and dissection
        1.2.5 Building configurations
    1.3 The Data Matrix
    1.4 Summary Statistics
        1.4.1 The mean vector and covariance matrix
        1.4.2 Measures of multivariate scatter
    1.5 Linear Combinations
        1.5.1 The scaling transformation
        1.5.2 Mahalanobis transformation
        1.5.3 Principal component transformation
    1.6 Geometrical Ideas
    1.7 Graphical Representation
        1.7.1 Univariate scatters
        1.7.2 Bivariate scatters
        1.7.3 Harmonic curves
        1.7.4 Parallel coordinates plot
    1.8 Measures of Multivariate Skewness and Kurtosis

2 Basic Properties of Random Vectors
    2.1 Cumulative Distribution Functions and Probability Density Functions
    2.2 Population Moments
        2.2.1 Expectation and correlation
        2.2.2 Population mean vector and covariance matrix
        2.2.3 Mahalanobis space
        2.2.4 Higher moments
        2.2.5 Conditional moments
    2.3 Characteristic Functions
    2.4 Transformations
    2.5 The Multivariate Normal Distribution
        2.5.1 Definition
        2.5.2 Geometry
        2.5.3 Properties
        2.5.4 Singular multivariate normal distribution
        2.5.5 The matrix normal distribution
    2.6 Random Samples
    2.7 Limit Theorems

3 Non-normal Distributions
    3.1 Introduction
    3.2 Some Multivariate Generalizations of Univariate Distributions
        3.2.1 Direct generalizations
        3.2.2 Common components
        3.2.3 Stochastic generalizations
    3.3 Families of Distributions
        3.3.1 The exponential family
        3.3.2 The spherical family
        3.3.3 Elliptical distributions
        3.3.4 Stable distributions
    3.4 Insights into Skewness and Kurtosis
    3.5 Copulas
        3.5.1 The Gaussian copula
        3.5.2 The Clayton-Mardia copula
        3.5.3 Archimedean copulas
        3.5.4 Fréchet-Höffding bounds

4 Normal Distribution Theory
    4.1 Characterization and Properties
        4.1.1 The central role of multivariate normal theory
        4.1.2 A definition by characterization
    4.2 Linear Forms
    4.3 Transformations of Normal Data Matrices
    4.4 The Wishart Distribution
        4.4.1 Introduction
        4.4.2 Properties of Wishart matrices
        4.4.3 Partitioned Wishart matrices
    4.5 The Hotelling T² Distribution
    4.6 Mahalanobis Distance
        4.6.1 The two-sample Hotelling T² statistic
        4.6.2 A decomposition of Mahalanobis distance
    4.7 Statistics Based on the Wishart Distribution
    4.8 Other Distributions Related to the Multivariate Normal

5 Estimation
    5.1 Likelihood and Sufficiency
        5.1.1 The likelihood function
        5.1.2 Efficient scores and Fisher's information
        5.1.3 The Cramér-Rao lower bound
        5.1.4 Sufficiency
    5.2 Maximum Likelihood Estimation
        5.2.1 General case
        5.2.2 Multivariate normal case
        5.2.3 Matrix normal distribution
    5.3 Robust Estimation of Location and Dispersion for Multivariate Distributions
        5.3.1 M-estimates of location
        5.3.2 Minimum covariance determinant
        5.3.3 Multivariate trimming
        5.3.4 Stahel-Donoho estimator
        5.3.5 Minimum volume estimator
        5.3.6 Tyler's estimate of scatter
    5.4 Bayesian Inference

6 Hypothesis Testing
    6.1 Introduction
    6.2 The Techniques Introduced
        6.2.1 The likelihood ratio test (LRT)
        6.2.2 The union intersection test (UIT)
    6.3 The Techniques Further Illustrated
        6.3.1 One-sample hypotheses on μ
        6.3.2 One-sample hypotheses on Σ
        6.3.3 Multi-sample hypotheses
    6.4 Simultaneous Confidence Intervals
        6.4.1 The one-sample Hotelling T² case
        6.4.2 The two-sample Hotelling T² case
        6.4.3 Other examples
    6.5 The Behrens-Fisher Problem
    6.6 Multivariate Hypothesis Testing: Some General Points
    6.7 Non-normal Data
    6.8 Mardia's Non-parametric Test for the Bivariate Two-sample Problem

7 Multivariate Regression Analysis
    7.1 Introduction
    7.2 Maximum Likelihood Estimation
        7.2.1 Maximum likelihood estimators for B and Σ
        7.2.2 The distribution of B̂ and Σ̂
    7.3 The General Linear Hypothesis
        7.3.1 The likelihood ratio test (LRT)
        7.3.2 The union intersection test (UIT)
        7.3.3 Simultaneous confidence intervals
    7.4 Design Matrices of Degenerate Rank
    7.5 Multiple Correlation
        7.5.1 The effect of the mean
        7.5.2 Multiple correlation coefficient
        7.5.3 Partial correlation coefficient
        7.5.4 Measures of correlation between vectors
    7.6 Least Squares Estimation
        7.6.1 Ordinary least squares (OLS) estimation
        7.6.2 Generalized least squares
        7.6.3 Application to multivariate regression
        7.6.4 Asymptotic consistency of least squares estimators
    7.7 Discarding of Variables
        7.7.1 Dependence analysis
        7.7.2 Interdependence analysis

8 Graphical Models
    8.1 Introduction
    8.2 Graphs and Conditional Independence
    8.3 Gaussian Graphical Models
        8.3.1 Estimation
        8.3.2 Model selection
    8.4 Log-linear Graphical Models
        8.4.1 Notation
        8.4.2 Log-linear models
        8.4.3 Log-linear models with a graphical interpretation
    8.5 Directed and Mixed Graphs

9 Principal Component Analysis
    9.1 Introduction
    9.2 Definition and Properties of Principal Components
        9.2.1 Population principal components
        9.2.2 Sample principal components
        9.2.3 Further properties of principal components
        9.2.4 Correlation structure
        9.2.5 The effect of ignoring some components
        9.2.6 Graphical representation of principal components
        9.2.7 Biplots
    9.3 Sampling Properties of Principal Components
        9.3.1 Maximum likelihood estimation for normal data
        9.3.2 Asymptotic distributions for normal data
    9.4 Testing Hypotheses about Principal Components
        9.4.1 Introduction
        9.4.2 The hypothesis that (λ1 + · · · + λk)/(λ1 + · · · + λp) = ψ
        9.4.3 The hypothesis that (p − k) eigenvalues of Σ are equal
        9.4.4 Hypotheses concerning correlation matrices
    9.5 Correspondence Analysis
        9.5.1 Contingency tables
        9.5.2 Gradient analysis
    9.6 Allometry: the Measurement of Size and Shape
    9.7 Discarding of Variables
    9.8 Principal Component Regression
    9.9 Projection Pursuit and Independent Component Analysis
        9.9.1 Projection pursuit
        9.9.2 Independent component analysis
    9.10 PCA in High Dimensions

10 Factor Analysis
    10.1 Introduction
    10.2 The Factor Model
        10.2.1 Definition
        10.2.2 Scale invariance
        10.2.3 Non-uniqueness of factor loadings
        10.2.4 Estimation of the parameters in factor analysis
        10.2.5 Use of the correlation matrix R in estimation
    10.3 Principal Factor Analysis
    10.4 Maximum Likelihood Factor Analysis
    10.5 Goodness of Fit Test
    10.6 Rotation of Factors
        10.6.1 Interpretation of factors
        10.6.2 Varimax rotation
    10.7 Factor Scores
    10.8 Relationships Between Factor Analysis and Principal Component Analysis
    10.9 Analysis of Covariance Structures

11 Canonical Correlation Analysis
    11.1 Introduction
    11.2 Mathematical Development
        11.2.1 Population canonical correlation analysis
        11.2.2 Sample canonical correlation analysis
        11.2.3 Sampling properties and tests
        11.2.4 Scoring and prediction
    11.3 Qualitative Data and Dummy Variables
    11.4 Qualitative and Quantitative Data

12 Discriminant Analysis and Statistical Learning
    12.1 Introduction
    12.2 Bayes' Discriminant Rule
    12.3 The Error Rate
        12.3.1 Probabilities of misclassification
        12.3.2 Estimation of error rate
        12.3.3 Confusion matrix
    12.4 Discrimination Using the Normal Distribution
        12.4.1 Population discriminant rules
        12.4.2 The sample discriminant rules
        12.4.3 Is discrimination worthwhile?
    12.5 Discarding of Variables
    12.6 Fisher's Linear Discriminant Function
    12.7 Nonparametric Distance-based Methods
        12.7.1 Nearest neighbor classifier
        12.7.2 Large sample behavior of the nearest neighbor classifier
        12.7.3 Kernel classifiers
    12.8 Classification Trees
        12.8.1 Splitting criteria
        12.8.2 Pruning
    12.9 Logistic Discrimination
        12.9.1 Logistic regression model