This book gives a comprehensive and systematic account of high-dimensional data analysis, including variable selection via regularization methods and feature screening via sure independence screening methods. It is a valuable reference for researchers involved with model selection, variable selection, machine learning, and risk management.
"This book delivers a very comprehensive summary of the development of statistical foundations of data science. The authors no doubt are doing frontier research and have made several crucial contributions to the field. Therefore, the book offers a very good account of the most cutting-edge development. The book is suitable for both master and Ph.D. students in statistics, and also for researchers in both applied and theoretical data science. Researchers can take this book as an index of topics, as it summarizes in brief many significant research articles in an accessible way. Each chapter can be read independently by experienced researchers. It provides a nice cover of key concepts in those topics and researchers can benefit from reading the specific chapters and paragraphs to get a big picture rather than diving into many technical articles. There are altogether 14 chapters. It can serve as a textbook for two semesters. The book also provides handy codes and data sets, which is a great treasure for practitioners."~Journal of Time Series Analysis
"This text, collaboratively authored by renowned statisticians Fan (Princeton Univ.), Li (Pennsylvania State Univ.), Zhang (Rutgers Univ.), and Zou (Univ. of Minnesota), laboriously compiles and explains theoretical and methodological achievements in data science and big data analytics. Amid today's flood of coding-based cookbooks for data science, this book is a rare monograph addressing recent advances in mathematical and statistical principles and the methods behind regularized regression, analysis of high-dimensional data, and machine learning. The pinnacle achievement of the book is its comprehensive exploration of sparsity for model selection in statistical regression, considering models such as generalized linear regression, penalized least squares, quantile and robust regression, and survival regression. The authors discuss sparsity not only in terms of various types of penalties but also as an important feature of numerical optimization algorithms, now used in manifold applications including deep learning. The text extensively probes contemporary high-dimensional data modeling methods such as feature screening, covariate regularization, graphical modeling, and principal component and factor analysis. The authors conclude by introducing contemporary statistical machine learning, spanning a range of topics in supervised and unsupervised learning techniques and deep learning. This book is a must-have bookshelf item for those with a thirst for learning about the theoretical rigor of data science."~Choice Review, S-T. Kim, North Carolina A&T State University, August 2021