The advent of high-dimensional statistics has opened up exciting opportunities in science and engineering. In biology, for example, the large and complex data sets generated by high-throughput technologies can only be analyzed by means of high-dimensional statistics. However, Lasso, Ridge Regression, Graphical Lasso, and other standard methods include penalty terms that pose a major problem in practice, as they require tuning parameters that are difficult to calibrate. In this talk, I present two novel approaches to overcome this problem. My first approach provides calibration of tuning parameters for high-dimensional estimation and prediction with the Lasso and other standard methods. This calibration scheme is inspired by Lepski's idea for bandwidth selection in non-parametric statistics and is, to date, the only scheme that ensures fast computation, optimal finite-sample guarantees, and highly competitive practical performance. My second approach provides model selection via the minimization of an objective function that avoids tuning parameters altogether. Besides yielding accurate variable selection in standard regression settings, this approach is an effective basis for modeling gene regulation networks, microbial ecosystems, and many other applications.
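For readers unfamiliar with the calibration problem, the sketch below illustrates the general idea of a Lepski-type rule for the Lasso: fit the estimator over a decreasing grid of tuning parameters and select the smallest value whose estimate remains uniformly close to the estimates at all larger values. This is only an illustrative toy; the grid, the constant C, the simulated data, and the use of scikit-learn's Lasso are assumptions for demonstration and do not reproduce the speaker's actual scheme or its guarantees.

```python
# Toy Lepski-type calibration over a grid of Lasso tuning parameters.
# Illustrative only: the constant C, the grid, and the data are made up.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s = 100, 300, 5                      # high-dimensional setting: p > n
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 1.0                             # sparse true coefficient vector
y = X @ beta + 0.5 * rng.standard_normal(n)

lambdas = np.geomspace(1.0, 0.01, 25)      # decreasing grid of tuning parameters
fits = {lam: Lasso(alpha=lam, max_iter=10_000).fit(X, y).coef_ for lam in lambdas}

C = 0.5                                    # illustrative calibration constant
selected = lambdas[0]
for i, lam in enumerate(lambdas):
    # Accept lam if its estimate agrees (in sup-norm) with every larger lambda,
    # up to a tolerance that shrinks with the tuning parameters themselves.
    ok = all(np.max(np.abs(fits[lam] - fits[mu])) <= C * (lam + mu)
             for mu in lambdas[:i])
    if ok:
        selected = lam                     # keep moving to smaller lambdas
    else:
        break                              # first violation: stop, Lepski-style

print(f"selected lambda: {selected:.3f}")
```

In words, the rule trades off the bias of large tuning parameters against the variance of small ones by testing pairwise agreement of the fitted coefficients, rather than relying on cross-validation or an unknown noise level.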