Research Focus Areas & Applications

Research Focus Areas

Bayesian Statistics

Bayesian statistics is an approach to inference based on the celebrated Bayes' theorem (ca. 1763). It combines prior information about the unknown parameters of a model with the observed data to form the posterior distribution, which reflects updated knowledge about the parameters. Our faculty's expertise spans the development of Bayesian statistical models for complex problems, the study of their theoretical guarantees, and aspects of scalable implementation. Areas of particular interest include methodologies for variable selection and regularization, graphical models, probabilistic image analysis, multiscale modeling, network analysis, quantile regression, and methods for massive data sets with complex dependence structures, including functional, time series, and spatial data. Applied areas of interest include biomedical applications, neuroscience, finance and economics, engineering and industrial applications, and materials informatics.
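As a brief illustration of how prior information and data combine, Bayes' theorem can be written as follows. The symbols are notational assumptions introduced here for illustration only: \theta stands for the unknown parameters, y for the observed data, p(\theta) for the prior, and p(y \mid \theta) for the likelihood.

```latex
p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
\;\propto\; p(y \mid \theta)\, p(\theta)
```

The left-hand side is the posterior distribution. The integral in the denominator does not depend on \theta, which is why the posterior is often described as proportional to the likelihood times the prior.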

Biostatistics and Bioinformatics

Biostatistics addresses statistical problems in biology, medicine, and public health. This includes epidemiology, clinical trials, survival analysis, and biomedical imaging. Biological and medical problems that deal with large genetic datasets, or with the complex ways in which genes interact with each other and their environment, come under bioinformatics, statistical genetics, or systems biology. Department faculty and joint faculty from MD Anderson Cancer Center are leaders in the development and application of biostatistical and genomic methods, many of which are motivated by modern problems involving big data, high dimensions, and complex structures. Methods that can be used to address these problems include data integration, prediction, statistical machine learning, Bayesian modeling, causal inference, and graphical models.

Data Science

Classical multivariate data analysis encompasses data understanding, data visualization, computational statistics, and optimization. Rice Statistics Department researchers have provided groundbreaking research in exploratory data analysis and nonparametric methodology. Advanced functional visualization facilitates discovery of the unexpected. A new emphasis on computational efficiency and convex optimization permits analyses of big data. All of these ideas have evolved into modern Data Science, with its ability to formulate complex models to extract knowledge using interdisciplinary research in deep learning and data mining. Together with its partners in the Engineering School, the Department of Statistics is leading the way towards the next big discoveries.

Dependent Data

Dependent data is the term used when observations are collected in a way that constrains their randomization. Examples include time series and panel data, spatial and spatio-temporal data, and images. Further, analyses based on advanced study designs such as stratified, cluster, and cohort sampling require advanced methods for dependent data. Functional data is another type of dependent data: rather than individual points, the observations themselves may be functions. Functional methods provide a modern solution to capitalize on the distinctive nature of dependent data. Stochastic processes are also central to methodological development in this arena. Faculty working in this area also address questions related to causal inference.

Foundations of Probability and Statistics

In Statistics there is often no single correct way to analyze a data set or answer a scientific question. Many methods of statistical inference are based on probability models, and finding an appropriate model depends on a deep understanding of probability theory. Theoretical statistics incorporates more than just probability modeling in its quest to squeeze all information from a data set. Researchers at Rice have contributed to understanding probability models and their application in numerous fields including finance, population genetics, and biological systems. They have also made breakthroughs in statistical methodologies for many new types of complex data sets that arise in environmental engineering, biomedical research, and other areas.

Multivariate Analysis, Machine Learning, Graphical Models

Data sets arising in artificial intelligence, machine learning, computational social science, genomics, and other areas are huge and varied. Statistical learning from massive and multidimensional data leverages powerful mathematical methods, including low- and high-dimensional graphical models and supervised and unsupervised learning methods dating back to the 1890s. The faculty of the statistics department has developed cutting-edge statistical learning methods for modern multidimensional data, with applications ranging from neuroscience to public health and social networks.

Nonparametrics

Faculty are developing nonparametric methods for situations where the data cannot be assumed to come from a distribution described by a fixed functional form controlled by a small number of parameters. This is common in modern big data scenarios characterized by large sample sizes, large numbers of variables, mixed variable types, and complex multi-modal structure. Faculty in the Department address these challenges by developing novel data-driven and machine learning approaches, including density estimation, regression, latent variable methods, clustering, classification, neural map manifold learning, Gaussian processes, Dirichlet processes, and neural networks. The Department’s research has contributed to the advancement of many applications using these methods, including cancer studies; neuroscience; diagnosis from medical imagery, EEG, EMG, and other medical data; financial modelling; social science investigations; compositional analysis of planetary surfaces from hyperspectral imagery; and discovery from astronomical imagery.

Probability and Stochastic Processes

Probability and stochastic processes provide the mathematical foundation for studying phenomena that evolve stochastically, often over time or space. Our faculty’s cutting-edge research is focused on solving real-world problems that can be appropriately modelled in a stochastic framework. This focus permeates many subfields of stochastic analysis, including optimal stopping and optimal stochastic control for problems in mathematical finance and portfolio optimization, branching processes for the progression of cancerous tumors, optimal detection of hidden targets for military applications, quickest detection problems for deep space applications, and statistical inference for stochastic processes with long memory.

Applications

Astronomy, and Earth and Planetary Remote Sensing

Spectrally highly resolved measurements of materials in image context (hyperspectral imagery) are an indispensable type of big data today. Modern hyperspectral sensors record repeatable information with unprecedented detail on geologic properties, environmental conditions, urban characteristics, plant species and health, and more (for planetary surfaces), and on the composition and kinematics of protoplanetary disks, where new planets are born, or giant molecular clouds, where stars are born, among many other uses. Exploitation of the resulting large, high-dimensional data sets, which have extremely complex structure and often very few labeled samples, poses new mathematical and statistical challenges. Work in the Department has addressed these challenges with novel clustering, classification, and regression methods in multidisciplinary collaborations and has made scientific advances in understanding Earth, Mars, Pluto, and asteroids. Recent projects with Rice’s Physics and Astronomy Department have focused on machine intelligence tools for discovery from the world’s most advanced hyperspectral telescope, ALMA, and for dark matter search from the most advanced astroparticle detector experiment, XENON.

Erzsébet Merényi

Biomedical

From its beginnings in 1987, the Statistics Department has been engaged in the development of mathematical and statistical tools for cancer research, epidemics, cardiovascular medicine, medical imaging, and other areas. This includes the development of probability and statistical methodologies, computational algorithms, and specific applications. Strong collaborations with MD Anderson Biostatistics and, more recently, Bioinformatics and Computational Biology, as well as with Molecular and Human Genetics and other departments at Baylor College of Medicine, have resulted in lasting relationships and the generation of external support for research and graduate training. One achievement is the Joint PhD Program in Biostatistics between Rice and MD Anderson, which will soon celebrate 20 years of existence. It is supported by an NCI T32 Training Grant. Department graduates in this area have successfully obtained positions as (a) faculty at top academic institutions, such as Harvard, Johns Hopkins, Mayo Clinic, University of Michigan, MD Anderson, BCM, and University of Manchester, and (b) senior researchers at GlaxoSmithKline, Sanofi, and NASA Life Science, among other places.

Data/Statistical Engineering

Today’s statistical data scientists need to develop end-to-end, scientifically sound statistical solutions to problems that are often complex and ill-formed. The idea is not specific to an area, but rather reflects a willingness to actively engage in the collaborative process of engineering innovative solutions. Examples from the statistics department in this area include energy exploration and production; geosteering (the process of adjusting the drill’s direction in real time based on geological logging measurements); improved flood management; and the Urban Data Platform (kinderudp.org) for the greater Houston area, among others.

Finance

The Department plays a key role in the study of computational finance through the Center for Computational Finance and Economic Systems (CoFES). CoFES is dedicated to the quantitative study of financial markets and their ultimate impact on society. CoFES represents Rice University’s commitment to this important area of intellectual inquiry, and is a cooperative effort between the George R. Brown School of Engineering, the School of Social Science, and the Jesse H. Jones School of Business. Through research and education, CoFES will advance the boundaries of modeling and computational science in this important arena. A key component of the center is the integration of probabilistic and statistical modeling for complex, multidisciplinary investigations. Rice University is well suited for this endeavor because of its exceptionally bright student body; its distinguished faculty in engineering, statistics, business, and economics; its world-class resources in high-performance computing; and an unusually flexible and collegial environment in which to pursue interdisciplinary research and education.

Neuroscience and Neuroimaging

In the last two decades, the development of a number of innovative technologies has led to an improved understanding of the mechanisms underlying the functioning and disruption of the human brain. In this highly interdisciplinary area, Rice statisticians are developing new models and tools that can help clinicians understand, monitor, and augment brain processes. A particular focus has been on understanding the role that brain connectivity patterns play in neurological and mental health disorders. Rice faculty have developed new models and algorithmic tools to analyze complex data, including multi-modal images, omics data, time series, networks, and trees. Such data-driven solutions provide insight into the principles that govern physiology, behavior, cognition, and neurodegenerative diseases.

Social Sciences

Research in the social sciences centers on the study of human behavior, social environments, and interpersonal relationships. Statistical tools are impactful for many broad and important fields in the social sciences, including economics, education, law, political science, international relations, and psychology, among many others. Our faculty offer expertise in core statistical methods for the social sciences, such as cluster analysis; factor models; longitudinal, time series, and functional data analysis; multivariate methods; network analysis; and survey sampling.

The emerging area of Urban Analytics brings the best of statistical data science to sustainable development and cities. Focusing on the residents of a community, urban analytics advances understanding of how people live, work, learn, and play in their respective communities, and it often requires strong partnerships between academia, local governments, and community leaders.