Farouk Nathoo

Canada Research Chair in Biostatistics for Spatial and High-Dimensional Data

Tier 2 - 2017-11-01
University of Victoria
Natural Sciences and Engineering Research Council


Research involves

Developing statistical methods and computational algorithms to analyze genomic and high-dimensional imaging data.

Research relevance

This research will result in better techniques for analyzing high-dimensional imaging data and new models for assessing genetic influences on brain function.

Creating New Tools to Understand Complex Biomedical Data

Thanks to modern techniques in biotechnology, the world is experiencing an explosion of data. This gives scientists a valuable tool for tackling many basic questions about the human brain, such as how it works or how it may be influenced by genetic variations.

But the sheer size and complexity of these datasets pose challenges for scientists. As Canada Research Chair in Biostatistics for Spatial and High-Dimensional Data, Dr. Farouk Nathoo wants to develop new approaches that will help scientists process and understand this data.

Neuroimaging studies can involve large datasets that describe the brain’s anatomy, function and connectivity. Studies in imaging genomics, where the aim is to discover the genetic variants associated with the brain’s structure and function, involve a combined analysis of brain images and additional data from high-throughput genotyping (genotyping hundreds or thousands of individuals at a large number of markers in the genome). In high-throughput genotyping, it is not unusual to conduct more than one billion statistical tests.

Nathoo and his research team are developing new methods of statistical analysis for settings where the number of measured variables is larger than the sample size, as in so-called “high-dimensional imaging data.” They use large-scale models together with mathematical approximations and high-performance computing to integrate large datasets from different sources to better understand complex systems.

Ultimately, the goal of Nathoo’s research is to produce new tools to help researchers understand biomedical data and the real-world processes underlying it.