Outline of Research Interests

This page outlines current research in my lab, along with a subset of relevant publications.

Short Summary: My current research interests are in the probabilistic and statistical aspects of machine learning, with a focus on: (i) theory and applications of probabilistic methods, (ii) development and analysis of learning methods that optimize complex risks, and (iii) scalable machine learning. My applied work is primarily in neuroimaging, with a focus on: (i) encoding and decoding models for fMRI, (ii) estimators for (time-varying) brain networks, and (iii) network statistics and embeddings, among other topics.

Learning with Complex Risks

Real-world machine learning often requires complex evaluation metrics, many of which are non-decomposable, e.g., AUC and F-measure. This is in contrast to decomposable metrics such as accuracy, which are defined as an empirical average over individual examples. Non-decomposability is the primary source of difficulty in both theoretical analysis and the design of efficient algorithms. We study predictive methods from first principles, and derive novel efficient and statistically consistent algorithms that result in improved empirical performance.

Optimal classification with multivariate losses

Large Scale and Structured Probabilistic Inference

Data in scientific and commercial disciplines are increasingly characterized by high dimensions and relatively few samples. For such cases, a priori knowledge gleaned from expertise and experimental evidence is invaluable for recovering meaningful models. In particular, knowledge of restricted degrees of freedom such as sparsity or low rank has become an important design paradigm, enabling the recovery of parsimonious and interpretable results, and improving storage and prediction efficiency for high dimensional problems. In Bayesian models, this structure is determined by the prior distribution.
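As a standard textbook illustration of how a prior can encode sparsity (not a description of the specific methods developed in the lab), consider linear regression with a Gaussian likelihood and an independent Laplace prior on the weights. The symbols X, y, w, σ², b, and λ are introduced here for illustration only. Under these assumptions, the maximum a posteriori estimate coincides with the lasso:

```latex
\begin{align*}
p(y \mid X, w) &= \mathcal{N}\!\left(y \mid Xw,\ \sigma^2 I\right),
\qquad
p(w) = \prod_{j=1}^{d} \frac{1}{2b} \exp\!\left(-\frac{|w_j|}{b}\right), \\
\hat{w}_{\mathrm{MAP}}
&= \arg\max_{w}\ \log p(y \mid X, w) + \log p(w) \\
&= \arg\min_{w}\ \frac{1}{2\sigma^2}\,\lVert y - Xw \rVert_2^2 + \frac{1}{b}\,\lVert w \rVert_1 \\
&= \arg\min_{w}\ \frac{1}{2}\,\lVert y - Xw \rVert_2^2 + \lambda\,\lVert w \rVert_1,
\qquad \lambda = \frac{\sigma^2}{b}.
\end{align*}
```

A smaller prior scale b corresponds to a larger penalty λ and therefore to sparser point estimates.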
We are developing a variety of variational inference techniques that lead to scalable and accurate inference, particularly for high dimensional structured problems. We are also developing new techniques that capture structure in the inference rather than in the prior, and show that this leads to improved efficiency and performance in several cases.

Information Projection and Approximate Inference for Structured Sparse Variables

Learning with Aggregated Data

Existing work in spatio-temporal data analysis invariably assumes that data are available as individual measurements with localized estimates. However, for many applications in econometrics, financial forecasting, and healthcare, data are often available only as aggregates. Data aggregation presents severe mathematical challenges to learning and inference, and naive application of standard techniques is susceptible to the ecological fallacy. We have shown that in some cases, this aggregation procedure has only a mild effect. For other cases, we are developing a variety of tools that enable provably accurate predictive modeling with aggregated data, while avoiding unnecessary and error-prone data reconstruction.

Frequency Domain Predictive Modeling with Aggregated Data

Modeling Graph Data & Graph Statistics

Unlike vectors, graphs are not easily manipulated and compared using standard Euclidean techniques. This is particularly important in neuroscience applications, where brain graph estimates are susceptible to noise and alignment issues. We are exploring novel approaches for data exploration and predictive modeling for graph data. We are especially interested in graphon-based techniques that compactly capture asymptotic graph statistics and associated graph distances. We primarily use these tools to enable the scientific analysis of, and predictive modeling from, brain networks.
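The alignment issue mentioned above can be illustrated with a minimal sketch. It assumes NumPy and uses a small hypothetical path graph. The helper names (`relabel`, `frobenius_distance`, `spectral_distance`) are made up for this example, and the spectral comparison is a simple stand-in for a relabeling-invariant distance, not the graphon-based distances discussed in the text.

```python
import numpy as np


def relabel(adj, perm):
    """Adjacency matrix of the same graph with its nodes reordered by perm."""
    return adj[np.ix_(perm, perm)]


def frobenius_distance(a, b):
    """Entrywise Euclidean distance; depends on how the nodes are labeled."""
    return float(np.linalg.norm(a - b))


def spectral_distance(a, b):
    """Distance between sorted adjacency eigenvalues; unchanged by relabeling."""
    ev_a = np.linalg.eigvalsh(a)  # eigenvalues in ascending order
    ev_b = np.linalg.eigvalsh(b)
    return float(np.linalg.norm(ev_a - ev_b))


# Hypothetical example: the path graph 0-1-2-3.
path = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

# The same graph with its nodes listed in a different order.
same_graph = relabel(path, [2, 0, 3, 1])

print("Frobenius distance:", frobenius_distance(path, same_graph))
print("Spectral distance: ", spectral_distance(path, same_graph))
```

The entrywise distance is large even though the two matrices describe the same graph, while the spectral distance is zero up to floating-point error. Note that a spectral comparison is only a pseudo-distance, since non-isomorphic graphs can share a spectrum.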
The Dynamics of Functional Brain Networks: Integrated Network States during Cognitive Task Performance

Learning with Spatio-temporal Data

Spatio-temporal data are ubiquitous in science and engineering applications. We are pursuing a variety of techniques for modeling such datasets, with a focus on applications to brain imaging time series. Our methods include (i) extensions of Gaussian processes to jointly capture spatial and temporal smoothness, (ii) structured decomposition models, and (iii) generative deep network models (variational autoencoders and generative adversarial networks).

Bayesian structure learning for dynamic brain connectivity

Scaling up Machine Learning Systems

In collaboration with systems and infrastructure domain experts, we are developing new architectures for scaling up machine learning systems. Novel challenges include fault tolerance and predictability of resource requirements. The inherent noisiness and other natural statistical properties of machine learning problems enable our new approaches to system design.

Selected Publications: In Preparation

Interpretable Machine Learning

As machine learning methods have become ubiquitous in human decision making, their transparency and interpretability have grown in importance. Interpretability is particularly important in domains where decisions can have significant consequences. Examples abound where interpretable models reveal important but surprising patterns in the data that complex models obscure. We are currently studying exemplar-based interpretable modeling. This is motivated by studies of human reasoning, which suggest that the use of examples (prototypes) is fundamental to the development of effective strategies for tactical decision-making. We are also exploring the application of structured sparsity and attention (with deep neural networks) for enabling interpretability.

Examples are not Enough, Learn to Criticize! Criticism for Interpretable Machine Learning

Sequential Decision Making & Interactive Processes

Human decision making is complex, and may depend on many different exogenous factors as well as the task at hand. Indeed, biases such as anchoring, confirmation, framing, base rate fallacy, primacy, and recency, among others, are well documented in the literature. We consider such biases as features of an interactive process that impacts human decision-making. Two major challenges in dealing with human-machine interactions are exponential complexity and transience. Processes with these characteristics are widespread, e.g., in decision support systems from online advertising to education and healthcare. We are exploring data-driven techniques to infer and model the role of these biases in human-machine interaction, with the aim of improving sequential decision making.

Selected Publications: In Preparation

Learning to Rank

Determining priorities over a set of items or choices, ranking items based on preferences, or scoring them such that some desired preference order is maintained, are fundamental activities in both science and industry. We are developing novel scalable techniques for learning to rank that elegantly manage the combinatorial complexity of the ranking space via monotonic retargeting. We have explored both standard ranking and collaborative ranking, and are currently investigating extensions to adaptive and online scenarios.

Preference Completion from Partial Rankings