Multimodal machine learning is one of the fastest-growing areas of machine learning. Often regarded as one of the holy grails of AI, it is concerned with jointly modeling multiple modalities to better capture natural phenomena. My research in this area includes new theoretical justifications for multimodal learning, as well as novel empirical methods for fusion, alignment, and co-learning.
Theoretical and Empirical Foundations of Multimodality
Neural Models of Fusion
Fast Stochastic Inference Models under Uncertainty
Multimodal Active Learning, Meta-Learning and Reinforcement Learning
Machine Learning and Multimodality
Allowing neural models to build priors and beliefs from natural interactions is among the core challenges in AI. Communication, reasoning, and interaction with humans and the environment require in-depth study of computational models, as well as novel resources.
Multimodal Language Modeling
Artificial Social Intelligence
Multimodal Causal Prediction
Multimodal Communication and Reasoning
Multimodal sensing is an essential part of AI. From advanced neural models to explainable statistics, multimodal sensing bridges the gap between AI and the real world. My research in this area focuses on visual sensing of the human face and body, recognition of auditory cues, and social perception. A particular focus of my research is deploying such technology on low-resource robots and in low-resource environments.