MBZUAI Nexus Speaker Series
Hosted by: Prof. Mladen Kolar
This study introduces Variational Automatic Relevance Determination (VARD), a novel approach for fitting sparse additive regression models in high-dimensional settings. VARD stands out by independently assessing the smoothness of each feature while precisely determining whether its contribution to the response is zero, linear, or nonlinear. Additionally, we present an efficient coordinate descent algorithm for implementing VARD. Empirical evaluations on both simulated and real-world datasets demonstrate VARD’s superior performance compared to alternative variable selection methods for additive models.
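To give a flavor of the zero/linear/nonlinear trichotomy the abstract describes, here is a toy sketch (an editor's illustration, not the published VARD algorithm) of backfitting coordinate descent with two-level group soft-thresholding: an outer threshold can zero a feature's whole component, and an inner threshold can zero just its nonlinear part.

```python
import numpy as np

def soft_threshold_group(v, lam):
    # Group soft-thresholding: shrink the vector's norm by lam, or zero it.
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v

def fit_sparse_additive(X, y, lam_zero=1.0, lam_nl=1.0, n_iters=50):
    """Backfitting with two-level group soft-thresholding (a toy stand-in
    for VARD's coordinate descent). Each feature gets a cubic-polynomial
    basis; after thresholding, every feature ends up labelled zero,
    linear, or nonlinear."""
    n, p = X.shape
    B = [np.column_stack([X[:, j], X[:, j] ** 2, X[:, j] ** 3]) for j in range(p)]
    B = [(b - b.mean(axis=0)) / b.std(axis=0) for b in B]   # standardize basis
    coefs = [np.zeros(3) for _ in range(p)]
    fitted = [np.zeros(n) for _ in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            # Partial residual: everything the other components miss.
            r = y - y.mean() - sum(fitted[k] for k in range(p) if k != j)
            beta = np.linalg.lstsq(B[j], r, rcond=None)[0]
            beta = soft_threshold_group(beta, lam_zero)        # zero vs. nonzero
            beta[1:] = soft_threshold_group(beta[1:], lam_nl)  # linear vs. nonlinear
            coefs[j], fitted[j] = beta, B[j] @ beta
    return ["zero" if not c.any() else "linear" if not c[1:].any() else "nonlinear"
            for c in coefs]
```

On simulated data with one linear effect, one quadratic effect, and one null feature, this sketch recovers the three labels; the actual VARD procedure additionally tunes smoothness per feature.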
Hosted by: Prof. Mladen Kolar
Adjusting for confounding and imbalance when establishing statistical relationships is an increasingly important task, and causal inference methods have emerged as the most popular tool to achieve this. Causal inference has been developed mainly for regression relationships with scalar responses and also for distributional responses. We introduce here a general framework for causal inference when responses reside in general geodesic metric spaces, where we draw on a novel geodesic calculus that facilitates scalar multiplication for geodesics and the quantification of treatment effects through the concept of geodesic average treatment effect. Using ideas from Fréchet regression, we obtain a doubly robust estimator of the geodesic average treatment effect and establish consistency and rates of convergence for the proposed estimators. Examples and practical implementations include simulations and data illustrations for compositional responses, as encountered in U.S. statewise energy source data, where we study the effect of coal mining; for network responses derived from New York taxi trips, where the effect of the COVID-19 pandemic is of interest; and for brain connectivity networks, where we study the effect of Alzheimer's disease.
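In the scalar (Euclidean) special case, the doubly robust estimator the abstract mentions reduces to the familiar augmented inverse-propensity-weighted (AIPW) form. The sketch below is that classical special case on simulated data, not the geodesic construction of the talk; the propensity and outcome models are simple illustrative choices.

```python
import numpy as np

def aipw_ate(y, t, x):
    """Doubly robust (AIPW) ATE estimate for a scalar response, using a
    logistic-regression propensity model and separate linear outcome
    models per arm; consistent if either model is correctly specified."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])

    # Propensity model: a few Newton steps of logistic regression.
    b = np.zeros(X.shape[1])
    for _ in range(25):
        p = 1 / (1 + np.exp(-X @ b))
        W = p * (1 - p)
        b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (t - p))
    e = np.clip(1 / (1 + np.exp(-X @ b)), 1e-3, 1 - 1e-3)

    # Outcome models fitted separately for treated and control units.
    m1 = X @ np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
    m0 = X @ np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]

    # AIPW: outcome-model prediction plus inverse-propensity correction.
    psi1 = m1 + t * (y - m1) / e
    psi0 = m0 + (1 - t) * (y - m0) / (1 - e)
    return float(np.mean(psi1 - psi0))
```

The geodesic framework replaces the arithmetic differences above with geodesic operations in the response metric space.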
Hosted by: Prof. Xiaosong Ma
Identifying reasonably good plans to execute complex queries in large data systems is a crucial ingredient for a robust data management platform. The traditional cost-based query optimizer enumerates different execution plans for each individual query, assesses each plan based on its costs, and selects the plan that promises the lowest execution cost. In practice, however, the optimal execution plan is not always selected, opportunities are missed, and complex analytical queries might fail to complete. Query optimization for data systems therefore remains a highly active research area, with novel concepts being introduced continuously. The talk will discuss this research area by addressing three distinct themes. First, the talk shows the potential of optimizer improvements by sharing insights from a comprehensive and in-depth evaluation. Building on this analysis, the talk introduces TONIC and FASTgres. TONIC is a novel cardinality-estimation-free extension for generic SPJ query optimizers, revising operator decisions for arbitrary join paths based on learned query feedback. FASTgres is a context-aware classification strategy for steering existing optimizers using hint set prediction. Finally, the talk sheds light on PostBOUND, a novel optimizer development and benchmarking framework that enables rapid prototyping and common-ground comparisons, serving as a base for reproducible optimizer research.
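The enumerate-cost-select loop of a textbook cost-based optimizer can be sketched in a few lines. The table names, cardinalities, and join selectivities below are all hypothetical, and costing an intermediate by its size is a deliberate oversimplification of a real cost model.

```python
from itertools import permutations

# Hypothetical base-table cardinalities and pairwise join selectivities.
CARD = {"orders": 1_000_000, "customers": 100_000, "nation": 25}
SEL = {frozenset({"orders", "customers"}): 1e-5,
       frozenset({"customers", "nation"}): 0.04,
       frozenset({"orders", "nation"}): 0.04}

def plan_cost(order):
    """Cost a left-deep join order as the sum of intermediate result sizes."""
    size, cost, joined = CARD[order[0]], 0.0, {order[0]}
    for t in order[1:]:
        # Most selective applicable predicate between t and the join so far.
        sel = min(SEL[frozenset({a, t})] for a in joined
                  if frozenset({a, t}) in SEL)
        size = size * CARD[t] * sel
        cost += size            # pay for materializing each intermediate
        joined.add(t)
    return cost

def best_plan(tables):
    """Exhaustively enumerate left-deep orders; return (order, cost)."""
    return min(((p, plan_cost(p)) for p in permutations(tables)),
               key=lambda pc: pc[1])
```

With these numbers the optimizer defers the large `orders` table until last; the research discussed in the talk targets exactly the cases where such cardinality estimates mislead the cost model.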
Hosted by: Prof. Mladen Kolar
Tool-using LLM agents can be best understood as resource-constrained decision systems. Each run implicitly solves an operations problem: how to allocate scarce budget (tokens, latency, tool-call limits, and verification/judging compute) across planning, execution, recovery, and checking—under uncertainty about tool reliability, user intent, and when to stop. In this talk, I’ll connect modern agent design to classic OR ideas—sequential decision-making, budgeted optimization, scheduling, and robust objectives—and show how this framing leads to systems that are measurably more reliable, not just larger. I’ll walk through a unified set of results across three themes: (1) tool orchestration in realistic multi-tool environments, with evaluation designed to be diagnostic and trajectory-agnostic; (2) open-ended research agents evaluated via structured rubrics that surface systematic failure modes and make iteration scientific; and (3) cost-aware evaluation protocols, where debate/deliberation and budgeted stopping explicitly trade off accuracy against compute to trace a cost–accuracy frontier. Finally, I’ll discuss why small-model proxies (“analogs”) are a practical accelerator for this agenda: they enable faster experimentation on orchestration policies and evaluation designs at a fraction of the cost, while preserving the failure modes that matter. I’ll close with how these ideas translate into ongoing research collaborations with startups, developing deployable agents with explicit budgets, measurable guarantees, and clear reliability trade-offs.
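One instance of the budgeted-stopping idea can be sketched as sequential voting with an early-exit rule: keep querying until the leading answer is far enough ahead of the runner-up or the call budget is spent. This is an editor's toy illustration, not the talk's protocol; the 80%-accurate "agent" is simulated.

```python
import random
from collections import Counter

def budgeted_majority(sample_answer, budget, margin=3):
    """Cost-aware stopping sketch: draw answers one call at a time and
    stop early once the leading answer is `margin` votes ahead of the
    runner-up, or once the call budget runs out."""
    votes = Counter()
    n_calls = 0
    for n_calls in range(1, budget + 1):
        votes[sample_answer()] += 1
        top = votes.most_common(2)
        lead = top[0][1] - (top[1][1] if len(top) > 1 else 0)
        if lead >= margin:
            break
    return votes.most_common(1)[0][0], n_calls

# Hypothetical noisy "agent": answers "A" 80% of the time.
random.seed(0)
answer, calls_used = budgeted_majority(
    lambda: "A" if random.random() < 0.8 else "B", budget=25)
```

Sweeping the margin traces exactly the cost-accuracy frontier the abstract describes: a larger margin buys accuracy with more calls.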
Hosted by: Prof. Aziz Khan
Many tasks in biological and medical science can be modeled as pattern recognition tasks, and AI is playing an increasingly important role in them. With the enrichment of single-cell-level high-throughput omics data, it is now even possible to build digital virtual cells with advanced AI foundation models. Prof. Xuegong Zhang has been one of the leading researchers in applying AI to cutting-edge pattern recognition tasks in biology and medicine, and in promoting the concept and practice of developing AI virtual cell models. In this seminar, he will provide an overview of both fields based on his own work over the past two decades, and discuss future trends in AI for biology and medicine.
Hosted by: Prof. Marcos Matabuena
Modern digital devices continuously record physiological signals such as heart rate and physical activity, generating rich but complex data that evolve over time and across individuals. This talk introduces flexible statistical frameworks that move beyond modeling averages to capture full outcome distributions and dynamic time patterns. By representing responses through quantile functions and allowing data‐driven transformations of time, the proposed methods provide a unified way to study how entire distributions change with covariates and over the course of daily life. These approaches enable more nuanced questions: not only how a typical heart rate responds to activity, but how variability, extremes, and temporal dynamics differ across individuals and contexts. Applications to continuously monitored wearable data demonstrate how the methods reveal interpretable features of human behavior and physiology, offering powerful tools for digital health research and personalized monitoring.
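The core representational move, treating a whole distribution of readings as the response via its quantile function, can be shown in a few lines. The heart-rate numbers below are simulated stand-ins, not data from the talk.

```python
import numpy as np

def quantile_representation(samples, n_grid=101):
    """Represent one batch of readings by its empirical quantile function
    on an even probability grid, so a whole distribution becomes a vector."""
    grid = np.linspace(0, 1, n_grid)
    return grid, np.quantile(samples, grid)

# Hypothetical heart-rate samples from two monitoring days.
rng = np.random.default_rng(1)
rest_day = rng.normal(70, 5, size=1000)      # lower mean, less variable
active_day = rng.normal(90, 15, size=1000)   # higher mean, more variable

grid, q_rest = quantile_representation(rest_day)
_, q_active = quantile_representation(active_day)

# With a uniform grid, the L2 distance between quantile functions
# approximates the 2-Wasserstein distance between the two days.
w2 = np.sqrt(np.mean((q_rest - q_active) ** 2))
```

Comparing quantile functions rather than daily means is what lets the methods separate shifts in typical level from changes in variability and extremes.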
Hosted by: Prof. Chih-Jen Lin
In physics, phenomena such as light propagation and Newtonian mechanics obey the principle of least action: the true trajectory is a stationary point of the Lagrangian. In our recent work [1], we investigated whether learning, too, follows a least-action principle. We model learning as stationary-action dynamics on information fields. Concretely, we derive classical learning algorithms as stationary points of information-field Lagrangians, recovering Bellman optimality from a reward-based Hamiltonian and Fisher-information–aware updates for estimation. This potentially yields a unifying variational view across reinforcement learning and supervised learning, and suggests optimisers with testable properties. Conceptually, it treats the training of a learning system as the dynamical evolution of a physical system in an abstract information space. Structure is also central to learning, enabling interventional reasoning and scientific understanding. Causality provides a framework for discovering structure from data under the hypothesis that causal mechanisms are independent. In earlier work [2], we formalise independent mechanisms as independent latent variables controlling each mechanism, and show how this perspective extends across effect estimation, counterfactual reasoning, representation learning, and reinforcement learning. Methodologically, in collaboration with Prior Labs, we developed Do-PFN [3], a pre-trained foundation model that performs in-context causal inference. This serves as a promising out-of-the-box tool for practitioners working across diverse scientific domains.
References:
[1] Siyuan Guo and Bernhard Schölkopf. Physics of Learning: A Lagrangian Perspective to Different Learning Paradigms. arXiv preprint arXiv:2509.21049, 2025.
[2] Siyuan Guo*, Viktor Tóth*, Bernhard Schölkopf, and Ferenc Huszár. Causal de Finetti: On the Identification of Invariant Causal Structure in Exchangeable Data. Advances in Neural Information Processing Systems (NeurIPS), 2023.
[3] Jake Robertson*, Arik Reuter*, Siyuan Guo, Noah Hollmann, Frank Hutter, and Bernhard Schölkopf. Do-PFN: In-Context Learning for Causal Effect Estimation. Advances in Neural Information Processing Systems (NeurIPS), 2025. (Spotlight; acceptance rate 3.19%.)
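For readers less familiar with the classical object being recovered from the reward-based Hamiltonian, here is a minimal value-iteration sketch of Bellman optimality on a toy two-state MDP; all the transition and reward numbers are assumed for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality operator
        V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    to its fixed point; return optimal values and a greedy policy.
    P has shape (A, S, S); R has shape (S, A)."""
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

# Toy MDP: action 0 stays put, action 1 swaps states; only staying in
# state 1 pays reward.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])
V, policy = value_iteration(P, R)
```

The variational view in [1] recasts this fixed-point computation as a stationarity condition of an information-field Lagrangian.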
Hosted by: Prof. Natasa Przulj
Understanding the deep human past requires analytical frameworks capable of integrating diverse datasets and tracing long-term trajectories of cultural and environmental change. Archaeology—uniquely positioned at the intersection of material culture, ecology, and human behaviour—holds unparalleled potential to address these challenges. This talk presents a suite of pioneering studies in which artificial intelligence, network science, and complexity theory are applied to Eurasian archaeological datasets, offering the most robust quantitative framework to date for modelling cooperation, exchange, and cultural co-evolution. The first part of the talk focuses on the origins of metallurgy in the Balkans between the 6th and 3rd millennia BC, where copper production and circulation first took recognisable regional form. Using trace element and lead isotope analyses from 410 artefacts across c. 80 sites (6200–3200 BC), we apply seven community detection algorithms—including Louvain, Leiden, Spinglass, and Eigenvector methods—to reconstruct prehistoric copper-supply networks. These models reveal stable and meaningful supply communities that correlate strikingly with regional archaeological cultures such as Vinča, KGK VI and Bodrogkeresztúr. By critically evaluating algorithm performance on archaeological compositional data, this case study not only demonstrates the power of network science for reconstructing prehistoric exchange but also challenges the traditional, typology-based concept of “archaeological culture.” It exemplifies how AI and complexity science can rigorously decode patterns of cooperation, resource movement, and social boundaries in the deep past.
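The pipeline behind the copper-supply analysis (compositional profiles to similarity graph to communities) can be illustrated with a deliberately simple stand-in for Louvain or Leiden: threshold a cosine-similarity graph over hypothetical trace-element profiles and read off connected components as communities.

```python
import numpy as np

def supply_communities(profiles, threshold=0.9):
    """Toy pipeline: artefact trace-element profiles -> cosine-similarity
    graph -> communities as connected components above a similarity
    threshold. (Far simpler than Louvain/Leiden, but the same idea:
    groups of artefacts chemically closer to each other than to the rest.)"""
    P = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    sim = P @ P.T
    n = len(P)

    # Union-find over edges exceeding the similarity threshold.
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i, j] >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

# Hypothetical profiles: two chemically distinct artefact groups.
profiles = np.array([[10.0, 1.0, 1.0], [10.0, 1.2, 0.9], [9.5, 1.0, 1.1],
                     [1.0, 10.0, 1.0], [1.1, 9.8, 1.0], [0.9, 10.0, 1.2]])
communities = supply_communities(profiles)
```

Modularity-based methods such as Louvain and Leiden refine this idea by optimizing community quality over the weighted graph rather than relying on a hard threshold.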
Hosted by: Prof. Zhiqiang Xu
In this talk, I will discuss the development of machine learning for combinatorial optimization, covering general methodology and especially generative models for AI4Opt. I will show how the idea of diffusion models can be adapted to solve notoriously hard combinatorial problems. I will also share some forward-looking ideas on future research directions.
Hosted by: Muhammad Haris Khan
We spend a lot of time training a network to recognize a fixed set of object types in a scene. If new object classes must later be added to the recognition engine, should we retrain the network from scratch? Can we instead tweak the network so that it incrementally learns new classes of objects? Unfortunately, any attempt to learn new concepts incrementally may also lead to forgetting, often catastrophic, of previously learnt concepts. Conversely, can we selectively forget a few concepts when required for socio-technical reasons? In this talk, we shall discuss how some of these objectives can be achieved.
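One widely used remedy for catastrophic forgetting (not necessarily the speaker's approach) is to penalize movement of parameters that mattered for earlier tasks, as in elastic weight consolidation. The sketch below shows the effect on a toy two-task least-squares problem; all data and the Fisher approximation are illustrative.

```python
import numpy as np

def train(X, y, w0=None, anchor=None, fisher=None, lam=0.0,
          lr=0.05, steps=500):
    """Gradient descent on least squares, optionally adding an EWC-style
    penalty lam * fisher * (w - anchor)^2 that discourages moving
    parameters that were important for an earlier task."""
    w = np.zeros(X.shape[1]) if w0 is None else w0.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        if anchor is not None:
            grad = grad + lam * fisher * (w - anchor)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
X_a, X_b = rng.standard_normal((200, 2)), rng.standard_normal((200, 2))
y_a = X_a @ np.array([2.0, -1.0])       # task A: "old" classes
y_b = X_b @ np.array([0.0, 3.0])        # task B: "new" classes

w_a = train(X_a, y_a)                   # learn task A first
# Fisher-style importance: for Gaussian least squares, the diagonal
# of X^T X / n.
fisher = np.mean(X_a ** 2, axis=0)

w_plain = train(X_b, y_b, w0=w_a)       # naive fine-tuning forgets A
w_ewc = train(X_b, y_b, w0=w_a, anchor=w_a, fisher=fisher, lam=10.0)
```

After naive fine-tuning the task-A error explodes, while the penalized run keeps it small; selective forgetting, mentioned in the talk, inverts this logic by deliberately releasing the anchors on concepts to be removed.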
Hosted by: Prof. Salman Khan
Recent advances in vision and language models have significantly improved 3D and 4D generation tasks. In this talk, I will present our latest research on 4D foundation models, view synthesis, real-time deformable 4D reconstruction from monocular video, reanimating deformable 3D reconstruction, and instant 4D scene inpainting.


Seville, Spain
Hangzhou, China
Sharm El Sheikh, Egypt
Rotterdam, Netherlands
Vienna, Austria
United Kingdom
Vancouver, Canada 