MBZUAI Nexus Speaker Series
Hosted by: Prof. Xiaosong Ma
"Storage technologies have entered the market with performance vastly superior to conventional storagedevices. This technology shift requires a complete rethinking of the software storage stack. In this talk I will give two examples of our work with Optane-based solid-state (block) devices that illustrate the need for and the benefit of a wholesale redesign. First, I will describe the KVell key-value (KV) store. The key observation underlying KVell is that conventional KV software on fast devices is bottlenecked by the CPU rather than by the device. KVell therefore focuses on minimizing CPU intervention. Second, I will describe the KVell+ OLTP/OLAP system built on top of KVell. The key underlying observation here is that these storage devices have become so fast that the conventional implementation of snapshot isolation – maintaining multiple versions – leads to intolerable space amplification. Kvell+ therefore processes versions as they are created. This talk describes joint work with Oana Balmau (McGill University), Khaled Elmeleegy (Coupang), Karan Gupta (Nutanix), Kimberley Keeton (Google), Baptiste Lepers (INRIA), Xiaoxiang Wu and Yuben Yang (Sydney)."
Hosted by: Prof. Chih-Jen Lin
"Online decision-making is the core engine behind intelligent systems that must learn from incomplete feedback and act in real-time, with ubiquitous applications ranging over adaptive recommendation system, e-commerce platform, autonomous vehicle navigation, and personalized healthcare assistance. To operate effectively, these agents must balance exploration against exploitation while navigating uncertainty and satisfying complex constraints. In this talk, I will present a research program for reliable and adaptive sequential decision-making, that bridges theoretical foundations with crucial real-world deployments. I will begin by briefly outlining decision-making in dynamic pricing under censored feedback, before extending this to various operational constraints like fairness, supply, and multi-stage bottlenecks. Then I will introduce ""Generative Online Learning"" as a combination of traditional decision-making framework with the emerging power of Generative AI, where agents strategically decide to either generate novel actions or select from the existing action list. I will demonstrate the impact of this framework through the architecture and deployment of a safe, adaptive maternal health chatbot. Finally, I will conclude with future directions in multi-party online learning, and adaptive in-context decision planning"
Hosted by: Prof. Mladen Kolar
"Many modern learning problems are studied in a proportional high-dimensional regime, where the feature dimension is of the same order as the sample size. In this talk, I will discuss how working in this regime affects both estimation and uncertainty quantification, and how we obtain useful and sharp characterizations for widely used estimators and algorithms. The first part will focus on ridge regression in linear models. We derive a distributional approximation for the ridge estimator via an associated Gaussian sequence model with “effective” noise and regularization parameters. This reduction provides a convenient way to analyze prediction and estimation risks and to support practical tuning rules, such as cross-validation and generalized cross-validation. It also yields a simple inference procedure based on a debiased ridge construction. The second part will take an algorithmic perspective. Instead of analyzing only the final empirical risk minimizer, we view gradient descent iterates as estimators along an optimization path. We characterize the distribution of the iterates and use this characterization to construct data-driven estimates of generalization error and debiased iterates for statistical inference, including in settings beyond linear regression. I will conclude with simulations that illustrate the practical implications for tuning and inference."
Hosted by: Prof. Mladen Kolar
"Large language models (LLMs) have transformed how we generate and process information, yet two foundational challenges remain: ensuring the authenticity of their outputs and accurately evaluating their true capabilities. In this talk, I argue that both challenges are fundamentally statistical problems, and that statistical thinking plays a central role in advancing reliable and principled research on large language models. I will present two lines of work that address these problems from a statistical perspective. The first part introduces a statistical framework for language watermarks, which embed imperceptible signals into model-generated text for provenance verification. By formulating watermark detection as a hypothesis testing problem, this framework identifies pivotal statistics, provides rigorous Type I error control, and derives optimal detection rules that are theoretically grounded and computationally efficient. It clarifies the theoretical limits of existing detection methods and guides the design of more robust and powerful detectors. The second part focuses on language model evaluation, where I study how to quantify the unseen knowledge that models possess but may not reveal through limited queries. I introduce a statistical pipeline, based on the smoothed Good–Turing estimator, to estimate the total amount of a model’s knowledge beyond what is observed in benchmark datasets. The findings reveal that even advanced LLMs often articulate only a fraction of their internal knowledge, suggesting a new perspective on evaluation and model competence. Together, these projects represent an ongoing effort to develop statistical foundations for trustworthy and reliable language models. This talk is based on the following works: https://arxiv.org/abs/2404.01245 https://arxiv.org/abs/2506.02058 and will briefly mention follow-up studies: https://arxiv.org/abs/2411.13868 https://arxiv.org/abs/2510.22007"
Hosted by: Prof. Eran Segal
"There is considerable interest in AI for health data science, driven by the rapid growth of available data and declining computational costs. The debate over when to use AI versus classical statistical methods in medical research is long-standing, but merits fresh consideration in light of major methodological advances and increased policy attention. AI-based approaches offer substantial opportunities, while recognising that we may be near the peak of the Gartner hype cycle for AI. Lei argues that AI and classical statistics are best suited to different scenarios and are often complementary. In some domains, AI is widely regarded as essential because of the complexity and multimodality of the data, which are frequently free-form. A key example is unstructured clinical text, where clinical reasoning and summarisation tasks are increasingly addressed by contemporary large language models, a class of generative AI. In domains where either AI or classical statistics could plausibly be used, combining the strengths of both approaches is often the most effective strategy. In this talk, Lei will illustrate how she has integrated AI, machine learning, and medical statistics in her research through worked examples (and her own paintings). The session has two parts: Part I: Large language models (LLMs) for risk prediction and clinical tasks Part II: Combining machine learning and medical statistics This talk is suitable for a mixed audience interested in data modelling and its application in real-world clinical settings."
Hosted by: Prof. Dezhen Song
Underactuated balance robots, such as rotational inverted pendulums, bicycles, and bipedal walkers, have more degrees of freedom than control inputs and must perform balancing and tracking tasks simultaneously. The balancing task requires the robot to maintain its motion around unstable equilibrium points, while the tracking task requires following desired trajectories. In this talk, I first review model-based control design for underactuated balance robots. A balance equilibrium manifold is proposed to capture both external trajectory tracking and internal balance performance. I will then present a machine learning-based control for underactuated balance robots. A Gaussian process is used to estimate the system dynamics, and the learning proceeds without the need for prior physical knowledge or successful balance demonstrations. Additional attractive properties of the design include guaranteed stability and closed-loop performance. Experiments on a Furuta pendulum and a bikebot are used to demonstrate the performance of the learning-based control design. Finally, I will present a few mechatronic design and motion control applications of underactuated balance robots, such as mobile manipulation with a bikebot, an autonomous bikebot with leg assistance, and autonomous vehicle ski-stunt maneuvers.
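The Gaussian-process dynamics estimation can be illustrated with a generic GP regression sketch (plain numpy, an invented sine-shaped stand-in for the unknown dynamics, and arbitrarily chosen hyperparameters; this is not the speaker's controller).

```python
# Generic Gaussian-process regression sketch: learn a state-to-dynamics mapping
# from data alone, with no prior physical model. Not the speaker's design.
import numpy as np

def rbf(A, B, ell=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_fit_predict(X, y, Xs, noise=1e-2):
    K = rbf(X, X) + noise * np.eye(len(X))     # kernel matrix on training states
    alpha = np.linalg.solve(K, y)
    Ks = rbf(Xs, X)
    mean = Ks @ alpha                          # posterior mean prediction
    var = rbf(Xs, Xs).diagonal() - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, var                           # variance quantifies model uncertainty

rng = np.random.default_rng(1)
X = rng.uniform(-np.pi, np.pi, (40, 1))                  # angles observed during runs
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(40)     # invented "unknown dynamics"
Xs = np.linspace(-np.pi, np.pi, 5)[:, None]
mean, var = gp_fit_predict(X, y, Xs)
print(np.round(mean, 2), np.round(var, 3))
```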
Hosted by: Prof. Yoshihiko Nakamura
For robots to move from automated tools to reliable collaborators, they must tightly couple perception, decision-making, and action. Today’s robotic systems rely heavily on deep learning for sensing and control, yet lack explicit reasoning, which limits robustness, interpretability, and trust in real-world deployment. My research addresses this gap by unifying learning-based perception with knowledge-based reasoning under a trustworthy-by-design framework. I will first present methods for embedding formal logic into neural models, enabling robots to learn from limited data while maintaining structured constraints that improve robustness and transparency. Building on this, I will show how neuro-symbolic integration allows robots to reason about human intent, anticipate goals, and plan task-oriented actions in unstructured, human-centered environments. Finally, I will introduce a training-free self-correction approach using generative models, aimed at reducing hallucinations and unsafe behavior in robotic decision pipelines. Together, these results point toward robotic agents that can be instructed, corrected, and trusted: systems that combine learning with explicit knowledge, adapt online to real-world uncertainty, and collaborate effectively with humans in everyday settings.
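One common way to embed a formal-logic rule into a neural model, shown schematically below, is to relax an implication into a differentiable penalty (an illustrative fragment with an invented grasp/visible rule, not the speaker's architecture).

```python
# Schematic sketch of a soft logic constraint in a neural loss: the implication
# "grasp(x) -> visible(x)" becomes a differentiable penalty that fires when the
# network assigns high probability to the action but not its precondition.
# Illustrative only; not the speaker's method.
import torch

def implication_penalty(p_action, p_precond):
    # penalize P(action) exceeding P(precondition)
    return torch.relu(p_action - p_precond).mean()

logits = torch.randn(8, 2, requires_grad=True)     # [:, 0] = grasp, [:, 1] = visible
probs = torch.sigmoid(logits)
task_loss = torch.nn.functional.binary_cross_entropy(probs, torch.rand(8, 2))
loss = task_loss + 1.0 * implication_penalty(probs[:, 0], probs[:, 1])
loss.backward()                                    # the rule shapes gradients alongside data
print(float(loss))
```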
Hosted by: Prof. Natasa Przulj
Understanding the deep human past requires analytical frameworks capable of integrating diverse datasets and tracing long-term trajectories of cultural and environmental change. Archaeology—uniquely positioned at the intersection of material culture, ecology, and human behaviour—holds unparalleled potential to address these challenges. This talk presents a suite of pioneering studies in which artificial intelligence, network science, and complexity theory are applied to Eurasian archaeological datasets, offering the most robust quantitative framework to date for modelling cooperation, exchange, and cultural co-evolution. The first part of the talk focuses on the origins of metallurgy in the Balkans between the 6th and 3rd millennia BC, where copper production and circulation first took recognisable regional form. Using trace element and lead isotope analyses from 410 artefacts across c. 80 sites (6200–3200 BC), we apply seven community detection algorithms—including Louvain, Leiden, Spinglass, and Eigenvector methods—to reconstruct prehistoric copper-supply networks. These models reveal stable and meaningful supply communities that correlate strikingly with regional archaeological cultures such as Vinča, KGK VI and Bodrogkeresztúr. By critically evaluating algorithm performance on archaeological compositional data, this case study not only demonstrates the power of network science for reconstructing prehistoric exchange but also challenges the traditional, typology-based concept of “archaeological culture.” It exemplifies how AI and complexity science can rigorously decode patterns of cooperation, resource movement, and social boundaries in the deep past.
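For readers unfamiliar with these methods, the sketch below runs Louvain community detection on a toy similarity-weighted graph with networkx (generic library usage with invented sites and weights, not the study's pipeline; requires networkx >= 2.8).

```python
# Generic Louvain community detection on a toy graph: nodes stand in for sites,
# edges are weighted by compositional similarity, and the algorithm partitions
# the graph into candidate supply communities. Not the study's actual pipeline.
import networkx as nx
from networkx.algorithms.community import louvain_communities

G = nx.Graph()
# hypothetical edges weighted by lead-isotope / trace-element similarity
G.add_weighted_edges_from([
    ("site_A", "site_B", 0.9), ("site_B", "site_C", 0.8),
    ("site_D", "site_E", 0.7), ("site_E", "site_F", 0.9),
    ("site_C", "site_D", 0.1),          # weak tie between the two clusters
])
communities = louvain_communities(G, weight="weight", seed=42)
print(communities)   # two supply communities split across the weak tie
```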


Seville, Spain
Hangzhou, China
Sharm El Sheikh, Egypt
Rotterdam, Netherlands
Vienna, Austria
United Kingdom
Vancouver, Canada 