MBZUAI Nexus Speaker Series
Hosted by: Prof. Chih-Jen Lin
How can we design learning systems that resemble the brain—able to adapt continually, learn from streams, and generalize without a flood of labeled data? This talk explores recent advances in sparse and modular neural networks that push machine learning in that direction. By selecting only the most informative experiences from a stream, enforcing sparsity to balance stability and plasticity, and leveraging modular structure to reduce interference and improve efficiency, we can move toward models that learn more like animals and humans. The focus is not on scaling up to larger black boxes, but on rethinking how learning itself happens under constraints. The result is a neuro-inspired agenda for machine learning that emphasizes adaptability, efficiency, and robustness in open-ended environments.
Hosted by: Prof. Elizabeth Churchill
Curious about how we can design AI systems that truly center human values? This talk introduces Bidirectional Human-AI Alignment, which posits alignment as a dynamic, mutual process that goes beyond simply integrating human goals into AI. By balancing AI-centered and human-centered perspectives, we can preserve human agency, foster critical engagement, and adapt societal approaches to AI that benefit humanity. To ground the discussion, we will examine a case study of how AI is being used to support healthcare decision-making.
Hosted by: Prof. Elizabeth Churchill
This talk presents a retrospective on my research into “prompt engineering” for text-to-image (TTI) generation – an example where humans were creatively empowered by generative AI. I trace how online communities were instrumental in shaping the practice of prompting and how challenges persist to this day in the creative use of TTI systems. While TTI generative systems enable anyone to produce digital images and artworks through language, this apparent democratization conceals deeper issues of control, authorship, and alignment. I argue that prompt engineering is not merely a creative technique but a symptom of a broader misalignment between human intent and system behavior. Extending this lens, I discuss how prompting has diffused into the wider research field of Human-Computer Interaction (HCI), where it risks fostering tool-driven novelty at the expense of conceptual progress and meaningful insight. What is harmful is not that prompting fails to translate human intent efficiently, but that it is brittle and encodes a mode of interaction that prioritizes prompt tuning and short-lived prototyping over deeper understanding. I conclude by outlining a vision for reflective and scalable stewardship in HCI research.
Hosted by: Prof. Preslav Nakov
The deployment of large language models at scale presents fundamental challenges in computational efficiency, system architecture, and resource allocation. In this talk, I present a comprehensive research agenda addressing these challenges through three interconnected contributions that span the full stack of LLM system optimization. First, I introduce Archon, a modular multi-agent framework that orchestrates specialized LLMs through automated agent selection and inference-time fusion. Our system demonstrates a 15.1 percentage point improvement over GPT-4o on MT-Bench, establishing a new state of the art for multi-agent architectures while maintaining computational tractability through strategic model selection algorithms. Second, I present Weaver, a framework that addresses the critical challenge of verifying LLM-generated responses at test time. By combining multiple weak verifiers—including reward models and LM judges—through weak supervision techniques adapted from data labeling, Weaver significantly improves response selection without requiring ground truth labels. Our system achieves 87.7% average accuracy across reasoning and mathematics tasks, matching the performance of OpenAI's o3-mini with a non-reasoning Llama 3.3 70B model. Through distillation to a 400M-parameter cross-encoder, we retain 98.2% of Weaver's performance gains while reducing verification compute by 99.97%, demonstrating how strategic aggregation of imperfect verifiers can enable scalable, label-efficient verification for inference-time decision-making. Finally, I introduce TrafficBench, the first comprehensive benchmark for local-cloud LLM routing, comprising 1M real-world queries evaluated across 10 models and 4 hardware accelerators. Our analysis reveals that 80.7% of production inference workloads can be served by sub-20B parameter models on edge devices. Leveraging these insights, we develop novel routing algorithms that reduce energy consumption by 77%, computational requirements by 67%, and operational costs by 60% while maintaining task accuracy within 5% of frontier models. These contributions collectively demonstrate how principled system design can transform the deployment landscape of large language models, making them simultaneously more capable, efficient, and accessible for real-world applications.
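To make the idea of aggregating imperfect verifiers concrete, here is a minimal, hypothetical sketch (not the Weaver algorithm itself): several binary verifiers vote on candidate responses, and each verifier's vote is weighted by the log-odds of its estimated accuracy, so that more reliable verifiers count more. The function name, vote matrix, and accuracy values are illustrative assumptions, not anything from the talk.

```python
# Hypothetical sketch of weighted aggregation of imperfect verifiers.
# votes[v][c] is verifier v's binary vote on candidate c;
# accuracies[v] is verifier v's estimated accuracy in (0.5, 1.0).
import math

def select_response(votes, accuracies):
    """Return the index of the candidate with the highest weighted vote score."""
    n_candidates = len(votes[0])
    scores = []
    for c in range(n_candidates):
        # Log-odds weighting: a verifier at 0.9 accuracy outweighs one at 0.6.
        s = sum((1 if votes[v][c] else -1) * math.log(acc / (1 - acc))
                for v, acc in enumerate(accuracies))
        scores.append(s)
    return max(range(n_candidates), key=lambda c: scores[c])

# Three verifiers judging three candidate responses.
votes = [
    [1, 0, 1],  # verifier 0 accepts candidates 0 and 2
    [0, 0, 1],  # verifier 1 accepts only candidate 2
    [1, 1, 0],  # verifier 2 (least reliable) accepts 0 and 1
]
accuracies = [0.8, 0.9, 0.6]
best = select_response(votes, accuracies)  # candidate 2 wins
```

In practice, weak supervision methods estimate the accuracies themselves from verifier agreement patterns, without ground-truth labels; this sketch treats them as given to keep the aggregation step visible.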
Hosted by: Prof. Elizabeth Churchill
Designing the next generation of human-computer interactions requires a deeper understanding of how cognition unfolds in context, shaped not only by the user’s mental and bodily states but also by their dynamic interaction with the surrounding environment. In this talk, I present a research agenda that brings together cognitive neuroscience, brain-computer interfaces (BCIs), and wearable sensing to inform the design of ubiquitous, adaptive, and unobtrusive interactive systems. Using tools such as mobile EEG, eye-tracking, motion sensors, and environment-aware computing, my work investigates how people perceive, act, and make decisions in natural settings, from high-load operational tasks such as flying a plane to everyday behaviors like walking around a city or eating a meal. This approach moves beyond screen-based interaction to develop systems that respond to users in real time, based on the continuous coupling between brain, body, and environment. By embedding cognitive and contextual awareness into system design, we can move toward calm, seamless technologies that adapt fluidly to the user’s moment-to-moment needs.
Hosted by: Prof. Natasa Przulj
The rapid growth of open-access omics data has enabled large-scale exploration of cellular states across species, tissues, and molecular modalities. Building on these resources, cellular foundation models use self-supervised learning to derive general cell representations that can be adapted to diverse downstream biological tasks, including the prediction of responses to chemical and genetic perturbations. This presentation reviews their use in modeling cellular perturbations, describing common learning frameworks, data requirements, and evaluation practices, as well as key challenges specific to single-cell data. We note emerging gaps between reported results and standardized evaluations, which highlight persistent issues in how performance is quantified across studies and benchmarks. Overall, this presentation provides an overview of the current landscape of single-cell foundation models, emphasizing both their progress and limitations in capturing perturbation-specific responses.
Hosted by: Prof. Preslav Nakov
To move beyond tools and towards true partners, AI systems must bridge the gap between perception-driven deep learning and knowledge-based symbolic reasoning. Current approaches excel at one or the other, but not both, limiting their reliability and preventing us from fully trusting them. My research addresses this challenge through a principled fusion of learning and reasoning, guided by the principle of building AI that is "Trustworthy by Design." I will first describe work on embedding formal logic into neural networks, creating models that are not only more robust and sample-efficient, but also inherently more transparent. Building on this foundation, I will show how neuro-symbolic integration enables robots to reason about intent, anticipate human needs, and perform task-oriented actions in unstructured environments. Finally, I will present a novel training-free method that leverages generative models for self-correction, tackling the critical problem of hallucination in modern AI. Together, these contributions lay the groundwork for intelligent agents that can be instructed, corrected, and ultimately trusted: agents that learn from human knowledge, adapt to real-world complexity, and collaborate seamlessly with people in everyday environments.

December 15 – 18, 2025 — MBZUAI Reception at International Conference on Statistics and Data Science (ICSDS) 2025
Hangzhou, China
Sharm El Sheikh, Egypt
Rotterdam, Netherlands
Vienna, Austria
United Kingdom
Vancouver, Canada 