Interests
Deep Learning
Transfer Learning
Dimension Reduction
Sequence Modeling
Dynamical Systems
I am a Research Scientist at Lawrence Berkeley National Laboratory. I also lead the Deep Learning Group at the International Computer Science Institute (ICSI), an affiliated institute of UC Berkeley. Prior to this role, I was an Assistant Professor (Tenure-Track) for Data-driven Modeling and Artificial Intelligence in the Department of Mechanical Engineering and Materials Science at the University of Pittsburgh, from September 2021 to December 2022. Before joining Pitt, I was a postdoctoral researcher in the Department of Statistics at UC Berkeley, where I worked with Michael Mahoney; I was also part of the RISELab in the Department of Electrical Engineering and Computer Sciences (EECS) at UC Berkeley. Before that, I was a postdoc in the Department of Applied Mathematics at the University of Washington (UW), working with Nathan Kutz and Steven Brunton. I earned my PhD in Statistics from the University of St Andrews in December 2017, where I also completed an MSc in Applied Statistics.
I am broadly interested in understanding what makes deep learning systems work, and how we can make them more robust, secure, and efficient. My research combines scientific machine learning, dynamical systems, generative modeling, and AI safety, with an emphasis on building systems that are reliable under real-world deployment conditions.
AI Safety and Security. My group develops methods to understand and mitigate adversarial vulnerabilities in large language models, including jailbreaking, prompt injection, and backdoor attacks. We are especially interested in emerging risks introduced by multi-turn and multi-agent workflows, tool use, long-context reasoning, and rapidly improving model capabilities.
Diffusion and Flow Matching Models. My group works to understand and improve the efficiency, reliability, and controllability of diffusion and flow-matching models. We develop methods that improve inference-time efficiency and sample fidelity, as well as methods for uncertainty quantification. We are also interested in combining diffusion models with language models for scientific reasoning and prediction.
Sequential and Dynamical Systems Models. My group develops models and methods for processing sequential data, with a particular interest in continuous-time formulations, state-space models, linear attention mechanisms, and neural architectures inspired by dynamical systems. Viewing neural networks through this lens helps characterize long-term behavior, stability, and inductive bias, and points toward new architectures informed by numerical integration, stochastic differential equations, and scientific computing.
AI for Science Applications. My group applies these methods to problems such as super-resolution, spatio-temporal forecasting, and conditional generation in scientific domains ranging from earth science and fluid dynamics to materials science and bioinformatics.
Two papers accepted at ICML 2026
PRISM: Distribution-free Adaptive Computation of Matrix Functions for Accelerating Neural Network Training (preprint).
D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting.
Two papers accepted at AISTATS 2026
I am co-organizing the Deep Learning for Science Summer School (DL4Sci 26)
I will serve as an area chair for ICLR 2026, ICML 2026, and NeurIPS 2026.
One paper accepted at NeurIPS 2025
Block-Biased Mamba for Long-Range Sequence Processing (preprint).
One paper accepted at ICML 2025
Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection (preprint).
Two papers accepted at ICLR 2025 (one as a spotlight)
I will serve as an area chair for ICLR 2025, ICML 2025, and NeurIPS 2025.
One paper accepted at AISTATS 2025
Gated Recurrent Neural Networks with Weighted Time-Delay Feedback (preprint).
Two papers accepted at ICLR 2024 (one as a spotlight)
Robustifying State-space Models for Long Sequences via Approximate Diagonalization.
Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs.
Yihan Wang, Visiting PhD Student
Aditi Gupta, ML Engineer
Ben Hung, Undergrad Researcher
Ananya Gupta, Undergrad Researcher
Yotam Yaniv (Postdoc 2024-26; now ML Engineer at Google).
Cici Wang (Undergraduate research intern 2023-25; now graduate student at U Chicago).
Garry Gao (Graduate researcher 2023-24; now Software Engineer at Amazon).
Junyi Guo (Graduate researcher 2022-24; now PhD student at University of Notre Dame).
Jialin Song (Graduate researcher 2021-24; now PhD student at Simon Fraser University).
Yixiao Kang (Graduate researcher 2023-24; now ML Engineer at Meta).
Daniel Barron (Graduate researcher 2023-24; now Software Engineer at Amazon).
Olina Mukherjee (High School student researcher, 2021-22; now undergraduate student at CMU).
Ziang Cao (Undergraduate research intern 2020-21; now graduate student at Stanford).
Francisco Utrera (Graduate researcher 2019-22; now Senior ML Engineer at Erithmitic).
Evan Kravitz (Graduate researcher 2019-20; now Software Engineer at Amazon).
Vanessa Lin (Undergraduate research intern 2018-19; now at Google).
Qixuan Wu (Undergraduate research intern 2018-19; now at Goldman Sachs).