Hello!
My name is Jiahai (pronounced ji-YAH-hi).
I am a computer science Ph.D. student at UC Berkeley, where I am fortunate to be advised by Jacob Steinhardt and Stuart Russell.
You can reach me at fjiahai at berkeley dot edu.
My research interests broadly span interpretable AI, from designing neurosymbolic AI systems to investigating how neural networks represent meaning. I see this as a path towards safe and robust AI.
In 2023, I completed my undergraduate studies at MIT, where I majored in computation + cognition and physics.
Before that, I spent most of my life in Singapore, which I'm happy to call home.
Publications / Preprints
How do Language Models Generalize to Facts They are Trained on?
Jiahai Feng, Stuart Russell, Jacob Steinhardt
pdf | Working draft
Monitoring Latent World States in Language Models with Propositional Probes
Jiahai Feng, Stuart Russell, Jacob Steinhardt
pdf | In submission
How do Language Models Bind Entities in Context?
Jiahai Feng, Jacob Steinhardt
pdf | ICLR 2024
Learning adaptive planning representations with natural language guidance
Lionel Wong, Pratyusha Sharma, Zachary Siegel, Jiahai Feng, Noa Korneev, Josh Tenenbaum, Jacob Andreas
pdf | ICLR 2024
Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks
Katherine Collins, Catherine Wong, Jiahai Feng, Megan Wei, Josh Tenenbaum
pdf | CogSci 2022
AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity
Silviu-Marian Udrescu, Andrew Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, Max Tegmark
pdf | NeurIPS 2020