Jiahai Feng

Hello!

My name is Jiahai (pronounced ji-YAH-hi).

I am a computer science Ph.D. student at UC Berkeley, where I am fortunate to be advised by Jacob Steinhardt and Stuart Russell.

You can reach me at fjiahai at berkeley dot edu

I have a broad research interest in interpretable AI, whether that means designing neurosymbolic AI systems or investigating how neural networks represent meaning. I see this as a path towards safe and robust AI.

In 2023, I completed my undergraduate studies at MIT. I majored in computation + cognition and physics.

Before that, I spent most of my life in Singapore, which I’m happy to call home.

Publications / Preprints

Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts
Jiahai Feng, Stuart Russell, Jacob Steinhardt
pdf | Preprint

Monitoring Latent World States in Language Models with Propositional Probes
Jiahai Feng, Stuart Russell, Jacob Steinhardt
pdf | ICLR 2025

How do Language Models Bind Entities in Context?
Jiahai Feng, Jacob Steinhardt
pdf | ICLR 2024

Learning adaptive planning representations with natural language guidance
Lionel Wong, Pratyusha Sharma, Zachary Siegel, Jiahai Feng, Noa Korneev, Josh Tenenbaum, Jacob Andreas
pdf | ICLR 2024

Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks
Katherine Collins, Catherine Wong, Jiahai Feng, Megan Wei, Josh Tenenbaum
pdf | CogSci 2022

AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity
Silviu-Marian Udrescu, Andrew Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, Max Tegmark
pdf | NeurIPS 2020