Nouha Dziri


I’m a research scientist at the Allen Institute for AI (AI2), working with Yejin Choi. I currently co-lead the safety and post-training effort at AI2 to build OLMo, a highly capable and truly open LLM to advance AI.

Besides this, I work on understanding the limits of Transformers and their inner workings. Check out “Faith and Fate” to learn about the limits of Transformers on reasoning tasks. I also study alignment in LLMs; check out “Roadmap to Pluralistic Alignment”, “Fine-Grained RLHF”, and “RewardBench”, the first evaluation tool for reward models.

My work has been featured in TechCrunch, Le Monde, The Economist, and Science News.

I have been fortunate to work with brilliant researchers in the field: Siva Reddy at Mila/McGill; Hannah Rashkin, Tal Linzen, David Reitter, Diyi Yang, and Tom Kwiatkowski at Google Research NYC; and Alessandro Sordoni and Geoff Gordon at Microsoft Research Montreal.

News

Oct 2024 :tada: WildTeaming :lion: and WildGuard :crossed_swords: were accepted at NeurIPS 2024. See you in Vancouver! :canada:
Sep 2024 :toolbox: :crossed_swords: New blog post about AI safety: “Current Paradigms of LLMs Safety Alignment are Superficial”.
Sep 2024 :bellhop_bell: :strawberry: New blog post about o1 models and LLM reasoning: “Have o1 Models Cracked Human Reasoning?”.
Aug 2024 :tada: Super excited that our workshop “System 2 Reasoning At Scale” was accepted to NeurIPS 2024 in Vancouver! Mark your calendar for Dec 15, 2024!
Jul 2024 Check out our :fire: new safety moderation tool :fire: WildGuard: a state-of-the-art open tool for assessing safety risks, jailbreaks, and refusals in LLMs.
Jul 2024 New red-teaming method :lion: WildTeaming: an automatic red-teaming framework that discovers novel jailbreaks from in-the-wild user-LLM interactions.
Jul 2024 Check out my interview with Science News Magazine about LLMs’ reasoning skills, featuring “Faith and Fate” and “The Generative AI Paradox”.
Jul 2024 I will serve as a Demo Chair for NAACL 2025.
Jun 2024 I will serve as a Senior Area Chair for ACL 2025 in the area of Ethics, Bias, and Fairness.
Jun 2024 Check out my interview with Le Monde (the French equivalent of the NYT) about hallucinations in LLMs.
May 2024 Invited Talk “What it can create, it may not understand: Studying the Limits of Transformers” at the University of Cambridge.
May 2024 I served as an Area Chair for COLM 2024 in the area of Safety in LLMs.