Publications

Publications by category in reverse chronological order.

2024

  1. wildguard.png
    WildGuard: Open one-stop moderation tools for safety risks, jailbreaks, and refusals of LLMs
    Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, and Nouha Dziri
    In arXiv, Jul 2024
  2. wildteaming.png
    WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
    Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, and Nouha Dziri
    In arXiv, Jul 2024
  3. wildbench.png
    WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
    Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, and Yejin Choi
    In arXiv, Jun 2024
  4. reward.png
    RewardBench: Evaluating reward models for language modeling
    Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A Smith, and Hannaneh Hajishirzi
    In arXiv, Mar 2024
  5. plua.png
    A roadmap to pluralistic alignment
    Taylor Sorensen, Jared Moore, Jillian Fisher, Mitchell Gordon, Niloofar Mireshghallah, Christopher Michael Rytting, Andre Ye, Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, and Yejin Choi
    In ICML, Jul 2024
  6. inductive.png
    Phenomenal yet puzzling: Testing inductive reasoning capabilities of language models with hypothesis refinement
    Linlu Qiu, Liwei Jiang, Ximing Lu, Melanie Sclar, Valentina Pyatkin, Chandra Bhagavatula, Bailin Wang, Yoon Kim, Yejin Choi, Nouha Dziri, and Xiang Ren
    In ICLR, Jan 2024
  7. paradox.png
    The Generative AI Paradox: What It Can Create, It May Not Understand
    Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, and Yejin Choi
    In ICLR, Jan 2024
  8. urial.png
    The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
    Bill Yuchen Lin, Abhilasha Ravichander, Ximing Lu, Nouha Dziri, Melanie Sclar, Khyathi Chandu, Chandra Bhagavatula, and Yejin Choi
    In ICLR, Jan 2024
  9. kaleido.png
    Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
    Taylor Sorensen, Liwei Jiang, Jena Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, and Yejin Choi
    In AAAI, Sep 2024

2023

  1. faith.png
    Faith and Fate: Limits of Transformers on Compositionality
    Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, Bill Yuchen Lin, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena D. Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, and Yejin Choi
    In NeurIPS, Jun 2023
  2. rlhf.png
    Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
    Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah Smith, Mari Ostendorf, and Hannaneh Hajishirzi
    In NeurIPS, Jun 2023
  3. defeasible.png
    What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
    In EMNLP, May 2023
  4. ipa.png
    Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
    Ximing Lu, Faeze Brahman, Peter West, Jaehun Jang, Khyathi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, and Yejin Choi
    In EMNLP, May 2023
  5. refine.png
    Self-refine: Iterative refinement with self-feedback
    Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Sean Welleck, Bodhisattwa Prasad Majumder, Shashank Gupta, Amir Yazdanbakhsh, and Peter Clark
    In NeurIPS, Mar 2023
  6. champagne.png
    CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos
    Seungju Han, Jack Hessel, Nouha Dziri, Yejin Choi, and Youngjae Yu
    In ICCV, Mar 2023
  7. ewr.png
    Elastic weight removal for faithful and abstractive dialogue generation
    Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych, and Edoardo M Ponti
    In arXiv, Mar 2023
  8. qa.png
    Evaluating Open-Domain Question Answering in the Era of Large Language Models
    Ehsan Kamalloo, Nouha Dziri, Charles Clarke, and Davood Rafiei
    In ACL (oral), Jul 2023

2022

  1. begin.png
    Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
    Nouha Dziri, Hannah Rashkin, Tal Linzen, and David Reitter
    TACL, May 2022
  2. faithdial.png
    FaithDial: A Faithful Benchmark for Information-Seeking Dialogue
    Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, Edoardo Ponti, and Siva Reddy
    TACL, Apr 2022
  3. hall.png
    On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?
    Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, and Siva Reddy
    NAACL, Jul 2022

2021

  1. nph.png
    Neural path hunter: Reducing hallucination in dialogue systems via path grounding
    Nouha Dziri, Andrea Madotto, Osmar Zaiane, and Avishek Joey Bose
    EMNLP, Nov 2021
  2. demi.png
    Decomposed mutual information estimation for contrastive representation learning
    Alessandro Sordoni*, Nouha Dziri*, Hannes Schulz*, Geoff Gordon, Philip Bachman, and Remi Tachet Des Combes
    ICML, Jul 2021

2019

  1. eval.png
    Evaluating Coherence in Dialogue Systems using Entailment
    Nouha Dziri, Ehsan Kamalloo, Kory Mathewson, and Osmar Zaiane
    In NAACL-HLT, Jun 2019
  2. THRED.png
    Augmenting Neural Response Generation with Context-Aware Topical Attention
    Nouha Dziri, Ehsan Kamalloo, Kory Mathewson, and Osmar Zaiane
    In Proceedings of the First Workshop on NLP for Conversational AI (NLP4ConvAI) at ACL 2019, Aug 2019

2018

  1. emotion.png
    Automatic Dialogue Generation with Expressed Emotions
    Chenyang Huang, Osmar Zaïane, Amine Trabelsi, and Nouha Dziri
    In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), Jun 2018