In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI Shayne Longpre et al.
ICML 2025 [Paper]
Language model developers should report train-test overlap Andy Zhang et al.
ICML 2025 [Paper]
Toward an Evaluation Science for Generative AI Systems Laura Weidinger*, Inioluwa Deborah Raji* et al.
National Academy of Engineering 2025 [Paper]
Beyond Release: Access Considerations for Generative AI Systems Irene Solaiman, Rishi Bommasani et al.
[Paper]
The Reality of AI and Biorisk Aidan Peppin et al.
FAccT 2025 [Paper]
Considerations for Governing Open Foundation Models Rishi Bommasani et al.
Science 2024 [Paper]
Effective Mitigations for Systemic Risks from General-Purpose AI Risto Uuk et al.
[Paper]
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources Shayne Longpre*, Stella Biderman* et al.
TMLR 2024 (Outstanding Survey Paper) [Paper][Website]
Interim International Scientific Report on the Safety of Advanced AI Yoshua Bengio et al.
AI Seoul Summit 2024 [Paper][Website]
The 2024 Foundation Model Transparency Index Rishi Bommasani*, Kevin Klyman* et al.
TMLR 2024 [Paper][Website][Data]
On the Societal Impact of Open Foundation Models Sayash Kapoor*, Rishi Bommasani* et al.
ICML 2024 (Oral, top 1.5% of papers) [Paper][Website]
A Safe Harbor for AI Evaluation and Red Teaming Shayne Longpre et al.
ICML 2024 (Oral, top 1.5% of papers) [Paper][Website]
Foundation Model Transparency Reports Rishi Bommasani et al.
AIES 2024 (Oral, top 1.5% of papers) [Paper]
Ecosystem Graphs: The Social Footprint of Foundation Models Rishi Bommasani et al.
AIES 2024 [Paper][Website][Blog][Code]
AI Regulation Has Its Own Alignment Problem: The Technical and Institutional Feasibility of Disclosure, Registration, Licensing, and Auditing Neel Guha*, Christie M. Lawrence* et al.
George Washington Law Review 2024 [Paper][Policy Brief]
The 2023 Foundation Model Transparency Index Rishi Bommasani*, Kevin Klyman* et al.
TMLR 2024 (Outstanding Paper) [Paper][Website][Blog][Data]
Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes Connor Toups*, Rishi Bommasani* et al.
NeurIPS 2023 [Paper][Code]
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs Deepak Narayanan et al.
NeurIPS 2023 [Paper]
Evaluation for Change Rishi Bommasani
ACL 2023 [Paper]
Evaluating Human-Language Model Interaction Mina Lee et al.
TMLR 2023 [Paper][Code]
Holistic Evaluation of Language Models Percy Liang*, Rishi Bommasani*, Tony Lee* et al.
TMLR 2023 (Best Paper) [Paper][Website][Blog][Code]
Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization? Rishi Bommasani et al.
NeurIPS 2022 [Paper][Code]
Emergent Abilities of Large Language Models Jason Wei et al.
TMLR 2022 (Outstanding Survey Paper) [Paper][Blog]
Data Governance in the Age of Large-Scale Data-Driven Language Technology Yacine Jernite et al.
FAccT 2022 [Paper]
The Time Is Now to Develop Community Norms for the Release of Foundation Models Percy Liang, Rishi Bommasani, Kathleen A. Creel, Rob Reich
[Blog][Op-ed]