large language models

Do Membership Inference Attacks Work on Large Language Models?

A large-scale evaluation of membership inference attacks (MIAs) on LLMs shows that they perform barely better than random guessing, which the authors attribute to the combination of large training datasets and few training iterations, and to an inherently fuzzy boundary between members and non-members.

Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi

My submission to the TDC Trojan Detection Challenge

A description of my entry to the TDC Trojan Detection Challenge (co-located with NeurIPS 2023)

Anshuman Suri

Last updated on Dec 18, 2023 · 8 min read

SoK: Memorization in General-Purpose Large Language Models

We explore memorization in Large Language Models (LLMs), categorize it into six types, and discuss the implications and challenges of each.

Valentin Hartmann, Anshuman Suri, Vincent Bindschaedler, David Evans, Shruti Tople, Robert West