Preprints

Papers in submission, by category.

The Revealed Preferences of Pre-authorized Licenses and Their Ethical Implications for Generative Models
Suriyakumar, V.M., P. Menell, D. Hadfield-Menell, A. Wilson. GenLaw Workshop at ICML 2024. (by request)

Security & Safety

The Role of Temperature Sampling in Adaptive Language Model Evaluation.
In Submission.
Alur, R.*, V.M. Suriyakumar*, J. S. Sekhon, M. Raghavan, A. Wilson. 2026. * Denotes equal contribution

The Moderation Learning Curve for Open Generative AI Intermediaries.
In Submission.
Suriyakumar, V.M., T. Gillespie, A. Wilson. 2026.

On Watermarking Protein Language Models: Possibilities and Limitations.
Technical Report.
Suriyakumar, V.M. 2025. (by request)

Poison-then-Hide: Finetuning-Activated Backdoor Attack on Pretrained Vision Encoders.
In Submission.
Jin, A., A. Gourabathina, V.M. Suriyakumar, W. Gerych, M. Ghassemi. 2025. (by request)

TOGA: Trigger Optimization for Clean Data Ordering Backdoor Attack.
In Submission.
Jin, A., W. Gerych, A. Gourabathina, V.M. Suriyakumar, M. Ghassemi. 2025. (by request)

UCD: Unlearning in LLMs with Contrastive Decoding.
In Submission.
Suriyakumar, V.M., A. Sekhari*, A. Wilson*. 2025. * Denotes equal supervision

Layered Unlearning for Adversarial Relearning.
In Submission.
Qian, T., V.M. Suriyakumar, A. Wilson, D. Hadfield-Menell. 2025.

Fairness

Iterative Nullification Transforms for Debiasing Vision-Language Models at Test-Time.
In Submission.
Perian, Q., V.M. Suriyakumar, M. Ghassemi. 2025.