Model Extraction Attacks and Defenses for LLMs
Capture Summary
Survey of model extraction attacks and defenses specific to LLMs. The search result groups attacks into three categories (functionality extraction, training data extraction, and prompt-targeted attacks) and lists defenses under model protection, data privacy, and prompt protection.
Relevance
- Fills a current gap in the vault's model extraction coverage.
- Useful for research on hosted API protection, watermarking, behavioral detection, and query auditing.
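As a rough illustration of the query auditing mentioned above, the sketch below counts each client's queries in a sliding time window and flags clients that exceed a budget. It is not taken from the survey: the `QueryAuditor` class, its parameters, and the thresholds are all hypothetical, and a real hosted API would likely combine such counts with other behavioral signals.

```python
from collections import defaultdict, deque


class QueryAuditor:
    """Hypothetical per-client sliding-window query counter (illustrative only)."""

    def __init__(self, window_seconds=60.0, max_queries=100):
        # Both values are assumed placeholders, not recommendations from the survey.
        self.window_seconds = window_seconds
        self.max_queries = max_queries
        self._timestamps = defaultdict(deque)

    def record(self, client_id, now):
        """Log one query at time `now` (seconds); return True if the client is over budget."""
        window = self._timestamps[client_id]
        window.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_queries
```

A caller would invoke `record(client_id, time.time())` on each request and route flagged clients to throttling or manual review; what happens after a flag is a policy choice outside this sketch.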
Collection Notes
Collected as a model extraction survey.