AI models can acquire backdoors from surprisingly few malicious documents
Anthropic study suggests “poison” training attacks don’t scale with model size.
Scraping the open web for AI training data can have its drawbacks. On Thursday, researchers from Anthropic, the UK AI Security Institute, and the Alan Turing Institute released a preprint research paper suggesting that large language models like…