Large language models such as GPT-3, which use machine learning to generate human-like text, represent a turning point for cyber security, according to research from cybersecurity firm WithSecure.
WithSecure researchers recently conducted a series of experiments using prompt engineering – a method for discovering inputs that yield desirable results from a language model – to generate harmful content.
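At its simplest, prompt engineering can be thought of as a search over candidate prompts, scored by how well the model's output matches a goal. The sketch below illustrates the idea only: `toy_model` and `score` are hypothetical stand-ins, not part of any real LLM API, and a real attacker or researcher would call an actual model and use a far richer scoring function.

```python
def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for a large language model:
    # in practice this would be an API call to a real model.
    return f"Reply to: {prompt}"

def score(output: str, keyword: str) -> bool:
    # Toy heuristic: does the output steer toward the target keyword?
    return keyword.lower() in output.lower()

def best_prompt(candidates: list[str], keyword: str) -> str:
    # Try each candidate prompt and keep the one whose output scores highest.
    return max(candidates, key=lambda p: score(toy_model(p), keyword))

candidates = [
    "Write an email.",
    "Write an urgent email about an invoice.",
]
print(best_prompt(candidates, "invoice"))
```

The same loop – generate, score, refine – underlies both legitimate prompt optimisation and the abusive uses the researchers describe, such as tuning a spear-phishing message until it reads convincingly.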
These experiments covered phishing and spear-phishing, harassment, social validation for scams, the appropriation of a written style, the creation of deliberately divisive opinions, using the models to create prompts for malicious text, and fake news.
The results led the researchers to conclude that prompt engineering will develop into a discipline in its own right, and that adversaries will develop capabilities enabled by large language models in ways that are hard to predict. They also found that identifying malicious or abusive content will become harder for platform providers, and that large language models already give criminals the means to make the targeted communications in an attack more effective.

"We'll need mechanisms to identify malicious content generated by large language models," the WithSecure researchers said. "One step towards the goal would be to identify that content was generated by those models. However, that alone would not be sufficient, given that large language models will also be used to generate legitimate content."