The UK’s new AI safety body has found that advanced AI models could enhance the capabilities of novices looking to carry out a cyberattack.
Several concerns have been raised in newly published findings from the UK AI Safety Institute’s (AISI) initial research into large language models (LLMs).
In collaboration with cybersecurity consulting firm Trail of Bits, the AISI assessed the extent to which LLMs could make it easier for a novice to carry out a cyberattack.
The AISI found that AI could help novices with some malicious tasks. In one instance, the institute found that an LLM was able to create a “highly convincing social media persona” for a “simulated social network which could hypothetically be used to spread disinformation in a real-world setting.”
The AISI noted that the LLM could have easily scaled this up to thousands of personas with minimal effort.
Advanced AI models can also coach users and give specific troubleshooting advice on harmful applications, the institute found, potentially making them more of a threat than searching the web.
However, although LLMs work faster than web searching, in many instances both provided broadly the same level of information to users, the AISI said.
The AISI also found that it was possible to bypass safeguards for LLMs by using basic prompts.
“Using basic prompting techniques, users were able to successfully break the LLM’s safeguards immediately, obtaining assistance for a dual-use task,” according to the AISI.
More sophisticated jailbreaking techniques took only a couple of hours and could be carried out by “relatively low-skilled actors,” according to the findings.
“In some cases, such techniques were not even necessary as safeguards did not trigger when seeking out harmful information,” the AISI added.