AI cyberattacks: How hackers used AI to automate espionage
Artificial intelligence tools promise to help businesses automate tasks, increase efficiency, and free up employees to do more with less. But what happens when hackers turn general-purpose AI tools toward illegal and potentially dangerous ends?
It’s no secret that cybercriminals can weaponize AI, but until now, AI has mostly played a supporting role in cyberattacks. With recent advances making AI-based tools more capable, however, it’s becoming easier to automate cybercrime almost entirely.
Recently, Anthropic, the maker of Claude, discovered that malicious actors had bypassed its safeguards and used the Claude Code tool to target dozens of organizations.
Here’s the important and scary part: Anthropic estimates that AI carried out 80–90% of the hacking campaign, with humans stepping in only at a handful of critical decision points.
Who carried out the automated AI hacking campaign?
In its recent blog post, Disrupting the first reported AI-orchestrated cyber espionage campaign, Anthropic revealed that a Chinese state-sponsored hacking group had used its tools to attack, and in some cases successfully infiltrate, organizations including “large tech companies, financial institutions, and government agencies” in what it describes as the “first documented case of a large-scale cyberattack executed without substantial human intervention.”
The campaign employed AI agents that can carry out tasks autonomously, which allowed the hackers to dramatically increase the scope and scale of their activities. Anthropic warns that this is likely just the beginning, and that its teams are working hard to prevent future attacks of this nature.
How did hackers trick AI agents into hacking companies?
Much like the robots of science fiction, present-day AI platforms — and the humans who use them — are expected to abide by a code of conduct that spells out what they are not allowed to do. Anthropic’s usage policy includes the following restrictions, among others:
- Do not violate applicable laws or engage in illegal activity
- Do not compromise critical infrastructure
- Do not compromise computer or network systems
So how did the hackers bypass these guardrails? They “jailbroke” Claude by claiming to work for a legitimate cybersecurity firm, convincing the AI that the tasks it was automating were merely for defensive testing purposes.
What tasks did hackers automate using Claude?
According to Anthropic, the malicious actors broke their hacking workflow into smaller, innocuous-seeming tasks, which they then chained together.
Once humans had identified the targets, the automated tasks included:

- Scanning for vulnerabilities
- Identifying high-value databases
- Harvesting usernames and passwords
- Creating backdoors to maintain access
- Exfiltrating sensitive data

AI was even used to document the work as it progressed, keeping the humans behind the cyberattack in the loop.
What are the implications of AI agents being used in cyberattacks?
While AI can certainly lower the barrier to entry for creating useful software (e.g., via “vibe coding”), with the good comes the bad. Hackers who don’t know how to code can now build entire workflows and campaigns, and even write ransomware, through natural-language prompts.
Additionally, because AI agents can act without human intervention, they can carry out attacks autonomously and at scale. According to Anthropic, the attackers in this case saved vast amounts of time by enlisting the help of AI: “at the peak of its attack, the AI made thousands of requests, often multiple per second—an attack speed that would have been, for human hackers, simply impossible to match.”
What’s next in the cybersecurity AI arms race?
While AI will increasingly be used to launch cyberattacks, it will also be deployed on the defensive side. Business security tools are steadily gaining AI functionality that can detect attacks faster than a human analyst could.
AI companies themselves are also working to make it harder for cybercriminals to misuse their tools. Anthropic’s threat intelligence team is responsible for detecting abuse, understanding how it happened, and building defenses so bad actors can’t exploit its systems in the future.
For IT professionals tasked with keeping their companies safe, staying current on the ever-evolving threat landscape is essential — especially as rapidly advancing AI capabilities reshape cybersecurity practices and paradigms.