LLMs Agents with OpenAI's GPT-4 Can Now Hack Your Website - How to Protect Your Website?

18 Feb 2024

A recent research paper titled "LLM Agents can Autonomously Hack Websites" revealed a surprising proficiency in executing complex cyber-attacks without human guidance. Imagine a world where AI doesn't just write essays or compose music but also picks digital locks with the finesse of a seasoned hacker. Sounds like a sci-fi thriller, doesn't it? Yet, here we are.

How?

Reaserchers used LLMs agents coupled with different AI models. LLM agents are applications that can execute complex tasks. When building LLM agents, an LLM serves as the main controller or "brain" that controls a flow of operations needed to complete a task. Picture an orchestra where GPT-4 is the conductor, guiding a symphony of operations to exploit vulnerabilities with a success rate that would make any black hat tip their hat in respect.

Here, for a given attack, the LLM agent will take several actions such as navigates web pages, interacts with elements, and even reads and utilizes documents related to web security to inform how to attack a website. For example, the SQL union attack requires (on average) 44.3 actions.

Some of those LLM agents (GPT-4) will attempt one attack, realizes it does not work, backtracks, and performs another attack.

Results

GPT-4's hacking abilities on sandboxed (in a controlled environment) websites designed to mirror real-world vulnerabilities. GPT-4 successfully hacked 73.3% of these test vulnerabilities (with 5 tries). This success rate starkly contrasts with GPT-3.5 and other open-source models such as Mistral or LLaMA-2:

Agent	5 tries success rate	Overall success rate
GPT-4	73.3%	42.7%
GPT-3.5	6.7%	2.7%
OpenHermes-2.5-Mistral-7B	0.0%	0.0%
LLaMA-2 Chat (70B)	0.0%	0.0%
LLaMA-2 Chat (13B)	0.0%	0.0%
LLaMA-2 Chat (7B)	0.0%	0.0%
Mixtral-8x7B Instruct	0.0%	0.0%
Mistral (7B) Instruct v0.2	0.0%	0.0%
Nous Hermes-2 Yi (34B)	0.0%	0.0%
OpenChat 3.5	0.0%	0.0%

And here the success rate per type of Vulnerability for GPT-4 and 3.5:

Vulnerability	GPT-4 success rate	GPT-4 3.5 detection rate
SQL Injection	100%	100%
CSRF	100%	60%
XSS	80%	40%
Brute Force	80%	60%
File upload	40%	80%
SQL Union	80%	0%
LFI	60%	40%
SSTI	40%	0%
Webhook XSS	20%	0%
Hard SQL union	20%	0%
SSRF	20%	0%
Authorization bypass	0%	0%
Javascript attacks	0%	0%
Hard SQL injection	0%	0%
XSS + CSRF	0%	0%

How about real world websites? However, in real-world testing on 50 seemingly unmaintained websites, GPT-4's success was limited to finding an XSS vulnerability on just one site. This suggests that while the AI's capabilities are noteworthy, they're not yet omnipotent in overcoming well-maintained defenses.

Cost Analysis

The cost analysis within the study brings an economic perspective, highlighting the viability of AI in conducting cyber-attacks. Deploying GPT-4 costs approximately $9.81, starkly less than the $80 estimated for a human cybersecurity analyst to perform similar tasks. This cost-effectiveness could unfortunately make such AI a tool for malicious use if left unchecked.

How to protect yourself from LLMs attacks?

Use Web Application Firewalls: A Web Application Firewall (WAF) can act as a gatekeeper against SQL Injections, XSS, and CSRF attacks by filtering out malicious data before it reaches your website. Given GPT-4's high success rate in these areas, a WAF provides an essential layer of defense by blocking known attack vectors.
Update and Patch Regularly: Vulnerabilities like SQL Injection and Cross-Site Scripting (XSS) often exploit outdated software. By ensuring that all components of your web application are up-to-date, you can close the gaps that attackers, including sophisticated LLM agents like GPT-4, use to penetrate systems.
Implement Secure Coding Practices: Secure coding practices directly counteract techniques used in SQL Union, LFI (Local File Inclusion), and SSTI (Server-Side Template Injection) attacks. By validating user inputs and employing secure coding standards, you can significantly reduce the attack surface available to LLM agents, thereby preventing many of the automated attacks from succeeding.
Conduct Regular Security Audits and Penetration Testing: Regular security audits and penetration tests can uncover vulnerabilities that might be exploited by Brute Force attacks or File Upload vulnerabilities. These proactive measures ensure that potential security gaps are identified and remediated before they can be exploited by autonomous systems like GPT-4, which demonstrated capabilities in these specific attack vectors.
Implement Honeypots as Early Detection Traps: Deploy strategically honeypots in your system: Add "fake" vulnerabilities within your website's code or infrastructure. A honeypot mimic real weaknesses but not actually function, acting as bait for LLMs attempting attacks.

While GPT-4's hacking abilities are a testament to AI's evolving capabilities, they also underscore the importance of proactive and robust web security measures. The AI's limited success in real-world scenarios indicates that well-maintained and regularly updated websites remain formidable against such attacks.

Implementing the recommended security measures not only fortifies your digital presence against current AI threats but also prepares you for future advancements in AI-driven cyber-attacks. It's a clear signal to web professionals and developers that in the arms race between cyber defenses and threats, staying informed and vigilant is more crucial than ever.