LLMs Agents with OpenAI's GPT-4 Can Now Hack Your Website - How to Protect Your Website?

A recent research paper titled "LLM Agents can Autonomously Hack Websites" revealed a surprising proficiency in executing complex cyber-attacks without human guidance. Imagine a world where AI doesn't just write essays or compose music but also picks digital locks with the finesse of a seasoned hacker. Sounds like a sci-fi thriller, doesn't it? Yet, here we are.


Reaserchers used LLMs agents coupled with different AI models. LLM agents are applications that can execute complex tasks. When building LLM agents, an LLM serves as the main controller or "brain" that controls a flow of operations needed to complete a task. Picture an orchestra where GPT-4 is the conductor, guiding a symphony of operations to exploit vulnerabilities with a success rate that would make any black hat tip their hat in respect.

Here, for a given attack, the LLM agent will take several actions such as navigates web pages, interacts with elements, and even reads and utilizes documents related to web security to inform how to attack a website. For example, the SQL union attack requires (on average) 44.3 actions.

Some of those LLM agents (GPT-4) will attempt one attack, realizes it does not work, backtracks, and performs another attack.


GPT-4's hacking abilities on sandboxed (in a controlled environment) websites designed to mirror real-world vulnerabilities. GPT-4 successfully hacked 73.3% of these test vulnerabilities (with 5 tries). This success rate starkly contrasts with GPT-3.5 and other open-source models such as Mistral or LLaMA-2:

Agent 5 tries success rate Overall success rate
GPT-4 73.3% 42.7%
GPT-3.5 6.7% 2.7%
OpenHermes-2.5-Mistral-7B 0.0% 0.0%
LLaMA-2 Chat (70B) 0.0% 0.0%
LLaMA-2 Chat (13B) 0.0% 0.0%
LLaMA-2 Chat (7B) 0.0% 0.0%
Mixtral-8x7B Instruct 0.0% 0.0%
Mistral (7B) Instruct v0.2 0.0% 0.0%
Nous Hermes-2 Yi (34B) 0.0% 0.0%
OpenChat 3.5 0.0% 0.0%

And here the success rate per type of Vulnerability for GPT-4 and 3.5:

Vulnerability GPT-4 success rate GPT-4 3.5 detection rate
SQL Injection 100% 100%
CSRF 100% 60%
XSS 80% 40%
Brute Force 80% 60%
File upload 40% 80%
SQL Union 80% 0%
LFI 60% 40%
SSTI 40% 0%
Webhook XSS 20% 0%
Hard SQL union 20% 0%
SSRF 20% 0%
Authorization bypass 0% 0%
Javascript attacks 0% 0%
Hard SQL injection 0% 0%
XSS + CSRF 0% 0%

How about real world websites? However, in real-world testing on 50 seemingly unmaintained websites, GPT-4's success was limited to finding an XSS vulnerability on just one site. This suggests that while the AI's capabilities are noteworthy, they're not yet omnipotent in overcoming well-maintained defenses.

Cost Analysis

The cost analysis within the study brings an economic perspective, highlighting the viability of AI in conducting cyber-attacks. Deploying GPT-4 costs approximately $9.81, starkly less than the $80 estimated for a human cybersecurity analyst to perform similar tasks. This cost-effectiveness could unfortunately make such AI a tool for malicious use if left unchecked.

How to protect yourself from LLMs attacks?

  1. Use Web Application Firewalls: A Web Application Firewall (WAF) can act as a gatekeeper against SQL Injections, XSS, and CSRF attacks by filtering out malicious data before it reaches your website. Given GPT-4's high success rate in these areas, a WAF provides an essential layer of defense by blocking known attack vectors.

  2. Update and Patch Regularly: Vulnerabilities like SQL Injection and Cross-Site Scripting (XSS) often exploit outdated software. By ensuring that all components of your web application are up-to-date, you can close the gaps that attackers, including sophisticated LLM agents like GPT-4, use to penetrate systems.

  3. Implement Secure Coding Practices: Secure coding practices directly counteract techniques used in SQL Union, LFI (Local File Inclusion), and SSTI (Server-Side Template Injection) attacks. By validating user inputs and employing secure coding standards, you can significantly reduce the attack surface available to LLM agents, thereby preventing many of the automated attacks from succeeding.

  4. Conduct Regular Security Audits and Penetration Testing: Regular security audits and penetration tests can uncover vulnerabilities that might be exploited by Brute Force attacks or File Upload vulnerabilities. These proactive measures ensure that potential security gaps are identified and remediated before they can be exploited by autonomous systems like GPT-4, which demonstrated capabilities in these specific attack vectors.

  5. Implement Honeypots as Early Detection Traps: Deploy strategically honeypots in your system: Add "fake" vulnerabilities within your website's code or infrastructure. A honeypot mimic real weaknesses but not actually function, acting as bait for LLMs attempting attacks.

While GPT-4's hacking abilities are a testament to AI's evolving capabilities, they also underscore the importance of proactive and robust web security measures. The AI's limited success in real-world scenarios indicates that well-maintained and regularly updated websites remain formidable against such attacks.

Implementing the recommended security measures not only fortifies your digital presence against current AI threats but also prepares you for future advancements in AI-driven cyber-attacks. It's a clear signal to web professionals and developers that in the arms race between cyber defenses and threats, staying informed and vigilant is more crucial than ever.

You might also be interested in: