top of page
Search

Attacks on GenAI

  • Writer: Katarzyna  Celińska
    Katarzyna Celińska
  • 10 hours ago
  • 2 min read

As organizations integrate GenAI across business functions, a new cybersecurity frontier is emerging—prompt-based adversarial attacks. Palo Alto Networks’ report, “Securing GenAI: A Comprehensive Report on Prompt Attacks”, outlines how hasztag#promptinjections can breach guardrails, hijack model goals, and leak sensitive data—all by exploiting the very inputs GenAI thrives on.

 

Top Threat Vectors Identified:

Guardrail Bypass – Models manipulated through persistent or obfuscated prompts to disclose harmful content or violate safeguards

Information Leakage – Sensitive data, including training artifacts or system instructions, can be exfiltrated through cleverly crafted questions

Goal Hijacking – Systems redirected to execute actions contrary to their intended purpose

Infrastructure Attacks – Remote code execution and resource-draining prompts can cripple applications or lead to malware deployment

 

Attack Techniques Include:

☑️ Payload splitting and prompt leakage

☑️ Memory corruption in AI agents

☑️ Obfuscation via cipher/flip/encoding

☑️ Multimodal jailbreaks (image/audio)

☑️ “Storytelling” and social engineering scenarios

 



These methods reveal that natural language is the new attack surface, and conventional controls are not enough.

 

Understanding these threats isn't just helpful—it's urgent. As hasztag#AI becomes embedded into critical systems, from healthcare to financial services, the volume and sophistication of attacks will grow exponentially. The more we rely on GenAI, the more attackers will innovate ways to compromise it.

For IT auditors, cybersecurity specialists, and GRC professionals, it’s now essential to:

Continuously learn about AI-specific attack patterns and defenses

Shift from static audit checklists to dynamic, AI-aware controls

Embed threat detection within every layer of the GenAI architecture

Collaborate with DevSecOps teams to test and stress AI pipelines for prompt-based exploitability

 

Just as security teams use MITRE ATT&CK to understand traditional threats, in the AI landscape, the MITREATLAS matrix is crucial.

 

ATLAS provides a comprehensive framework for:

Understanding how adversaries target AI models and agents

Mapping real-world AI attack scenarios

Designing defensive measures tailored to machine learning environments

 

If you don’t know how prompt attacks work, you won’t know when your GenAI system has been compromised. And if you don’t understand ATLAS, you won’t know what defenses to prioritize.



 
 
 

Comments


Stay in touch

BW ADVISORY sp. z o.o. 

ul. Boczańska 25
03-156 Warszawa
NIP: 525-281-83-52

Privacy policy

  • LinkedIn
  • Youtube
bottom of page