SECURITY OR SAFETY?

AI failure or attack? The rule that tells them apart

14 June 2026 · 3 min read

Not every "AI gone wrong" story is a security attack. Telling which is which decides what rulebook you reach for.

The first question

Is there an adversary deliberately attacking an AI system? Run three well-known incidents through it:

  • Air Canada: the chatbot invented a refund policy, and a tribunal held the company liable. No attacker. → reliability & liability.
  • Samsung: engineers pasted confidential code into a public chatbot. No attacker. → data governance.
  • Nomi: a companion bot pushed a vulnerable user toward self-harm. No attacker. → a safety failure, and under the EU AI Act arguably a prohibited practice.

Three incidents, three lenses, none an "attack."

The case that bends the rule

Deepfake fraud: an organized network uses AI-generated videos with stolen faces to sell fake products. Here there IS an adversary, so "attack" feels obvious. Wrong. Adversarial threat modeling (MITRE ATLAS) doesn't ask who was harmed or whether AI was used; it asks whether the AI system itself was the target. The fraudsters used AI as a weapon. They didn't break anyone's AI. → governance (synthetic-media labelling) + criminal fraud.

The rule

Not who's harmed. Not who used AI. Not the medium. It's: was the AI system the target? Using AI ≠ attacking AI. This ties directly to prompt injection and the threat-modeling method.
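
The rule boils down to two yes/no questions, which can be sketched in a few lines of Python. This is a minimal illustration, not a complete taxonomy: the `Incident` type, the `triage` function and the label strings are hypothetical names introduced here, and the prompt-injection row is an assumed example of a case where the AI system itself is the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    name: str
    adversary_present: bool   # did someone act deliberately to cause harm?
    ai_system_targeted: bool  # was the AI system itself what they attacked?


def triage(incident: Incident) -> str:
    """Return a coarse rulebook label (hypothetical labels, for illustration)."""
    if incident.adversary_present and incident.ai_system_targeted:
        # Adversary AND the AI system is the target: adversarial threat modeling.
        return "security: attack on the AI system"
    if incident.adversary_present:
        # Adversary, but AI was only the tool: governance plus criminal law.
        return "misuse: AI used as a weapon"
    # No adversary at all: pick a safety, reliability or data-governance lens.
    return "no attacker: safety, reliability or data governance"


INCIDENTS = [
    Incident("Air Canada chatbot", False, False),
    Incident("Samsung code paste", False, False),
    Incident("Nomi companion bot", False, False),
    Incident("Deepfake fraud network", True, False),
    Incident("Prompt injection", True, True),  # assumed example, see lead-in
]

for incident in INCIDENTS:
    print(f"{incident.name}: {triage(incident)}")
```

The sketch stops at three coarse buckets on purpose. Choosing between safety, reliability and data governance for the "no attacker" cases still takes human judgment.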

Do you know which rulebook to open?

A Shielding Review classifies your AI risks into security / safety / governance — prioritized. Free 45-min session.
