Agent Hijacking
What Is Agent Hijacking?
As the digital asset ecosystem moves toward the use of agentic AI, autonomous software entities are increasingly granted the authority to move value and interact with smart contracts. Agent hijacking occurs when an unauthorized party gains control over these entities. Unlike a traditional wallet hack where a private key is stolen, agent hijacking often involves manipulating the agent’s reasoning or execution environment to force it into making harmful decisions.
This threat is particularly significant because autonomous agents are designed to operate without constant human supervision. If an agent is hijacked, it can execute thousands of malicious transactions in the time it takes for a human owner to notice a discrepancy. The goal of the attacker is typically to drain the agent’s connected wallet or to use the agent’s reputation and permissions to facilitate larger exploits within the machine-to-machine economy.
How Agent Hijacking Works
Attackers utilize several vectors to compromise an autonomous agent. One common method is prompt injection, where a malicious actor feeds the agent specifically crafted data that overrides its original instructions. For example, an agent designed to find the best yield might be tricked by an injection attack into treating a fraudulent high-risk protocol as safe and optimal.
Another vector is the compromise of the agent’s execution environment. Since agents often run on cloud servers or internet-connected devices, they are vulnerable to standard software exploits. If an attacker gains access to the server, they can intercept the agent’s session keys or replace the agent’s logic with their own. Because these agents often use delegated authority to sign for small transactions, the hijacker can use those pre-approved permissions to move funds to an external address until the session expires or the budget set by a policy engine is exhausted.
Defending Against Hijacking
While automation provides efficiency, the ultimate authority must remain with a human owner using a signer (hardware wallet). In addition, by maintaining your primary wealth in an account fully backed by your signer and only delegating limited permissions to an agent, you ensure that a hijacked agent can never access your entire portfolio.