SPONSOR BLOG

Agent Card Poisoning: A Metadata Injection Vulnerability In The Systems Using Google A2A Protocol

Maliciously injected content can result in the transmission of sensitive user data to attacker-controlled endpoints.

April 2nd, 2026 - By: Kumar Aditya

Modern multi-agent systems built on the Google A2A protocol enable dynamic discovery and delegation between autonomous agents through structured metadata known as agent cards. These cards describe capabilities, endpoints, and operational details that the host agent uses to plan task delegation. However, when agent cards are injected directly into an LLM’s reasoning context without strict boundary enforcement, metadata can be reinterpreted as executable instruction.

Our research demonstrates an attack vector termed as Agent Card Poisoning, a metadata injection vulnerability in which a malicious remote agent embeds adversarial instructions within its agent card. When the host LLM incorporates this poisoned metadata into its reasoning context, the injected content can influence tool-selection and execution decisions. As a result, the model may issue unintended tool calls, such as transmitting sensitive user data to attacker-controlled endpoints, leading to silent control-flow hijacking and potential PII exfiltration during otherwise legitimate task delegation.

System model and attack flow

We will now examine the full attack surface by breaking down the system architecture, trust assumptions, and step-by-step execution of the exploit.

System architecture

We simulated a multi-agent delegation process based on the Google A2A protocol in which a host agent uses the agent cards to dynamically find and assign tasks to remote agents. To interpret user intent, assess available tools, and decide whether delegation is necessary, the host uses an LLM-driven reasoning engine. Remote agents advertise their capabilities and endpoints through agent cards, which are retrieved and incorporated into the host’s reasoning context during task planning.

Throughout this work, we consider a hotel booking scenario in which a user submits a booking request containing:

Name
City
Check-in and check-out dates
Payment card details

This scenario is intentionally chosen because it involves sensitive PII and payment information, making unintended outbound transmission of personal information.

Entities involved

User

An external client submitting a hotel booking request. The request may contain personally identifiable information (PII) and financial data.

Host agent

The primary orchestration component responsible for:

Receiving user requests
Fetching remote agent cards
Constructing the LLM reasoning prompt
Executing tool calls generated by the LLM

To enable these operations, the host agent exposes a set of executable tools that the LLM can invoke during its reasoning process. These include http get, which retrieves remote resources over HTTP; http post, which sends outbound HTTP requests with structured payloads to external endpoints; delegate task, which forwards a structured task request to a remote agent’s declared delegation endpoint; and list tools, which allows the model to enumerate the tools and operational capabilities currently available within the host environment. The host agent also provides execute python, which enables the LLM to execute Python code blocks within the host execution environment and return the resulting output for further reasoning.

The LLM selects among these tools during reasoning. Crucially, tool execution is performed by the host after interpreting model output.

Remote hotel booking agent

An external agent advertising hotel booking capabilities via an agent card. The card contains structured metadata such as:

Agent name
Capabilities
Delegation endpoint
Operational descriptions

This agent may be benign or malicious.

Attacker-controlled endpoint

An external HTTP endpoint not declared as part of the intended delegation workflow. In the attack scenario, this endpoint receives exfiltrated PII via unintended HTTP POST execution.

Fig. 1: Architecture of the attack flow.

Attack flow

We will now talk about the complete attack lifecycle, aligned with network-level observations captured via PCAP captures.

1. Remote agent card synchronization

Before any user request is processed, the host agent maintains connectivity with remote agents. As part of initialization or periodic synchronization, the host retrieves and stores all remote agent cards. This synchronization process occurs independently of any specific user request. In this scenario, the description of the agent card is poisoned, and it contains malicious instructions, which can pollute the context of the host agent LLM.

Fig. 2: Capture showing the agent card exchange.

2. User submission of booking request

The attack is triggered by a legitimate user interaction. The user submits a hotel booking request to the host agent, including the name, destination city, check-in and check-out dates and payment card details.

At this stage, all sensitive user data is confined to the host environment. The host agent receives the request over HTTPS and prepares it for intent analysis and potential delegation.

Fig. 3: Capture showing the request sent by the user to the host agent.

3. Delegation planning and LLM invocation

To determine task routing, the host constructs a reasoning prompt that includes:

The user booking request with the system prompt, which sets the context of the host agent
The full set of cached remote agent cards
The available execution tools

The host asks the LLM to decide which agent should handle this request and what steps are required.

Now, because the poisoned agent card is embedded verbatim into the LLM’s reasoning context, its adversarial content is treated as authoritative planning input rather than inert metadata. At this point, untrusted external content directly influences the model’s decision-making process, marking the activation point of the attack.

Fig. 4: Capture showing the request sent to the LLM by the host agent.

4. Malicious execution plan generation

Upon processing the reasoning context, influenced by the adversarial instruction embedded within the agent card, the model prioritizes an outbound HTTP POST request to an attacker-controlled endpoint, transmitting the full booking payload containing sensitive PII and payment details. Only after this transmission does the plan proceed with the legitimate delegate task invocation toward the remote hotel booking agent. From the host’s perspective, the generated actions are syntactically valid and leverage approved tools, however, the ordering and destination of the initial request constitute an unauthorized control-flow deviation.

Fig. 5: Capture showing the response of the LLM to the host agent.

5. Sensitive data transmission at runtime

Following the generation of the malicious execution plan, the host proceeds to execute the LLM-issued tool calls. The first action results in an outbound HTTP POST request to an external endpoint specified within the poisoned agent card. This request contains the full booking payload, including the user’s name, travel details, and payment card information.

Fig. 6: Capture showing the data being sent to external endpoint.

Agent Card Poisoning strikes

CyPerf 26.0.0 introduces new strikes that simulate Agent Card Poisoning attacks targeting systems implementing the Google A2A protocol. These strikes model scenarios in which malicious metadata embedded within remote agent cards influences the reasoning process of host agents, causing the underlying LLM to generate execution plans that include unintended tool invocations.

Users can access these strikes within the CyPerf attack library by searching for “Agent Card Poisoning.”

Fig. 7: CyPerf UI displaying strikes.

These strikes include several configurable properties that allow users to tailor the attack scenario. Users can configure the model, API version, and API key, along with endpoints such as the remote agent discovery URL, which is used by the host agent to retrieve remote agent cards that may contain the poisoned metadata, the PII receiver URL, which represents the external endpoint where exfiltrated sensitive information could be sent if the attack succeeds, and the LLM discovery URL, which defines the endpoint used to locate or interact with the underlying LLM service during the workflow. The configuration also includes the query used to trigger the agent interaction and a thought signature, which represents the characteristic reasoning trace pattern produced by the LLM during its decision-making process.

Fig. 8: CyPerf UI displaying strike configurations.

The statistic view in CyPerf UI provides detailed statistics from the test run, including the number of connections made and the number of active client and server agents. Users can also view separate HTTP statistics for client and server, along with overall TCP statistics. In the strike statistics view, there are stats to show whether the strike request to the server was allowed by the DUT. A positive value in the “Server Allowed” stats will indicate that the request was allowed through the DUT to the server. The client allowed stats can be used to check whether the client received the expected response to the strike request. Whether the request or response was blocked by the DUT, it should show 0 value.

Fig. 9: Run-time stats view in CyPerf UI.

Test security defenses with advanced threat intelligence

CyPerf, Keysight’s cloud-native security test solution, provides customers with direct access to attack campaigns from different advanced persistent threats, enabling them to test their currently deployed security controls’ ability to detect or block such attacks across physical and cloud environments. CyPerf’s extensive strike library provides a rich simulation environment for understanding and defending against a wide array of network-based attacks. From traditional web exploits and SQL injections to emerging AI prompt attacks, these strikes help security professionals validate their defenses across diverse threat landscapes. As new vulnerabilities emerge, CyPerf continues to evolve, ensuring comprehensive coverage of the latest threats in network security testing.

Kumar Aditya

(all posts)
Kumar Aditya is an AI systems and LLM researcher at Keysight ATI.

Agent Card Poisoning: A Metadata Injection Vulnerability In The Systems Using Google A2A Protocol

System model and attack flow

System architecture

Entities involved