The Trust Challenge in AI-Driven Solutions
Many organisations are adopting AI agents to streamline workloads and improve efficiency. However, these agents introduce unique security challenges.
To function effectively, agents require access to data and the ability to perform actions, like a human user. At the same time, their behaviour is not always fully predictable, and decision-making can occur in a “black box”. As agent capabilities increase, so does the potential for unintended, high-impact outcomes.
The table below highlights some of the most common security risks and why they matter to an organisation.
|
Risk Area |
Definition |
Examples of Impacts |
|
Prompt Injection |
Harmful or misleading user inputs prompt to influence agent behaviour |
Scenario: A public-facing agent is manipulated by a user to bypass its safety constraints and initiate an unauthorised funds transfer from a customer account. Impact: Execution of unauthorised financial transactions, causing direct financial loss, potential fraud investigations, and reputational damage with customers and regulators. |
|
Data Exfiltration |
Sensitive data being exposed to users without appropriate access |
Scenario: An employee asks a leave-planning agent to summarise their time off. The agent includes requests and medical details belonging to other employees. Impact: Disclosure of confidential employee and health information, leading to privacy violations, employee relations issues, and potential complaints. |
|
Over-permissioned tool/data access |
Agents having broader access to data or actions than required |
Scenario: An agent designed to summarise database information is also capable of creating or modifying records and begins to alter or generate data unintentionally. Impact: Uncontrolled changes to critical records, creating inconsistencies in business data, disrupting operations, and requiring manual remediation to restore accuracy. |
|
Cross-conversation/agent data leakage |
Information unintentionally leaking between conversations or agents |
Scenario: An agent references sensitive information from a separate user interaction when generating a response. Impact: Blending of user contexts resulting in inappropriate data exposure and incorrect actions and reducing confidence in system outputs. |
|
Hallucinations |
Agents generating incorrect information or taking inappropriate actions |
Scenario: An internal agent generates incorrect or fabricated compliance guidance, which is used to inform user decisions. Impact: Users act on misleading or incorrect guidance, increasing the likelihood of misinformed decisions, policy breaches, and downstream operational or legal consequences. |
|
Training data and Runtime data leakage |
Risks from mixing sensitive business data with training or testing data |
Scenario: Sensitive production data is used in training a public-facing agent, which subsequently exposes this information in responses. Impact: Unintended disclosure of sensitive production data through generated responses, risking external exposure, reputational harm, scrutiny from customers or stakeholders, and breaches in GDPR. |
For organisations to adopt AI agents with confidence, they need assurance that agents will operate securely and consistently within defined boundaries, aligning with internal policies without constant human intervention.
Why Built-in Safeguards Are Only Part of the Solution
AI agent platforms often include built-in safeguards designed to protect company data. Whilst they provide a solid foundation, on their own they are insufficient as they do not offer full security coverage. These safeguards broadly fall into two categories:
- Non-configurable controls: These are always-on protections that cannot be altered. For example, prompt injection protections in Copilot Studio can help reduce well known attack patterns, and tenant-level data isolation protects information from being exposed outside of the organisation.
- Configurable controls: These are adaptable protections that can be customised for each agent. For example, Copilot Studio uses the Power Platform and Dataverse security model where permissions define what an Agent can access and do. This allows organisations to enforce least-privilege access, ensuring Agents only interact with the data and actions required for their role.
Whilst these safeguards aim to mitigate common risks, gaps often remain. Gaps are especially noticeable around organisation or agent-specific requirements. These gaps must be deliberately addressed to provide a secure and trustworthy solution.
How Marra Designs Secure AI Solutions
When delivering a solution, we work closely with project teams to identify all specific risks and requirements. This allows us to identify and address the gaps left by the built-in safeguards.
We apply a set of design principles and controls that ensure agents operate securely within defined boundaries. The following approaches form the foundation that we use to design, build, and govern Agentic solutions.
- Agent design and data mapping: We define agent personas alongside user personas, map end-to-end journeys, and document data lineage to fully understand how Agents interact with users and data. This embeds security by design and enforces enterprise data protection from the outset. It reduces the likelihood of data exfiltration and over-permissioned tools and limits the potential impact if these risks materialise.
- Architecture and capability scoping: We assess current processes and design a future-state architecture that delivers the required outcomes securely. By reducing unnecessary capability and distributing functionality across tightly scoped agents where appropriate, we enforce least privilege access. This directly mitigates over-permissioned tools and helps contain the impact of risks such as prompt injection and hallucinated actions.
- Access and dependency management: We configure downstream services to ensure identity models, permissions, and configurations align with enterprise standards before integration. Using identity-first controls such as Microsoft Entra ID, we enforce least privilege access, so agents and users operate within required data boundaries. By remediating excessive permissions and strengthening authentication we reduce the risk of data exfiltration and over-permissioned access.
- Risk-led design and continuous analysis: We analyse the proposed solution for risks before development begins and continue this assessment throughout delivery, maintaining clear documentation of findings and mitigations. This ensures risks are understood both individually and in combination. It reduces the overall impact of attacks such as prompt injection, where controls often limit consequences rather than just preventing occurrence, and strengthens resilience against cross-user data leakage and data exfiltration.
- Requirements-driven validation and testing: We validate solutions against functional and non-functional requirements through a combination of manual, automated, and user acceptance testing. This includes adversarial inputs, misuse scenarios, and edge cases designed to simulate real-world attacks. This reduces the likelihood of successful exploitation of risks such as prompt injection, hallucinations, and unintended or unauthorised actions.
- Operational readiness and governance: We provide structured documentation, upskill teams during handover, and enable organisations to take ownership of ongoing security. Combined with the use of monitoring and governance tools such as Microsoft Purview and Microsoft Sentinel, this ensures continued visibility and control post-deployment. This supports detection and response to risks including data exfiltration, cross-user data leakage, and emerging or unknown threats, ensuring security is maintained throughout the solution lifecycle.
By combining secure design principles, layered controls, rigorous validation, and ongoing governance, we ensure that the AI agents we develop remain effective, trustworthy and secure.
Key Takeaways for Securing AI Agents
AI agents offer significant opportunities to increase efficiency and automate processes, but introduce new risks around data access, control, and decision-making transparency. Built-in platform safeguards provide a strong starting point but must be enhanced to reflect real-world organisational needs and risk profiles.
A secure AI agent requires intentional design, ensuring agents operate within clearly defined boundaries and with appropriate permissions. Combining governance, data protection, and monitoring capabilities ensures organisations can maintain control and visibility over AI-driven solutions. With the right approach, organisations can unlock the value of AI safely, without compromising security, compliance, or trust.
Get in touch with our business development team to find out how we can help you design and deliver secure AI solutions.
Written by Kyle Anderson, Associate Power Platform Developer