Advai: Operational Boundaries Calibration for AI Systems via Adversarial Robustness Techniques
Case study from Advai.
Background & Description
For AI systems to be deployed safely and effectively in enterprise environments, there must be a solid understanding of their fault tolerances, established through adversarial stress-testing.
Our stress-testing tools identify vulnerabilities in two broad categories of AI failure:
- Natural, human-meaningful vulnerabilities encompass failure modes that a human could hypothesise, e.g. a computer vision system struggling with a skewed, foggy, or rotated image (see the first sketch below).
- Adversarial vulnerabilities pinpoint where minor yet unexpected parameter variations can induce failure. These vulnerabilities not only reveal potential attack vectors but also signal broader system fragility. It's worth noting that the methods for detecting adversarial vulnerabilities can often reveal natural failure modes, too.
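The following is a minimal, illustrative sketch (not Advai's actual tooling) of probing natural, human-meaningful vulnerabilities: applying increasingly severe rotation or blur to images and measuring how accuracy degrades. The model, images and labels names are placeholders assumed to exist.

```python
# Illustrative sketch only (placeholder model/images/labels, not Advai's tooling):
# probe natural, human-meaningful failure modes by applying rotation and blur,
# then measure how accuracy degrades as the corruption becomes more severe.
import torch
import torchvision.transforms.functional as TF

def accuracy_under_corruption(model, images, labels, angle=0.0, blur_sigma=0.0):
    """Return accuracy on a batch after rotating and/or blurring the images."""
    corrupted = TF.rotate(images, angle)
    if blur_sigma > 0:
        corrupted = TF.gaussian_blur(corrupted, kernel_size=5, sigma=blur_sigma)
    with torch.no_grad():
        preds = model(corrupted).argmax(dim=1)
    return (preds == labels).float().mean().item()

# Example sweep over increasingly severe, human-interpretable corruptions:
# for angle in [0, 15, 30, 45, 90]:
#     print(angle, accuracy_under_corruption(model, images, labels, angle=angle))
```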
The process begins with "jailbreaking" AI models, a metaphor for stress-testing them to uncover hidden flaws. This involves presenting the system with a range of adversarial inputs to identify the points at which the AI fails or responds in unintended ways. These adversarial inputs are crafted using state-of-the-art techniques that simulate potential real-world attacks or unexpected inputs that the system may encounter.
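As an illustration only, the sketch below uses one widely known adversarial technique, the fast gradient sign method (FGSM), to craft a small perturbation that nudges an input towards misclassification; this case study does not specify which attack methods Advai's suite actually uses.

```python
# Sketch of the fast gradient sign method (FGSM), a standard adversarial
# technique shown here purely to illustrate "crafted" inputs; the attack
# methods Advai actually uses are not specified in this case study.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=0.01):
    """Return inputs nudged in the direction that most increases the loss."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # A small change, often imperceptible to a human, that can flip the
    # model's prediction and so expose adversarial fragility.
    return (images + epsilon * images.grad.sign()).detach()
```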
Advai's adversarial robustness framework then defines a model's operational limits: the points beyond which a system is likely to fail. This use case captures our approach to calibrating the operational use of AI systems according to their points of failure.
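A hedged sketch of how such failure points might be turned into an operational boundary is shown below: sweep a severity parameter, find the largest value at which the model still meets a minimum accuracy requirement, and record that value as the operating limit. The 90% accuracy threshold and the severity scale are illustrative assumptions, not figures from this case study.

```python
# Hedged sketch of boundary calibration: find the largest corruption severity
# at which the model still meets a minimum accuracy requirement, and record
# that severity as the system's operational limit. Threshold is illustrative.
def calibrate_boundary(evaluate, severities, min_accuracy=0.90):
    """evaluate(severity) -> accuracy; returns the last severity that passes."""
    safe_limit = None
    for severity in sorted(severities):
        if evaluate(severity) >= min_accuracy:
            safe_limit = severity
        else:
            break  # first failure point: anything beyond is out of scope
    return safe_limit

# e.g. boundary = calibrate_boundary(
#     lambda a: accuracy_under_corruption(model, images, labels, angle=a),
#     severities=[0, 10, 20, 30, 45, 60])
```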
How this technique applies to the AI White Paper Regulatory Principles
Safety, Security & Robustness
Proactive adversarial testing pushes AI systems to their limits, ensuring that safety margins are understood. This contributes to an organisation's ability to calibrate its use of AI systems within safe and secure parameters.
Appropriate Transparency & Explainability
Pinpointing the precise causes of failure is an exercise in explainability. The adversarial approach teases out errors in AI decision-making, promoting transparency and helping stakeholders understand how AI conclusions are reached.
Fairness
The framework is designed to align model use with organisational objectives. After all, 'AI failure' is by nature a deviation from an organisational objective. These objectives naturally include fairness-related criteria, such as preventing model bias and promoting equitable outcomes.
Accountability & Governance
Attacks are designed to discover key points of failure, and this information arms the managers responsible for overseeing those models with the ability to make better deployment decisions. Assigning an individual manager responsibility for defining suitable operational parameters therefore improves governance. The adversarial findings and automated documentation of system use also create an auditable trail.
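As an assumed illustration of such an auditable trail (not a description of Advai's product), a deployment wrapper might log every request against the calibrated operational boundary and refuse requests that fall outside it:

```python
# Assumed illustration of an auditable trail: log each request against the
# calibrated operational boundary and refuse requests that fall outside it.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(filename="model_usage_audit.log", level=logging.INFO)

def audited_predict(model_fn, input_severity, boundary, request_id):
    """Run model_fn() only if the input is within the calibrated boundary."""
    in_scope = input_severity <= boundary
    logging.info(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "input_severity": input_severity,
        "within_operational_boundary": in_scope,
    }))
    if not in_scope:
        raise ValueError("Input outside calibrated operational boundary")
    return model_fn()
```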
Why we took this approach
Adversarial robustness testing is the gold standard for stress-testing AI systems in a controlled and empirical manner. It not only exposes potential weaknesses but also confirms the precise conditions under which the AI system can be expected to perform unreliably, guiding the formulation of precise operational boundaries.
Benefits to the organisation using the technique
- Enhanced predictability and reliability of AI systems that are used within their operational scope, leading to increased trust from users and stakeholders.
- A more objective risk profile that can be communicated across the organisation, helping technical and non-technical stakeholders align on organisational need and model deployment decisions.
- Empowerment of the organisation to enforce an AI posture that meets industry regulations and ethical standards through informed boundary-setting.
Limitations of the approach
- While adversarial testing is thorough, it is not exhaustive and might not account for every conceivable scenario, especially under rapidly evolving conditions.
- The process requires expert knowledge and continuous re-evaluation to keep pace with technological advancements and emerging threat landscapes.
- Internal expertise is needed to match the failure induced by adversarial methods with the organisation's appetite for risk in a given use-case.
- There is a trade-off between the restrictiveness of operational boundaries and the AI's ability to learn and adapt; overly strict boundaries may inhibit the system's growth and responsiveness to new data.
Further Links (including relevant standards)
Further AI Assurance Information
- For more information about other techniques visit the CDEI Portfolio of AI Assurance Tools: /ai-assurance-techniques
- For more information on relevant standards visit the AI Standards Hub: