Ensuring Trustworthy AI in Healthcare

Which governance structures should countries implement to facilitate and regulate AI adoption?

Nicole Wheeler
4 min read · Jan 24, 2021

Given the many public examples of AI behaving badly to date, members of the public are concerned about the risks of AI development and do not trust organizations to govern themselves effectively. In response to growing concerns, companies, governments, and potential adopters of AI technologies have described principles they believe should be followed in order to make a product safe and trustworthy. An example of this is the NHS’s “guide to good practice for digital and data-driven health technologies”. However, we need systems in place that go beyond high-level principles to make sure that AI is developed in a responsible way.

Defining standards is key to effective governance and trust. Current guidelines usually agree on high-level principles for safe AI development but disagree over the details of what should be done in practice. For example, transparency is important, but what does transparency look like, and how is it achieved? Is it through open data, open code, explainable predictions…?

Governance principles


To lay out effective governance mechanisms, you first need to define what is important when developing and evaluating a system. The Centre for the Fourth Industrial Revolution at the World Economic Forum put together a set of principles for developing chatbots for health, which provides an excellent example of a well-defined and comprehensive framework (summarized below):

Safety: The device should not cause harm to patients

Efficacy: The device should be tailored to users and provide a proven benefit

Data protection: Data should be collected with consent, safeguarded, and disposed of properly

Human agency: The device should allow for oversight and for freedom of choice by patient and practitioner

Accountability: Device behaviour should be auditable, and an identifiable entity should be responsible for the algorithm’s behaviour

Transparency: Humans must always be aware if they are interacting with an AI, and its limitations should be made clear

Fairness: Training data should be representative of the population, and device behaviour should not be prejudiced against any group

Explainability: Decisions must be explained in an understandable way to intended users

Integrity: Decisions should be based only on reliable, high-quality evidence and data that is ethically sourced and collected for a clearly defined purpose

Inclusiveness: The device should be accessible to all intended users, with particular consideration of excluded/vulnerable groups

- WEF governance of chatbots in healthcare framework, 2020

Governance mechanisms

Different tools can be used to ensure AI is developed responsibly, each targeting different stakeholders and ranging from social norms to laws. The approaches below are an adaptation and expansion of the mechanisms described in this paper:

For developers

  • Guides to good practice: produced by developers, providers, or regulators to establish actions developers can take to build trustworthy AI. Often, however, these guides remain focussed on high-level values and are open to interpretation in terms of execution
  • Algorithmic risk/impact assessments: these are designed to assess the possible societal impacts of an algorithmic system before or after the system is in use
  • Third-party auditing: a structured process by which an organization’s behaviour is assessed for consistency with expected or required behaviour in that industry. This can also involve an objective assessment of the AI’s performance against standard metrics
  • Red-teaming: often performed by dedicated “red teams” that make an effort to find flaws and vulnerabilities in a plan, organization, or system. In the case of medical AI, this could involve adversarial attacks or exploring case studies that may reveal bias (a simple check of this kind is sketched after this list)
  • Bias and safety bounties: these give people outside the organization a method and incentives for raising concerns about specific AI systems in a formalized way
  • Sharing of AI incidents: currently, this is seen mainly in the form of investigative journalism but could be practised more widely by developers and providers themselves, to improve societal understanding of how AI can behave in unexpected or undesired ways
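
To make this kind of red-team or bias-audit exercise concrete, here is a minimal, hypothetical Python sketch that compares a model’s recall across two patient subgroups. The predictions, labels, and group names are illustrative placeholders rather than output from any real system; an actual audit would use held-out clinical data and metrics agreed with the regulator or provider.

```python
# Hypothetical red-team check: compare a model's sensitivity (recall)
# across patient subgroups to flag possible bias. All data below is a
# toy placeholder, not a real evaluation.
from collections import defaultdict

def recall_by_group(y_true, y_pred, groups):
    """Return recall for each subgroup; None if a group has no positive cases."""
    tp = defaultdict(int)  # true positives per group
    fn = defaultdict(int)  # false negatives per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            if pred == 1:
                tp[group] += 1
            else:
                fn[group] += 1
    return {g: tp[g] / (tp[g] + fn[g]) if (tp[g] + fn[g]) else None
            for g in set(groups)}

# Toy predictions from a hypothetical triage model
y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for group, recall in recall_by_group(y_true, y_pred, groups).items():
    print(f"group {group}: recall = {recall:.2f}")
```

A large gap in recall between subgroups is exactly the kind of finding a red team, third-party auditor, or bias bounty participant would escalate for review.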

For the software itself

  • Audit trails: creating a traceable log of the steps taken in developing and testing an AI system, and of its behaviour after deployment (see the sketch after this list)
  • Interpretability: engineering an explanation of the AI’s decision-making into its reporting, which aids understanding and scrutiny of the system’s characteristics
  • Privacy-preserving ML: software features that protect the security of input data, model output, and the model itself
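
As a rough illustration of what an audit trail could record, the sketch below appends timestamped, hashed records of each step to a JSON-lines file. The file name, stage labels, and triage example are assumptions made for illustration; a production system would also capture model and data versions, the requesting user, and tamper-evident storage.

```python
# A minimal sketch of an audit trail, assuming a JSON-lines log file and a
# hypothetical triage-model inference step.
import hashlib
import json
from datetime import datetime, timezone

AUDIT_LOG = "audit_trail.jsonl"  # assumed log location

def log_event(stage, payload):
    """Append one traceable record: timestamp, stage, and a hash of the payload."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "stage": stage,
        "payload_sha256": hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest(),
        "payload": payload,
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(record) + "\n")

# Example: record an inference made by a hypothetical triage model
features = {"age": 54, "symptom": "chest pain"}
prediction = {"risk": "high", "model_version": "demo-0.1"}
log_event("inference_request", features)
log_event("inference_result", prediction)
```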

Other considerations

  • Do we need to pursue AI literacy more deliberately? And with whom: regulators, decision-makers, the public?
  • What legal powers may be missing to enable regulatory inspection of algorithmic systems?
  • How do we bring marginalized and minority groups, who may be most affected by negative effects, into the conversation?
