As AI engineers, our algorithms have the power to shape the social structures in which our users exist. By making some processes easier, we encourage patterns of behaviour that come to influence how people interact and interpret their worlds. All algorithms are a part of the social context in which they act, but the most complex are those that feed back information from the environment in an active process of co-shaping. We believe that, because of that impact, we have a deep responsibility to incorporate socio-technical analysis into how we think about our work.
If algorithms make meaningful decisions then their impact must be of consequence. So, how do we best understand those consequences?
One of the challenges of socio-technical analysis is that it requires a kind of 'throughline' between the abstract sociological and the mechanics of the technological. To overcome that, we are using and building our product so that we can talk more deeply about these problems. The basic idea is that we can use information known to us during testing, training and developing to extract the data needed for sociological analysis. Because it is generated from our code, our analysis is inherently connected to (i.e. generated by) the technology of our learning pipeline.
You can think of our work as playing the role of the hyphen in socio---decoded.ai---technical.
A lot of the pages here contain embeddings of live 'decodings' that help us show you what we're talking about.
Last modified: 2022-11-07
AI is a technology cluster; but it is also a cultural practice of building trust in autonomous systems. It is unique because it responds to and impacts people in an active process of co-shaping: machines learn from us and we learn from them. So long as there are gaps in our understanding of how to intentionally wield that process, AI is more likely to shape us in unexpected and uncontrolled ways.
How do we build AI worthy of our trust and what do we risk when we trust AI?
This interaction is the crux of AI. To master it, we need to deeply understand the machinations of the machines that we build as they interact with societal structures. But more importantly, we have a social responsibility to control that process in defined, challengeable and open ways to avoid encouraging negative patterns of behaviour.
Over time, our users develop patterns of reliance through interplay with our system. Concurrently, our system learns a 'better' internal policy based on a pre-selected objective that we have encoded. These two actors feed information and respond to each other in an active process of co-shaping that converges towards... something.
The pace and direction of that convergence is intrinsically linked to their trust in our system. Consequently, the degree of control that we can exert over that process is directly connected to the degree of trust that a person has in our technology.
Consider a chatbot for at-home pain assessment that asks:
💬 On a scale of 1-10, describe your pain. If the pain is a 9 or 10, contact emergency services.
If our users trust that system, they might:
- self-censor at the clinic by saying that their pain is less than 9 (or else they would have called for an ambulance)
- avoid going to the clinic entirely depending on the pain assessment
Our chatbot is simultaneously adapting itself towards whatever we have encoded as 'better'. So, how do we assess the pace and direction of co-shaping between our chatbot and our users? Where will it be in three months after tens of thousands of people have interacted with it?
The pinnacle of trust is reliance. When a person truly trusts something, they will rely on it to do something of relative importance. Trust matters because it is deeply connected with adoption.
Trust is critical in determining how quickly, to what extent and in what direction that co-shaping leads us. Yet trust is difficult to engineer because it emphasises an intensely personal experience of interrogation and imagination rather than a measurable characteristic of a system.
Trust is the extent to which a person commits to a narrative of reliance. Trustworthiness refers to all of the things that we do to make that narrative easy to construct and tell.
What makes a system worthy of trust?
We can take a limited view of trustworthiness through three modes of socio-technical analysis:
- Trustworthy by Knowledge of Limitation
- Trustworthy by Proxy of Authority
- Trustworthy by Resistance to Corruption
Written in reaction to a series of conversations.
Conversations about Power & AI can produce a unique kind of chaos that we tend to avoid. For example:
Should AI engineers be uniquely responsible for irresponsible AI?
I think that this is the most divisive question in Trustworthy AI but I care about it because it forces us to discuss the role of power in AI/ethics.
Often someone will say that AI/ethics is the collective responsibility of everyone who touches it, so:
(a) no-one is really responsible because everyone is responsible.
(b) it's all relative, so no-one can really say.
(c) what is the responsibility of each person in that chain?
We need to rebel against (a)/(b) because they diffuse responsibility amongst an amorphous group yet, in my experience, these are the options that most of us take.
I believe that the pressure to reject the idea of unique responsibility is a reaction to the looming argument about the unique position of an AI engineer as a potential 'crumple zone'.
Toby Walsh (Machines Behaving Badly), for example, writes that:
power does not trump ethics… focusing on power rather than ethical concerns brings risks [like] alienating friendly voices within those power structures.
That sense of alienation is a full-stop in a conversation and such a position on power/ethics immediately yields problems.
Walsh, for instance, goes on to write that it is important to recognise that “there is no universal set of values with which we need to align our system [and that] it is not a simple dichotomy”.
Clearly, to navigate that kind of complexity, we need to think of AI ethics as a kind of discourse about a system frozen at a point in time.
... we should think of [AI ethics] as an activity: a type of reasoning... about what the world ought to look like. ... some ethical arguments about AI may ultimately be more persuasive than others... [it is] indeed constructive for AI ethics to welcome value pluralism
Annette Zimmermann, Bendert Zevenbergen, Freedom to Tinker (Princeton)
Yet such discourse is itself an exercise of power, and persuasion is its outcome!
When we retreat from a conversation of power in AI ethics we find ourselves caught in paradoxes and confusion. We have a desperate need to engage in an understanding of discursive power in AI ethics and yet a fundamental problem that these discussions alienate and divide.
So long as we choose full-stops we will trace out the outer limits of power structures in AI, inadvertently amplifying their significance in our aversion to it.
To progress Trustworthy AI, we need to re-invite a discussion of power back into the conversation.
A Brief Note on 'Power'
I see power as the totality of externality that coerces, denies and imposes. To be clear, I do not find the construction of power as an attribute compelling.
A Brief Note on 'AI'
Rather than a technology cluster, I interpret 'AI' as a cultural practice of ceding our decision-making capacity to autonomous systems that co-shape us.
Challenging conversations in AI ethics begin to make sense when we recognise that AI uniquely projects and encodes power unlike other highly-available and commercial technologies:
Power is exercised when [we] participate in the making of a decision that affects [someone else]… but power is also exercised [to the extent that someone else] is prevented… from bringing to the fore any issues… detrimental to [our] set of preferences.
Bachrach and Baratz (1962), edited
Historically, our exercise of power through decision-making has been tempered by physical constraints. AI is fearsome because it can impose our decisions at massive scale and at micro-granularity in unforeseen and chaotic contexts.
So long as we are actively encoding our capacity to coerce and deny into such a technology, we have a unique responsibility to understand how power refracts through our work.
When we talk about AI ethics we are exercising our own discursive power to coerce the development of an AI system towards our own preferences.
If we deny the role of power in AI, then we freeze the ethical discourse at the point where power becomes an unavoidable part of the conversation and exclude others from participating in the management of power in AI.
...focusing on power rather than ethical concerns brings risks [like] alienating friendly voices within those power structures
The only people who can talk about power without talking about it are those who benefit from power structures. By forcing any discussion of power to be a point of alienation, we are reinforcing existing power structures that exacerbate historical inequities.
Knowledge holds an intrinsic power-effect within the formative struggles of discussions about AI ethics and those with knowledge are more likely to shape outcomes to their own preferences.
... to understand the algorithms that go into developing the AI technology... we felt that we needed to work closely with... the technical experts that [are] actually developing the AI project... so that we can have them interpret for us... whether the algorithms are meeting the outcomes that were expected
Participant in Comptroller-General Forum on AI Oversight, Page 76, GAO-21-519SP Artificial Intelligence Accountability Framework
If AI ethics is a formative struggle by which we arrive at more responsible AI, then the prototypical 'AI Engineer' is characterised as a person who holds unique knowledge about an AI system.
In practice, their knowledge is persuasive and accords them a special role as a kind of arbiter of truth about the system.
Should AI engineers be responsible for irresponsible AI?
Conversations about accountability in AI introduce an uncomfortable tension because we recognise that the power embedded in the unique knowledge of the 'AI Engineer' accords them a unique responsibility.
The 'AI Engineer', then, has a role not only in the mechanical process of encoding decision-making power into an 'AI' system but also in the discursive process of ethical reasoning.
So long as the knowledge of AI systems remains predominantly their domain, AI engineers are likely to be burdened with an enormous responsibility.
Our goal is to make their knowledge about a system more accessible to more participants at scale so that both the AI engineer and participants in the discursive process of ethical reasoning are unburdened.
GAO-21-519SP is an accountability framework for Artificial Intelligence:
To help managers ensure accountability and responsible use of artificial intelligence (AI) in government programs and processes, GAO developed an AI accountability framework. This framework is organized around four complementary principles, which address governance, data, performance, and monitoring. For each principle, the framework describes key practices for federal agencies and other entities that are considering, selecting, and implementing AI systems. Each practice includes a set of questions for entities, auditors, and third-party assessors to consider, as well as procedures for auditors and third-party assessors.
Principle 1 frames risk minimisation in AI systems as a management-led process spread across organisational and system levels. These processes typically emphasise engagement between teams with a high degree of mechanical autonomy guided by appropriate documentation.
At the same time, many audit procedures in Principle 1 require specificity, measurability and repeatability in the mechanical application of those high-level designs.
Extract, Page 31:
"This documentation process begins in the machine learning system design and set up stage, including system framing and high-level objective design".
Extract, Page 32:
"Review goals and objectives to each AI system to assess whether they are specific, measurable and ... clearly define what is to be achieved, who is to achieve it, how it will be achieved, and the time frames"
The tension between these two 'levels' of framing can frustrate the audit process.
Define clear goals and objectives for the AI system to ensure intended outcomes are achieved
The design and implementation plan of an AI system should naturally give rise to a suite of acceptance tests that consider the system a black-box. Those tests should initially fail and the project should not be considered 'completed' unless those tests are passed (accepted).
Say our implementation is using a neural net:
    import numpy as np

    # np.ndarray is the array type; np.array is the constructor function.
    VectorType = np.ndarray
    TensorType = np.ndarray


    class ANN:
        def forward(self, X: TensorType) -> VectorType:
            """Operates on an input to generate the intended output."""


    def test_accuracy():
        """The model should be accurate on the test set."""


    def test_real_time():
        """The model should meet performance requirements for a 'real time' system."""
Acceptance tests encode both our goals and objectives as well as tell us about the purpose of the system. Remember, engineers will optimise towards these tests so it is important to assess whether these tests are reliable when over-optimised to the extreme.
Decoded.AI can help frame acceptance tests in language familiar to GAO-21-519SP as well as unpack how metrics are computed so that their implications are better understood.
We start by adding some instrumentation into the code that helps us measure and understand the computation. For example, functions like test_accuracy are instrumented as acceptance tests based on the test_ prefix! When it runs, our system collects that information and transforms it into micro-frontends like the one above.
These are five (5) of the questions to consider:
What goals and objectives does the entity expect to achieve by designing, developing, and/or deploying the AI system?
The goal is to build a model that is accurate on some kind of task with an inference speed on a CPU of less than 200 ms and with no more than 20 GB of RAM.
The stated goals are specific, measurable and clear. Whilst they specify what is to be achieved, they do not say how, who is to achieve it or within what time frame.
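As a rough illustration, the latency and memory parts of that goal could be written as a measurable acceptance test along the following lines. This is a sketch under stated assumptions: `StandInModel` is a hypothetical placeholder for the real network, the input batch is arbitrary, and `tracemalloc` only reports memory allocated through Python, so a real check against a RAM limit would likely need process-level measurement instead.

```python
import time
import tracemalloc

LATENCY_BUDGET_S = 0.200           # "less than 200 ms" from the stated goal
MEMORY_BUDGET_BYTES = 20 * 10**9   # "no more than 20 GB of RAM" from the stated goal


class StandInModel:
    """Hypothetical placeholder for the ANN; returns one number per input row."""

    def forward(self, x):
        return [sum(row) for row in x]


def worst_latency(model, x, repeats=20):
    """Return the slowest single forward pass, in seconds, over `repeats` runs."""
    worst = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        model.forward(x)
        worst = max(worst, time.perf_counter() - start)
    return worst


def peak_python_memory(model, x):
    """Return the peak memory (bytes) that Python allocated during one forward pass."""
    tracemalloc.start()
    model.forward(x)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def test_real_time():
    """The model should meet the latency and memory budget on a CPU."""
    model = StandInModel()
    batch = [[0.0] * 16 for _ in range(8)]  # arbitrary illustrative input
    assert worst_latency(model, batch) < LATENCY_BUDGET_S
    assert peak_python_memory(model, batch) <= MEMORY_BUDGET_BYTES
```

Latency and memory are measured in separate passes because tracing allocations slows the code down and would inflate the timing.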
To what extent do stated goals and objectives represent a balanced set of priorities and adequately reflect stated values?
The stated goals and objectives are unlikely to represent a balanced set of priorities. They are disproportionately weighted towards engineering considerations such as accuracy.
To what extent does the entity communicate its AI strategic goals and objectives to the community of stakeholders?
Stated goals and objectives are accessible through a web browser.
To what extent does the entity consistently measure progress towards stated goals and objectives?
Stated goals and objectives are consistently measured.
One of the benefits of using an RAI tool like Decoded.AI is that you get things like communicating with stakeholders out-of-the-box.
It's good to know as early as possible in development when a project is struggling to meet Responsible AI practices whilst systems, practices and ideas are still open to change.
To what extent does the entity have the necessary resources—funds, personnel, technologies, and time frames—to achieve the goals and objectives outlined for designing, developing and deploying the AI system?
One of the hardest parts of robustly developing an AI system is understanding how cost pressures will force changes in an otherwise well-designed plan. For AI developers, things like waiting for feedback or eagerly optimising for scaling problems can generate an intense pressure on the project lifecycle that is difficult to predict early on in development. One way to support that is to understand the cost structure of a computation by modelling future changes. If we have a rough estimate for the time taken for an attempt, how much an attempt costs (compute + engineering hours) and some idea of how many attempts might be necessary then we can roughly predict whether a project is within our capacity. We can also predict how likely future scaling challenges are to disrupt project trajectory.
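The back-of-the-envelope estimate described above can be sketched in a few lines. Everything here is an assumption made for illustration: the `AttemptEstimate` class, its field names, the linear scaling rule in `scaled`, and the example figures are hypothetical and are not drawn from a real project.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AttemptEstimate:
    hours_per_attempt: float           # wall-clock time for one attempt
    compute_cost_per_attempt: float    # compute spend for one attempt
    engineer_hours_per_attempt: float  # hands-on engineering time for one attempt
    engineer_hourly_rate: float        # cost of one engineering hour
    expected_attempts: int             # rough guess at how many attempts are needed

    def cost_per_attempt(self) -> float:
        """Compute cost plus engineering cost for a single attempt."""
        return (self.compute_cost_per_attempt
                + self.engineer_hours_per_attempt * self.engineer_hourly_rate)

    def total_cost(self) -> float:
        return self.expected_attempts * self.cost_per_attempt()

    def total_hours(self) -> float:
        return self.expected_attempts * self.hours_per_attempt

    def scaled(self, factor: float) -> "AttemptEstimate":
        """Assume time and compute grow linearly with problem size (a simplification)."""
        return replace(
            self,
            hours_per_attempt=self.hours_per_attempt * factor,
            compute_cost_per_attempt=self.compute_cost_per_attempt * factor,
        )


def within_capacity(est: AttemptEstimate, budget: float, deadline_hours: float) -> bool:
    """True if the estimate fits inside both the budget and the time available."""
    return est.total_cost() <= budget and est.total_hours() <= deadline_hours


# Illustrative figures only.
estimate = AttemptEstimate(
    hours_per_attempt=6.0,
    compute_cost_per_attempt=40.0,
    engineer_hours_per_attempt=2.0,
    engineer_hourly_rate=100.0,
    expected_attempts=25,
)
fits_now = within_capacity(estimate, budget=10_000.0, deadline_hours=200.0)
fits_at_4x = within_capacity(estimate.scaled(4.0), budget=10_000.0, deadline_hours=200.0)
```

With these figures the project fits today (6,000 in cost over 150 hours) but not at four times the scale, where the cost of 9,000 still fits the budget while the 600 hours needed exceed the 200 available. In this example the constraint most exposed to scaling is time rather than money.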