Democratizing AI Through Continuous Adaptability: The Role of DevOps

Below are the abstract and slides for my contribution to the TILTing Perspectives 2024 panel “The mutual shaping of democratic practices & AI,” moderated by Merel Noorman.

Slides

Abstract

Contestability

This presentation delves into democratizing artificial intelligence (AI) systems through contestability. Contestability refers to the ability of AI systems to remain open and responsive to disputes throughout their lifecycle. It approaches AI systems as arenas where groups compete for power over designs and outcomes.

Autonomy, democratic agency, legitimation

We identify contestability as a critical system quality for respecting people’s autonomy. This includes their democratic agency: their ability to legitimate policies, among them policies enacted by AI systems.

For a decision to be legitimate, it must be democratically willed or rely on “normative authority.” The democratic pathway should be constrained by normative bounds to avoid arbitrariness. The appeal to authority should meet the “access constraint,” which ensures citizens can form beliefs about policies with a sufficient degree of agency (Peter, 2020, in Rubel et al., 2021).

Contestability is the quality that ensures mechanisms are in place for subjects to exercise their democratic agency. In the case of an appeal to normative authority, contestability mechanisms are how subjects and their representatives gain access to the information that will enable them to evaluate its justifiability. In this way, contestability satisfies the access constraint. In the case of democratic will, contestability-by-design practices are how system development is democratized. The autonomy account of legitimation adds the normative constraints that should bind this democratic pathway.

Himmelreich (2022) similarly argues that only a “thick” conception of democracy will address some of the current shortcomings of AI development. This is a pathway that not only allows for participation but also includes deliberation over justifications.

The agonistic arena

Elsewhere, we have proposed the Agonistic Arena as a metaphor for thinking about the democratization of AI systems (Alfrink et al., 2024). Contestable AI embodies the generative metaphor of the Arena. This metaphor characterizes public AI as a space where interlocutors embrace conflict as productive. Seen through the lens of the Arena, public AI problems stem from a lack of opportunities for adversarial interaction between stakeholders.

This metaphorical framing suggests prescriptions for making the norms and procedures that shape the following more contentious and open to dispute:

  1. AI system design decisions on a global level, and
  2. human-AI system output decisions on a local level (i.e., individual decision outcomes), establishing new dialogical feedback loops between stakeholders that ensure continuous monitoring.

The Arena metaphor encourages a design ethos of revisability and reversibility so that AI systems embody the agonistic ideal of contingency.

Post-deployment malleability, feedback-ladenness

Unlike physical systems, AI technologies exhibit a unique malleability post-deployment.

For example, LLM chatbots optimize their performance based on a variety of feedback sources, including interactions with users, as well as feedback collected through crowd-sourced data work.

Because of this open-endedness, democratic control and oversight in the operations phase of the system’s lifecycle become a particular concern.

This is a concern because while AI systems are dynamic and feedback-laden (Gilbert et al., 2023), many of the existing oversight and control measures are static, one-off exercises that struggle to track systems as they evolve over time.

DevOps

The field of DevOps is pivotal in this context. DevOps focuses on instrumenting systems for enhanced monitoring and control in service of continuous improvement. Typically, metrics for DevOps and its machine-learning-specific offshoot, MLOps, emphasize technical performance and business objectives.

However, there is scope to expand these to include matters of public concern. The matters-of-concern perspective reframes issues such as fairness or discrimination as challenges that cannot be resolved through universal methods with absolute certainty. Rather, it highlights how standards are locally negotiated within specific institutional contexts, emphasizing that such standards are never guaranteed (Lampland & Star, 2009; Geiger et al., 2023).

MLOps Metrics

In the context of machine learning systems, technical metrics focus on model accuracy. For example, a financial services company might use the area under the receiver operating characteristic curve (AUC-ROC) to continuously monitor and maintain the performance of their fraud detection model in production.
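
As a rough sketch of what such a continuous check could look like (the threshold, window, and function names here are illustrative assumptions rather than any particular organization's setup, and it presumes scikit-learn and access to recently labelled production data):

```python
# Minimal AUC-ROC monitoring check on a recent window of labelled production data.
from sklearn.metrics import roc_auc_score

AUC_ALERT_THRESHOLD = 0.85  # assumed service-level target, not a universal value

def check_fraud_model_auc(y_true, y_scores):
    """Return the current AUC-ROC and an alert flag for the operations team.
    y_true: observed fraud labels; y_scores: model scores for the same cases."""
    auc = roc_auc_score(y_true, y_scores)
    return {"auc": auc, "alert": auc < AUC_ALERT_THRESHOLD}
```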

Business metrics focus on cost-benefit analyses. For example, a bank might use a cost-benefit matrix to balance the potential revenue from approving a loan against the risk of default, ensuring that the overall profitability of their loan portfolio is optimized.
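
A cost-benefit matrix of this kind can likewise be folded into a single number to monitor, such as expected profit per decision. The figures below are invented purely for illustration:

```python
import numpy as np

# Hypothetical loan cost-benefit matrix (illustrative values):
# rows: decision (0 = reject, 1 = approve); columns: outcome (0 = repaid, 1 = default).
COST_BENEFIT = np.array([
    [0.0,      0.0],     # reject: no revenue, no loss
    [300.0, -5000.0],    # approve: interest earned if repaid, loss given default
])

def expected_profit_per_decision(decisions, outcomes):
    """Average profit per decision, a business metric to track alongside accuracy."""
    return float(np.mean(COST_BENEFIT[decisions, outcomes]))
```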

Drift

These metrics can be monitored over time to detect “drift” between a model and the world. Training sets are static. Reality is dynamic; it changes over time. Drift occurs when the nature of new input data diverges from the data a model was trained on. A change in performance metrics may be used to alert system operators, who can then investigate and decide on a course of action, e.g., retraining a model on updated data. This, in effect, creates a feedback loop between the system in use and its ongoing development.
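
One common way to operationalize input drift detection, sketched below, is a two-sample test comparing a feature's training distribution against a recent window of production data; the significance level is an assumption, and the choice of test would depend on the feature:

```python
from scipy.stats import ks_2samp

def feature_drift_alert(train_values, recent_values, alpha=0.01):
    """Flag drift in one numeric feature with a Kolmogorov-Smirnov two-sample test,
    comparing the training distribution to a recent window of production inputs."""
    statistic, p_value = ks_2samp(train_values, recent_values)
    return {"ks_statistic": statistic, "p_value": p_value, "drift": p_value < alpha}
```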

An expansion of these practices in the interest of contestability would require:

  1. setting different metrics,
  2. exposing these metrics to additional audiences, and
  3. establishing feedback loops with the processes that govern models and the systems they are embedded in.
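
To make these three steps concrete, one could imagine every monitored metric being declared together with the audiences it is exposed to and the governance process it escalates to when a threshold is breached. The structure and all names below are hypothetical, not an existing standard:

```python
# Sketch of metric declarations that couple monitoring (step 1) to audiences (step 2)
# and to governance feedback loops (step 3). All identifiers are made up.
public_metrics = [
    {
        "name": "false_positive_rate_gap",        # a fairness-style metric
        "audiences": ["decision_subjects", "civil_society", "internal_audit"],
        "threshold": 0.05,                         # assumed locally negotiated norm
        "escalate_to": "model_governance_board",   # process that reviews the model on breach
    },
    {
        "name": "median_days_to_remedy",
        "audiences": ["affected_residents", "city_council"],
        "threshold": 30,
        "escalate_to": "responsible_alderperson",
    },
]
```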

Example 1: Camera Cars

Let’s say a city government uses a camera-equipped vehicle and a computer vision model to detect potholes in public roads. In addition to accuracy and a favorable cost-benefit ratio, citizens, and road users in particular, may care about the time between a pothole’s detection and its repair. Or they may care about the distribution of potholes across the city. Furthermore, when road maintenance appears to be degrading, this should be taken up with department leadership, the responsible alderperson, and council members.
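
Such public-interest metrics are straightforward to compute once detections and repairs are logged. The sketch below assumes a hypothetical log with ‘district’, ‘detected_at’, and ‘fixed_at’ columns and uses pandas:

```python
import pandas as pd

def pothole_repair_metrics(log: pd.DataFrame) -> pd.DataFrame:
    """Per-district repair statistics from a pothole detection/repair log.
    Expects datetime columns 'detected_at' and 'fixed_at' and a 'district' column."""
    log = log.assign(days_to_fix=(log["fixed_at"] - log["detected_at"]).dt.days)
    return log.groupby("district")["days_to_fix"].agg(["count", "median", "max"])
```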

Example 2: EV Charging

Or, let’s say the same city government uses an algorithmic system to optimize public electric vehicle (EV) charging stations for green energy use by adapting charging speeds to expected sun and wind. EV drivers may want to know how much energy has been shifted to greener time windows and how that amount trends over time. Without such visibility into a system’s actual goal achievement, citizens’ ability to legitimate its use suffers. As I have already mentioned, democratic agency, when enacted via the appeal to authority, depends on access to the “normative facts” that underpin policies. And finally, professed system functionality must be demonstrated as well (Raji et al., 2022).
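
Again as a sketch, the metric EV drivers might care about could be computed from per-interval charging logs; the column names are assumptions:

```python
import pandas as pd

def green_energy_share(intervals: pd.DataFrame) -> float:
    """Fraction of delivered charging energy that fell in 'green' time windows.
    Expects columns 'kwh' (energy per interval) and 'is_green_window' (bool)."""
    total = intervals["kwh"].sum()
    green = intervals.loc[intervals["is_green_window"], "kwh"].sum()
    return float(green / total) if total > 0 else 0.0
```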

DevOps as sociotechnical leverage point for democratizing AI

These brief examples show that the DevOps approach is a potential sociotechnical leverage point. It offers pathways for democratizing AI system design, development, and operations.

DevOps can be adapted to further contestability. It creates new channels between human and machine actors. One of DevOps’s essential activities is monitoring (Smith, 2020), which presupposes fallibility, a necessary precondition for contestability. Finally, it requires and provides infrastructure for technical flexibility, so that recovery from error is low-cost and continuous improvement becomes practically feasible.

The mutual shaping of democratic practices & AI

Zooming out further, let’s reflect on this panel’s overall theme, picking out three elements: legitimation, representation of marginalized groups, and dealing with conflict and contestation after implementation and during use.

Contestability is a lever for demanding justifications from operators, which is a necessary input for legitimation by subjects (Henin & Le Métayer, 2022). Contestability frames different actors’ stances as adversarial positions on a political field rather than “equally valid” perspectives (Scott, 2023). And finally, relations, monitoring, and revisability are all ways to give voice to and enable responsiveness to contestations (Genus & Stirling, 2018).

And again, all of these things can be furthered in the post-deployment phase by adapting the DevOps lens.

Bibliography

  • Alfrink, K., Keller, I., Kortuem, G., & Doorn, N. (2022). Contestable AI by Design: Towards a Framework. Minds and Machines, 33(4), 613–639. https://doi.org/10/gqnjcs
  • Alfrink, K., Keller, I., Yurrita Semperena, M., Bulygin, D., Kortuem, G., & Doorn, N. (2024). Envisioning Contestability Loops: Evaluating the Agonistic Arena as a Generative Metaphor for Public AI. She Ji: The Journal of Design, Economics, and Innovation, 10(1), 53–93. https://doi.org/10/gtzwft
  • Geiger, R. S., Tandon, U., Gakhokidze, A., Song, L., & Irani, L. (2023). Making Algorithms Public: Reimagining Auditing From Matters of Fact to Matters of Concern. International Journal of Communication, 18(0), Article 0.
  • Genus, A., & Stirling, A. (2018). Collingridge and the dilemma of control: Towards responsible and accountable innovation. Research Policy, 47(1), 61–69. https://doi.org/10/gcs7sn
  • Gilbert, T. K., Lambert, N., Dean, S., Zick, T., Snoswell, A., & Mehta, S. (2023). Reward Reports for Reinforcement Learning. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 84–130. https://doi.org/10/gs9cnh
  • Henin, C., & Le Métayer, D. (2022). Beyond explainability: Justifiability and contestability of algorithmic decision systems. AI & SOCIETY, 37(4), 1397–1410. https://doi.org/10/gmg8pf
  • Himmelreich, J. (2022). Against “Democratizing AI.” AI & SOCIETY. https://doi.org/10/gr95d5
  • Lampland, M., & Star, S. L. (Eds.). (2009). Standards and Their Stories: How Quantifying, Classifying, and Formalizing Practices Shape Everyday Life (1st edition). Cornell University Press.
  • Peter, F. (2020). The Grounds of Political Legitimacy. Journal of the American Philosophical Association, 6(3), 372–390. https://doi.org/10/grqfhn
  • Raji, I. D., Kumar, I. E., Horowitz, A., & Selbst, A. (2022). The Fallacy of AI Functionality. 2022 ACM Conference on Fairness, Accountability, and Transparency, 959–972. https://doi.org/10/gqfvf5
  • Rubel, A., Castro, C., & Pham, A. K. (2021). Algorithms and autonomy: The ethics of automated decision systems. Cambridge University Press.
  • Scott, D. (2023). Diversifying the Deliberative Turn: Toward an Agonistic RRI. Science, Technology, & Human Values, 48(2), 295–318. https://doi.org/10/gpk2pr
  • Smith, J. D. (2020). Operations anti-patterns, DevOps solutions. Manning Publications.
  • Treveil, M. (2020). Introducing MLOps: How to scale machine learning in the enterprise (First edition). O’Reilly.

AI pedagogy through a design lens

At a TU Delft spring symposium on AI education, Hosana and I ran a short workshop titled “AI pedagogy through a design lens.” In it, we identified some of the challenges facing AI teaching, particularly outside of computer science, and explored how design pedagogy, particularly the practices of studios and making, may help to address them. The AI & Society master elective I’ve been developing and teaching over the past five years served as a case study. The session was punctuated by brief brainstorming using an adapted version of the SQUID gamestorming technique. Below are the slides we used.

“Geen transparantie zonder tegenspraak” (“No transparency without contestation”) — a plea delivered at the premiere of the documentary about the transparent charging station

I delivered the short plea below during the online premiere of the documentary about the transparent charging station on Thursday, March 18, 2021.

I recently spoke with an international “thought leader” in the field of “tech ethics.” He told me he is very grateful that the transparent charging station exists, because it is such a good example of how design can contribute to fair technology.

That is, of course, great to hear. And it fits a broader trend in the industry toward making algorithms transparent and explainable. By now, legislation even makes explainability mandatory (in some cases).

In the documentary, you hear several people (myself included) explain why it is important that urban algorithms are transparent. Thijs nicely names two reasons: on the one hand, the collective interest in making democratic control over the development of urban algorithms possible; on the other hand, the individual interest in being able to seek redress when a system makes a decision you disagree with (for whatever reason).

And indeed, in both cases (collective control and individual redress), transparency is a precondition. I think that with this project we have solved a lot of the design and engineering problems involved. At the same time, a new question looms on the horizon: if we understand how a smart system works, and we do not agree with it, what then? How do you then actually gain influence over the workings of the system?

I think we will have to shift our focus from transparency to what I call tegenspraak or, in good English, “contestability.”

Designing for contestability means thinking about the means people need to exercise their right to human intervention. Yes, this means providing information about the how and why of individual decisions. Transparency, in other words. But it also means setting up new channels and processes through which people can submit requests to have a decision reviewed. We will have to think about how we assess such requests, and how we make sure the smart system in question “learns” from the signals we pick up from society in this way.

You could say that designing for transparency is one-way traffic: information flows from the developing party to the end user. Designing for contestability is about creating a dialogue between developers and citizens.

I say citizens, because it is not only classic end users who are affected by smart systems. All sorts of other groups are also, often indirectly, affected.

That is a new design challenge as well. How do you design not only for the end user (in the case of the transparent charging station, the EV driver) but also for so-called indirect stakeholders: for example, residents of streets where charging stations are placed who do not drive an EV, or do not even own a car, but still have a stake in how sidewalks and streets are laid out?

This widening of the field of view means that, when designing for contestability, we can and even must go a step further than enabling redress for individual decisions.

Because designing for contestability around individual decisions of an already deployed system is necessarily post hoc and reactive, and limited to a single group of stakeholders.

As Thijs also more or less points out in the documentary, smart urban infrastructure affects the lives of all of us, and you could say that the design and engineering choices made during its development are intrinsically political choices as well.

That is why I think we cannot avoid also organizing the very process underlying these systems in such a way that there is room for contestation. In my ideal world, the development of a next generation of smart charging stations is therefore participatory, pluriform, and inclusive, just as our democracy itself strives to be.

Exactly how we should give shape to such “contestable” algorithms, how designing for contestability should work, is an open question. But a few years ago no one knew what a transparent charging station should look like either, and we managed to pull that off.

Update (2021-03-31 16:43): A recording of the entire event is now also available. The plea above starts at around 25:14.

“Contestable Infrastructures” at Beyond Smart Cities Today

I’ll be at Beyond Smart Cities Today the next couple of days (18–19 September). Below is the abstract I submitted, plus a bibliography of some of the stuff that went into my thinking for this and related matters that I won’t have the time to get into.

In the actually existing smart city, algorithmic systems are increasingly used for the purposes of automated decision-making, including as part of public infrastructure. Algorithmic systems raise a range of ethical concerns, many of which stem from their opacity. As a result, prescriptions for improving the accountability, trustworthiness, and legitimacy of algorithmic systems are often based on a transparency ideal. The thinking goes that if the functioning and ownership of an algorithmic system is made perceivable, people can understand it and are in turn able to supervise it. However, there are limits to this approach. Algorithmic systems are complex and ever-changing socio-technical assemblages. Rendering them visible is not a straightforward design and engineering task. Furthermore, such transparency does not necessarily lead to understanding or, crucially, the ability to act on this understanding. We believe legitimate smart public infrastructure needs to include the possibility for subjects to articulate objections to procedures and outcomes. The resulting “contestable infrastructure” would create spaces that open up the possibility for expressing conflicting views on the smart city. Our project is to explore the design implications of this line of reasoning for the physical assets that citizens encounter in the city. Because, after all, these are the perceivable elements of the larger infrastructural systems that recede from view.

  • Alkhatib, A., & Bernstein, M. (2019). Street-Level Algorithms. 1–13. https://doi.org/10.1145/3290605.3300760
  • Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media and Society, 20(3), 973–989. https://doi.org/10.1177/1461444816676645
  • Centivany, A., & Glushko, B. (2016). “Popcorn tastes good”: Participatory policymaking and Reddit’s “AMAgeddon.” Conference on Human Factors in Computing Systems — Proceedings, 1126–1137. https://doi.org/10.1145/2858036.2858516
  • Crawford, K. (2016). Can an Algorithm be Agonistic? Ten Scenes from Life in Calculated Publics. Science Technology and Human Values, 41(1), 77–92. https://doi.org/10.1177/0162243915589635
  • DiSalvo, C. (2010). Design, Democracy and Agonistic Pluralism. Proceedings of the Design Research Society Conference, 366–371.
  • Hildebrandt, M. (2017). Privacy As Protection of the Incomputable Self: Agonistic Machine Learning. SSRN Electronic Journal, 1–33. https://doi.org/10.2139/ssrn.3081776
  • Jackson, S. J., Gillespie, T., & Payette, S. (2014). The Policy Knot: Re-integrating Policy, Practice and Design. CSCW Studies of Social Computing, 588–602. https://doi.org/10.1145/2531602.2531674
  • Jewell, M. (2018). Contesting the decision: living in (and living with) the smart city. International Review of Law, Computers and Technology. https://doi.org/10.1080/13600869.2018.1457000
  • Lindblom, L. (2019). Consent, Contestability, and Unions. Business Ethics Quarterly. https://doi.org/10.1017/beq.2018.25
  • Mittelstadt, B. D., Allo, P., Taddeo, M., Wachter, S., & Floridi, L. (2016). The ethics of algorithms: Mapping the debate. Big Data & Society, 3(2), 205395171667967. https://doi.org/10.1177/2053951716679679
  • Van de Poel, I. (2016). An ethical framework for evaluating experimental technology. Science and Engineering Ethics, 22(3), 667–686. https://doi.org/10.1007/s11948-015-9724-3

“Contestable Infrastructures: Designing for Dissent in Smart Public Objects” at We Make the City 2019

Thi­js Turèl of AMS Insti­tute and myself pre­sent­ed a ver­sion of the talk below at the Cities for Dig­i­tal Rights con­fer­ence on June 19 in Ams­ter­dam dur­ing the We Make the City fes­ti­val. The talk is an attempt to artic­u­late some of the ideas we both have been devel­op­ing for some time around con­testa­bil­i­ty in smart pub­lic infra­struc­ture. As always with this sort of thing, this is intend­ed as a con­ver­sa­tion piece so I wel­come any thoughts you may have.


The basic message of the talk is that when we start to do automated decision-making in public infrastructure using algorithmic systems, we need to design for the inevitable disagreements that may arise. Furthermore, we suggest there is an opportunity to focus on designing for such disagreements in the physical objects that people encounter in urban space as they make use of infrastructure.

We set the scene by showing a number of examples of smart public infrastructure. A cyclist crossing that adapts to weather conditions: if it’s raining, cyclists more frequently get a green light. A pedestrian crossing in Tilburg where the elderly can use their mobile phone to get more time to cross. And finally, the case we are involved with ourselves: smart EV charging in the city of Amsterdam, about which more later.

Image cred­its: Vat­ten­fall, Fietsfan010, De Nieuwe Draai

We iden­ti­fy three trends in smart pub­lic infra­struc­ture: (1) where pre­vi­ous­ly algo­rithms were used to inform pol­i­cy, now they are employed to per­form auto­mat­ed deci­sion-mak­ing on an indi­vid­ual case basis. This rais­es the stakes; (2) dis­trib­uted own­er­ship of these sys­tems as the result of pub­lic-pri­vate part­ner­ships and oth­er com­plex col­lab­o­ra­tion schemes leads to unclear respon­si­bil­i­ty; and final­ly (3) the increas­ing use of machine learn­ing leads to opaque decision-making.

These trends, and algo­rith­mic sys­tems more gen­er­al­ly, raise a num­ber of eth­i­cal con­cerns. They include but are not lim­it­ed to: the use of induc­tive cor­re­la­tions (for exam­ple in the case of machine learn­ing) leads to unjus­ti­fied results; lack of access to and com­pre­hen­sion of a system’s inner work­ings pro­duces opac­i­ty, which in turn leads to a lack of trust in the sys­tems them­selves and the organ­i­sa­tions that use them; bias is intro­duced by a num­ber of fac­tors, includ­ing devel­op­ment team prej­u­dices, tech­ni­cal flaws, bad data and unfore­seen inter­ac­tions with oth­er sys­tems; and final­ly the use of pro­fil­ing, nudg­ing and per­son­al­i­sa­tion leads to dimin­ished human agency. (We high­ly rec­om­mend the arti­cle by Mit­tel­stadt et al. for a com­pre­hen­sive overview of eth­i­cal con­cerns raised by algorithms.)

So for us, the ques­tion that emerges from all this is: How do we organ­ise the super­vi­sion of smart pub­lic infra­struc­ture in a demo­c­ra­t­ic and law­ful way?

There are a number of existing approaches to this question. These include legal and regulatory (e.g., the right to explanation in the GDPR); auditing (e.g., KPMG’s “AI in Control” method, BKZ’s transparantielab); procurement (e.g., open source clauses); insourcing (e.g., GOV.UK); and design and engineering (e.g., our own work on the transparent charging station).

We feel there are two impor­tant lim­i­ta­tions with these exist­ing approach­es. The first is a focus on pro­fes­sion­als and the sec­ond is a focus on pre­dic­tion. We’ll dis­cuss each in turn.

Image cred­its: Cities Today

First of all, many solutions target a professional class, be it accountants, civil servants, or supervisory boards, as well as technologists, designers, and so on. But we feel there is a role for the citizen as well, because the supervision of these systems is simply too important to be left to a privileged few. This role would include identifying wrongdoing and suggesting alternatives.

There is a ten­sion here, which is that from the per­spec­tive of the pub­lic sec­tor one should only ask cit­i­zens for their opin­ion when you have the inten­tion and the resources to actu­al­ly act on their sug­ges­tions. It can also be a chal­lenge to iden­ti­fy legit­i­mate con­cerns in the flood of feed­back that can some­times occur. From our point of view though, such con­cerns should not be used as an excuse to not engage the pub­lic. If cit­i­zen par­tic­i­pa­tion is con­sid­ered nec­es­sary, the focus should be on free­ing up resources and set­ting up struc­tures that make it fea­si­ble and effective.

The second limitation is prediction. This is best illustrated with the Collingridge dilemma: in the early phases of a new technology, when a technology and its social embedding are still malleable, there is uncertainty about the social effects of that technology. In later phases, social effects may be clear, but then often the technology has become so well entrenched in society that it is hard to overcome negative social effects. (This summary is taken from an excellent van de Poel article on the ethics of experimental technology.)

Many solutions disregard the Collingridge dilemma and try to predict and prevent adverse effects of new systems at design-time. One example of this approach would be value-sensitive design. Our focus instead is on use-time. Considering the fact that smart public infrastructure tends to be developed on an ongoing basis, the question becomes how to make citizens a partner in this process. And even more specifically, we are interested in how this can be made part of the design of the “touchpoints” people actually encounter in the streets, as well as their backstage processes.

Why do we focus on these phys­i­cal objects? Because this is where peo­ple actu­al­ly meet the infra­struc­tur­al sys­tems, of which large parts recede from view. These are the places where they become aware of their pres­ence. They are the prover­bial tip of the iceberg. 

Image cred­its: Sagar Dani

The use of auto­mat­ed deci­sion-mak­ing in infra­struc­ture reduces people’s agency. For this rea­son, resources for agency need to be designed back into these sys­tems. Fre­quent­ly the answer to this ques­tion is premised on a trans­paren­cy ide­al. This may be a pre­req­ui­site for agency, but it is not suf­fi­cient. Trans­paren­cy may help you become aware of what is going on, but it will not nec­es­sar­i­ly help you to act on that knowl­edge. This is why we pro­pose a shift from trans­paren­cy to con­testa­bil­i­ty. (We can high­ly rec­om­mend Anan­ny and Crawford’s arti­cle for more on why trans­paren­cy is insufficient.)

To clar­i­fy what we mean by con­testa­bil­i­ty, con­sid­er the fol­low­ing three exam­ples: When you see the lights on your router blink in the mid­dle of the night when no-one in your house­hold is using the inter­net you can act on this knowl­edge by yank­ing out the device’s pow­er cord. You may nev­er use the emer­gency brake in a train but its pres­ence does give you a sense of con­trol. And final­ly, the cash reg­is­ter receipt pro­vides you with a view into both the pro­ce­dure and the out­come of the super­mar­ket check­out pro­ce­dure and it offers a resource with which you can dis­pute them if some­thing appears to be wrong.

Image cred­its: Aangifte­doen, source unknown for remainder

None of these exam­ples is a per­fect illus­tra­tion of con­testa­bil­i­ty but they hint at some­thing more than trans­paren­cy, or per­haps even some­thing whol­ly sep­a­rate from it. We’ve been inves­ti­gat­ing what their equiv­a­lents would be in the con­text of smart pub­lic infrastructure.

To illus­trate this point fur­ther let us come back to the smart EV charg­ing project we men­tioned ear­li­er. In Ams­ter­dam, pub­lic EV charg­ing sta­tions are becom­ing “smart” which in this case means they auto­mat­i­cal­ly adapt the speed of charg­ing to a num­ber of fac­tors. These include grid capac­i­ty, and the avail­abil­i­ty of solar ener­gy. Addi­tion­al fac­tors can be added in future, one of which under con­sid­er­a­tion is to give pri­or­i­ty to shared cars over pri­vate­ly owned cars. We are involved with an ongo­ing effort to con­sid­er how such charg­ing sta­tions can be redesigned so that peo­ple under­stand what’s going on behind the scenes and can act on this under­stand­ing. The moti­va­tion for this is that if not designed care­ful­ly, the opac­i­ty of smart EV charg­ing infra­struc­ture may be detri­men­tal to social accep­tance of the tech­nol­o­gy. (A first out­come of these efforts is the Trans­par­ent Charg­ing Sta­tion designed by The Incred­i­ble Machine. A fol­low-up project is ongoing.)

Image cred­its: The Incred­i­ble Machine, Kars Alfrink
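
To give a feel for what “adapting the speed of charging to a number of factors” amounts to, here is a deliberately toy version of such a policy. The factors, weights, and shared-car bonus are invented for illustration and do not describe the actual Amsterdam system:

```python
def charging_rate_kw(max_rate_kw, grid_headroom, solar_share, is_shared_car=False):
    """Toy charging-speed policy: scale the maximum rate by available grid capacity
    and solar availability, with a hypothetical priority bonus for shared cars.
    grid_headroom and solar_share are fractions between 0 and 1."""
    rate = max_rate_kw * min(grid_headroom, 1.0) * (0.5 + 0.5 * min(solar_share, 1.0))
    if is_shared_car:
        rate *= 1.2  # priority factor of the kind mentioned as under consideration
    return min(rate, max_rate_kw)
```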

We have iden­ti­fied a num­ber of dif­fer­ent ways in which peo­ple may object to smart EV charg­ing. They are list­ed in the table below. These types of objec­tions can lead us to fea­ture require­ments for mak­ing the sys­tem contestable. 

Because the list is pre­lim­i­nary, we asked the audi­ence if they could imag­ine addi­tion­al objec­tions, if those exam­ples rep­re­sent­ed new cat­e­gories, and if they would require addi­tion­al fea­tures for peo­ple to be able to act on them. One par­tic­u­lar­ly inter­est­ing sug­ges­tion that emerged was to give local com­mu­ni­ties con­trol over the poli­cies enact­ed by the charge points in their vicin­i­ty. That’s some­thing to fur­ther con­sid­er the impli­ca­tions of.

And that’s where we left it. So to summarise: 

  1. Algo­rith­mic sys­tems are becom­ing part of pub­lic infrastructure.
  2. Smart pub­lic infra­struc­ture rais­es new eth­i­cal concerns.
  3. Many solu­tions to eth­i­cal con­cerns are premised on a trans­paren­cy ide­al, but do not address the issue of dimin­ished agency.
  4. There are dif­fer­ent cat­e­gories of objec­tions peo­ple may have to an algo­rith­mic system’s workings.
  5. Mak­ing a sys­tem con­testable means cre­at­ing resources for peo­ple to object, open­ing up a space for the explo­ration of mean­ing­ful alter­na­tives to its cur­rent implementation.

ThingsCon 2018 workshop ‘Seeing Like a Bridge’

Work­shop in progress with a view of Rot­ter­dam’s Willems­brug across the Maas.

In early December of last year, Alec Shuldiner and I ran a workshop at ThingsCon 2018 in Rotterdam.

Here’s the descrip­tion as it was list­ed on the con­fer­ence web­site:

In this work­shop we will take a deep dive into some of the chal­lenges of design­ing smart pub­lic infrastructure.

Smart city ideas are mov­ing from hype into real­i­ty. The every­day things that our con­tem­po­rary world runs on, such as roads, rail­ways and canals are not immune to this devel­op­ment. Basic, “hard” infra­struc­ture is being aug­ment­ed with inter­net-con­nect­ed sens­ing, pro­cess­ing and actu­at­ing capa­bil­i­ties. We are involved as prac­ti­tion­ers and researchers in one such project: the MX3D smart bridge, a pedes­tri­an bridge 3D print­ed from stain­less steel and equipped with a net­work of sensors.

The ques­tion fac­ing every­one involved with these devel­op­ments, from cit­i­zens to pro­fes­sion­als to pol­i­cy mak­ers is how to reap the poten­tial ben­e­fits of these tech­nolo­gies, with­out degrad­ing the urban fab­ric. For this to hap­pen, infor­ma­tion tech­nol­o­gy needs to become more like the city: open-end­ed, flex­i­ble and adapt­able. And we need meth­ods and tools for the diverse range of stake­hold­ers to come togeth­er and col­lab­o­rate on the design of tru­ly intel­li­gent pub­lic infrastructure.

We will explore these ques­tions in this work­shop by first walk­ing you through the archi­tec­ture of the MX3D smart bridge—offering a unique­ly con­crete and prag­mat­ic view into a cut­ting edge smart city project. Sub­se­quent­ly we will togeth­er explore the ques­tion: What should a smart pedes­tri­an bridge that is aware of itself and its sur­round­ings be able to tell us? We will con­clude by shar­ing some of the high­lights from our con­ver­sa­tion, and make note of par­tic­u­lar­ly thorny ques­tions that require fur­ther work.

The work­shop’s struc­ture was quite sim­ple. After a round of intro­duc­tions, Alec intro­duced the MX3D bridge to the par­tic­i­pants. For a sense of what that intro­duc­tion talk was like, I rec­om­mend view­ing this record­ing of a pre­sen­ta­tion he deliv­ered at a recent Pakhuis de Zwi­jger event.

We then ran three rounds of group discussion in the style of a world café. Each discussion was guided by one question. Participants were asked to write, draw, and doodle on the large sheets of paper covering each table. At the end of each round, people moved to another table while one person remained to share the preceding round’s discussion with the new group.

The dis­cus­sion ques­tions were inspired by val­ue-sen­si­tive design. I was inter­est­ed to see if peo­ple could come up with alter­na­tive uses for a sen­sor-equipped 3D-print­ed foot­bridge if they first con­sid­ered what in their opin­ion made a city worth liv­ing in. 

The ques­tions we used were:

  1. What spe­cif­ic things do you like about your town? (Places, things to do, etc. Be specific.)
  2. What values underlie those things? (A value is what a person or group of people consider important in life.)
  3. How would you redesign the bridge to sup­port those values?

At the end of the three dis­cus­sion rounds we went around to each table and shared the high­lights of what was pro­duced. We then had a bit of a back and forth about the out­comes and the work­shop approach, after which we wrapped up.

We did get to some inter­est­ing val­ues by start­ing from per­son­al expe­ri­ence. Par­tic­i­pants came from a vari­ety of coun­tries and that was reflect­ed in the range of exam­ples and relat­ed val­ues. The design ideas for the bridge remained some­what abstract. It turned out to be quite a chal­lenge to make the jump from val­ues to dif­fer­ent types of smart bridges. Despite this, we did get nice ideas such as hav­ing the bridge report on water qual­i­ty of the canal it cross­es, derived from the val­ue of care for the environment.

The response from par­tic­i­pants after­wards was pos­i­tive. Peo­ple found it thought-pro­vok­ing, which was def­i­nite­ly the point. Peo­ple were also eager to learn even more about the bridge project. It remains a thing that cap­tures peo­ple’s imag­i­na­tion. For that rea­son alone, it con­tin­ues to be a very pro­duc­tive case to use for the ground­ing of these sorts of discussions.

‘Unboxing’ at Behavior Design Amsterdam #16

Below is a write-up of the talk I gave at the Behav­ior Design Ams­ter­dam #16 meet­up on Thurs­day, Feb­ru­ary 15, 2018.

‘Pan­do­ra’ by John William Water­house (1896)

I’d like to talk about the future of our design prac­tice and what I think we should focus our atten­tion on. It is all relat­ed to this idea of com­plex­i­ty and open­ing up black box­es. We’re going to take the scenic route, though. So bear with me.

Software Design

Two years ago I spent about half a year in Singapore.

While there I worked as prod­uct strate­gist and design­er at a start­up called ARTO, an art rec­om­men­da­tion ser­vice. It shows you a ran­dom sam­ple of art­works, you tell it which ones you like, and it will then start rec­om­mend­ing pieces it thinks you like. In case you were won­der­ing: yes, swip­ing left and right was involved.

We had this inter­est­ing prob­lem of ingest­ing art from many dif­fer­ent sources (most­ly online gal­leries) with meta­da­ta of wild­ly vary­ing lev­els of qual­i­ty. So, using meta­da­ta to fig­ure out which art to show was a bit of a non-starter. It should come as no sur­prise then, that we start­ed look­ing into machine learning—image pro­cess­ing in particular.

And so I found myself work­ing with my engi­neer­ing col­leagues on an art rec­om­men­da­tion stream which was dri­ven at least in part by machine learn­ing. And I quick­ly realised we had a prob­lem. In terms of how we worked togeth­er on this part of the prod­uct, it felt like we had tak­en a bunch of steps back in time. Back to a way of col­lab­o­rat­ing that was less inte­grat­ed and less responsive.

That’s because we have all these nice tools and tech­niques for design­ing tra­di­tion­al soft­ware prod­ucts. But soft­ware is deter­min­is­tic. Machine learn­ing is fun­da­men­tal­ly dif­fer­ent in nature: it is probabilistic.

It was hard for me to take the lead in the design of this part of the prod­uct for two rea­sons. First of all, it was chal­leng­ing to get a first-hand feel of the machine learn­ing fea­ture before it was implemented.

And sec­ond of all, it was hard for me to com­mu­ni­cate or visu­alise the intend­ed behav­iour of the machine learn­ing fea­ture to the rest of the team.

So when I came back to the Nether­lands I decid­ed to dig into this prob­lem of design for machine learn­ing. Turns out I opened up quite the can of worms for myself. But that’s okay.

There are two rea­sons I care about this:

The first is that I think we need more design-led inno­va­tion in the machine learn­ing space. At the moment it is engi­neer­ing-dom­i­nat­ed, which doesn’t nec­es­sar­i­ly lead to use­ful out­comes. But if you want to take the lead in the design of machine learn­ing appli­ca­tions, you need a firm han­dle on the nature of the technology.

The sec­ond rea­son why I think we need to edu­cate our­selves as design­ers on the nature of machine learn­ing is that we need to take respon­si­bil­i­ty for the impact the tech­nol­o­gy has on the lives of peo­ple. There is a lot of talk about ethics in the design indus­try at the moment. Which I con­sid­er a pos­i­tive sign. But I also see a reluc­tance to real­ly grap­ple with what ethics is and what the rela­tion­ship between tech­nol­o­gy and soci­ety is. We seem to want easy answers, which is under­stand­able because we are all very busy peo­ple. But hav­ing spent some time dig­ging into this stuff myself I am here to tell you: There are no easy answers. That isn’t a bug, it’s a fea­ture. And we should embrace it.

Machine Learning

At the end of 2016 I attend­ed ThingsCon here in Ams­ter­dam and I was intro­duced by Ianus Keller to TU Delft PhD researcher Péter Kun. It turns out we were both inter­est­ed in machine learn­ing. So with encour­age­ment from Ianus we decid­ed to put togeth­er a work­shop that would enable indus­tri­al design mas­ter stu­dents to tan­gle with it in a hands-on manner.

About a year later now, this has grown into a thing we call Prototyping the Useless Butler. During the workshop, you use machine learning algorithms to train a model that takes inputs from a network-connected Arduino’s sensors and drives that same Arduino’s actuators. In effect, you can create interactive behaviour without writing a single line of code. And you get a first-hand feel for how common applications of machine learning work: things like regression, classification, and dynamic time warping.

The thing that makes this workshop tick is an open source software application called Wekinator, created by Rebecca Fiebrink. It was originally aimed at performing artists so that they could build interactive instruments without writing code. But it takes inputs from anything and sends outputs to anything. So we appropriated it towards our own ends.

You can find every­thing relat­ed to Use­less But­ler on this GitHub repo.

The think­ing behind this work­shop is that for us design­ers to be able to think cre­ative­ly about appli­ca­tions of machine learn­ing, we need a gran­u­lar under­stand­ing of the nature of the tech­nol­o­gy. The thing with design­ers is, we can’t real­ly learn about such things from books. A lot of design knowl­edge is tac­it, it emerges from our phys­i­cal engage­ment with the world. This is why things like sketch­ing and pro­to­typ­ing are such essen­tial parts of our way of work­ing. And so with use­less but­ler we aim to cre­ate an envi­ron­ment in which you as a design­er can gain tac­it knowl­edge about the work­ings of machine learning.

Sim­ply put, for a lot of us, machine learn­ing is a black box. With Use­less But­ler, we open the black box a bit and let you peer inside. This should improve the odds of design-led inno­va­tion hap­pen­ing in the machine learn­ing space. And it should also help with ethics. But it’s def­i­nite­ly not enough. Knowl­edge about the tech­nol­o­gy isn’t the only issue here. There are more black box­es to open.

Values

Which brings me back to that other black box: ethics. As I already mentioned, there is a lot of talk in the tech industry about how we should “be more ethical.” But things are often reduced to this notion that designers should do no harm. As if ethics is a problem to be fixed instead of a thing to be practiced.

So I start­ed to talk about this to peo­ple I know in acad­e­mia and more than once this thing called Val­ue Sen­si­tive Design was men­tioned. It should be no sur­prise to any­one that schol­ars have been chew­ing on this stuff for quite a while. One of the ear­li­est ref­er­ences I came across, an essay by Batya Fried­man in Inter­ac­tions is from 1996! This is a les­son to all of us I think. Pay more atten­tion to what the aca­d­e­mics are talk­ing about.

So, at the end of last year I dove into this top­ic. Our host Iskan­der Smit, Rob Mai­jers and myself coor­di­nate a grass­roots com­mu­ni­ty for tech work­ers called Tech Sol­i­dar­i­ty NL. We want to build tech­nol­o­gy that serves the needs of the many, not the few. Val­ue Sen­si­tive Design seemed like a good thing to dig into and so we did.

I’m not going to dive into the details here. There’s a report on the Tech Sol­i­dar­i­ty NL web­site if you’re inter­est­ed. But I will high­light a few things that val­ue sen­si­tive design asks us to con­sid­er that I think help us unpack what it means to prac­tice eth­i­cal design.

First of all, val­ues. Here’s how it is com­mon­ly defined in the literature:

“A value refers to what a person or group of people consider important in life.”

I like it because it’s com­mon sense, right? But it also makes clear that there can nev­er be one mono­lith­ic def­i­n­i­tion of what ‘good’ is in all cas­es. As we design­ers like to say: “it depends” and when it comes to val­ues things are no different.

“Person or group” implies there can be various stakeholders. Value sensitive design distinguishes between direct and indirect stakeholders. The former have direct contact with the technology, the latter don’t but are affected by it nonetheless. Value sensitive design means taking both into account. So this blows up the conventional notion of a single user to design for.

Var­i­ous stake­hold­er groups can have com­pet­ing val­ues and so to design for them means to arrive at some sort of trade-off between val­ues. This is a cru­cial point. There is no such thing as a per­fect or objec­tive­ly best solu­tion to eth­i­cal conun­drums. Not in the design of tech­nol­o­gy and not any­where else.

Val­ue sen­si­tive design encour­ages you to map stake­hold­ers and their val­ues. These will be dif­fer­ent for every design project. Anoth­er approach is to use lists like the one pic­tured here as an ana­lyt­i­cal tool to think about how a design impacts var­i­ous values.

Fur­ther­more, dur­ing your design process you might not only think about the short-term impact of a tech­nol­o­gy, but also think about how it will affect things in the long run.

And sim­i­lar­ly, you might think about the effects of a tech­nol­o­gy not only when a few peo­ple are using it, but also when it becomes wild­ly suc­cess­ful and every­body uses it.

There are tools out there that can help you think through these things. But so far much of the work in this area is hap­pen­ing on the aca­d­e­m­ic side. I think there is an oppor­tu­ni­ty for us to cre­ate tools and case stud­ies that will help us edu­cate our­selves on this stuff.

There’s a lot more to say on this but I’m going to stop here. The point is, as with the nature of the tech­nolo­gies we work with, it helps to dig deep­er into the nature of the rela­tion­ship between tech­nol­o­gy and soci­ety. Yes, it com­pli­cates things. But that is exact­ly the point.

Priv­i­leg­ing sim­ple and scal­able solu­tions over those adapt­ed to local needs is social­ly, eco­nom­i­cal­ly and eco­log­i­cal­ly unsus­tain­able. So I hope you will join me in embrac­ing complexity.

‘Playful Design for Workplace Change Management’ at PLAYTrack conference 2017 in Aarhus

Laser defender collab at FUSE

At the end of last year I was invit­ed to speak at the PLAY­Track con­fer­ence in Aarhus about the work­place change man­age­ment games made by Hub­bub. It turned out to be a great oppor­tu­ni­ty to recon­nect with the play research community. 

I was very much impressed by the pro­gram assem­bled by the organ­is­ers. Peo­ple came from a wide range of dis­ci­plines and cru­cial­ly, there was ample time to dis­cuss and reflect on the mate­ri­als pre­sent­ed. As I tweet­ed after­wards, this is a thing that most con­fer­ence organ­is­ers get wrong.

I was particularly inspired by the work of Benjamin Mardell and Mara Krechevsky at Harvard’s Project Zero. Making Learning Visible looks like a great resource for anyone who teaches. Then there was Reed Stevens from Northwestern University, whose project FUSE is one of the most solid examples of playful learning for STEAM I’ve seen thus far. I was also fascinated by Ciara Laverty’s work at PEDAL on observing parent-child play. Miguel Sicart delivered another great provocation on the dark side of playful design. And finally, I was delighted to hear about and experience for myself some of Amos Blanton’s work at the LEGO Foundation. I should also call out Ben Fincham’s many provocative contributions from the audience.

The abstract for my talk is below, which cov­ers most of what I talked about. I tried to give peo­ple a good sense of: 

  • what the games con­sist­ed of,
  • what we were aim­ing to achieve,
  • how both the fic­tion and the play­er activ­i­ties sup­port­ed these goals,
  • how we made learn­ing out­comes vis­i­ble to our play­ers and clients,
  • and final­ly how we went about design­ing and devel­op­ing these games.

Both projects have sol­id write-ups over at the Hub­bub web­site, so I’ll just point to those here: Code 4 and Rip­ple Effect.

In the final section of the talk, I spent a bit of time reflecting on how I would approach projects like this today. After all, it has been seven years since we made Code 4, and four years since Ripple Effect. That’s ages ago, and my perspective has definitely changed since we made these.

Participatory design

First of all, I would get even more seri­ous about co-design­ing with play­ers at every step. I would recruit rep­re­sen­ta­tives of play­ers and invest them with real influ­ence. In the projects we did, the pri­ma­ry vehi­cle for play­er influ­ence was through playtest­ing. But this is nec­es­sar­i­ly lim­it­ed. I also won’t pre­tend this is at all easy to do in a com­mer­cial context. 

But, these games are ulti­mate­ly about improv­ing work­er pro­duc­tiv­i­ty. So how do we make it so that work­ers share in the real-world prof­its yield­ed by a suc­cess­ful cul­ture change?

I know of the exis­tence of par­tic­i­pa­to­ry design but from my expe­ri­ence it is not a com­mon approach in the indus­try. Why?

Value sensitive design

On a related note, I would get more serious about what values are supported by the system, in whose interest they are, and where they come from. Early field research and workshops with the audience do surface some values, but values from customer representatives tend to dominate. Again, the commercial context we work in is a potential challenge.

I know of val­ue sen­si­tive design, but as with par­tic­i­pa­to­ry design, it has yet to catch on in a big way in the indus­try. So again, why is that?

Disintermediation

One thing I con­tin­ue to be inter­est­ed in is to reduce the com­plex­i­ty of a game system’s phys­i­cal affor­dances (which includes its code), and to push even more of the sub­stance of the game into those social allowances that make up the non-mate­r­i­al aspects of the game. This allows for spon­ta­neous rene­go­ti­a­tion of the game by the play­ers. This is dis­in­ter­me­di­a­tion as a strat­e­gy. David Kanaga’s take on games as toys remains huge­ly inspi­ra­tional in this regard, as does Bernard De Koven’s book The Well Played Game.

Gamefulness versus playfulness

Code 4 had more focus on sat­is­fy­ing the need for auton­o­my. Rip­ple Effect had more focus on com­pe­tence, or in any case, it had less empha­sis on auton­o­my. There was less room for ‘play’ around the core dig­i­tal game. It seems to me that mas­ter­ing a sub­jec­tive sim­u­la­tion of a sub­ject is not nec­es­sar­i­ly what a work­place game for cul­ture change should be aim­ing for. So, less game­ful design, more play­ful design.

Adaptation

Final­ly, the agency mod­el does not enable us to stick around for the long haul. But work­place games might be bet­ter suit­ed to a set­up where things aren’t thought of as a one-off project but more of an ongo­ing process. 

In How Build­ings Learn, Stew­art Brand talks about how archi­tects should revis­it build­ings they’ve designed after they are built to learn about how peo­ple are actu­al­ly using them. He also talks about how good build­ings are build­ings that its inhab­i­tants can adapt to their needs. What does that look like in the con­text of a game for work­place cul­ture change?


Play­ful Design for Work­place Change Management

Code 4 (2011, com­mis­sioned by the Tax Admin­is­tra­tion of the Nether­lands) and Rip­ple Effect (2013, com­mis­sioned by Roy­al Dutch Shell) are both games for work­place change man­age­ment designed and devel­oped by Hub­bub, a bou­tique play­ful design agency which oper­at­ed from Utrecht, The Nether­lands and Berlin, Ger­many between 2009 and 2015. These games are exam­ples of how a goal-ori­ent­ed seri­ous game can be used to encour­age play­ful appro­pri­a­tion of work­place infra­struc­ture and social norms, result­ing in an open-end­ed and cre­ative explo­ration of new and inno­v­a­tive ways of working.

Seri­ous game projects are usu­al­ly com­mis­sioned to solve prob­lems. Solv­ing the prob­lem of cul­tur­al change in a straight­for­ward man­ner means view­ing games as a way to per­suade work­ers of a desired future state. They typ­i­cal­ly take videogame form, sim­u­lat­ing the desired new way of work­ing as deter­mined by man­age­ment. To play the game well, play­ers need to mas­ter its sys­tem and by extension—it is assumed—learning happens.

These games can be enjoyable experiences and an improvement on previous forms of workplace learning, but in our view they decrease the possibility space of potential workplace cultural change. They diminish worker agency, and they waste the creative and innovative potential of involving workers in the invention of an improved workplace culture.

We instead choose to view work­place games as an oppor­tu­ni­ty to increase the space of pos­si­bil­i­ty. We resist the temp­ta­tion to bake the desired new way of work­ing into the game’s phys­i­cal and dig­i­tal affor­dances. Instead, we leave how to play well up to the play­ers. Since these games are team-based and col­lab­o­ra­tive, play­ers need to nego­ti­ate their way of work­ing around the game among them­selves. In addi­tion, because the games are dis­trib­uted in time—running over a num­ber of weeks—and are playable at play­er dis­cre­tion dur­ing the work­day, play­ers are giv­en license to appro­pri­ate work­place infra­struc­ture and sub­vert social norms towards in-game ends.

We tried to make learn­ing tan­gi­ble in var­i­ous ways. Because the games at the core are web appli­ca­tions to which play­ers log on with indi­vid­ual accounts we were able to col­lect data on play­er behav­iour. To guar­an­tee pri­va­cy, employ­ers did not have direct access to game data­bas­es and only received anonymised reports. We took respon­si­bil­i­ty for play­er learn­ing by facil­i­tat­ing coach­ing ses­sions in which they could safe­ly reflect on their game expe­ri­ences. Round­ing out these efforts, we con­duct­ed sur­veys to gain insight into the play­er expe­ri­ence from a more qual­i­ta­tive and sub­jec­tive perspective.

These games offer a mod­el for a rea­son­ably demo­c­ra­t­ic and eth­i­cal way of doing game-based work­place change man­age­ment. How­ev­er, we would like to see efforts that fur­ther democ­ra­tise their design and development—involving work­ers at every step. We also wor­ry about how games can be used to cre­ate the illu­sion of work­er influ­ence while at the same time soft­ware is deployed through­out the work­place to lim­it their agency. 

Our exam­ples may be inspir­ing but because of these devel­op­ments we feel we can’t con­tin­ue this type of work with­out seri­ous­ly recon­sid­er­ing our cur­rent process­es, tech­nol­o­gy stacks and busi­ness practices—and ulti­mate­ly whether we should be mak­ing games at all.

Prototyping the Useless Butler: Machine Learning for IoT Designers

ThingsCon Amsterdam 2017, photo by nunocruzstreet.com

At ThingsCon Ams­ter­dam 2017, Péter and I ran a sec­ond iter­a­tion of our machine learn­ing work­shop. We improved on our first attempt at TU Delft in a num­ber of ways.

  • We pre­pared exam­ple code for com­mu­ni­cat­ing with Wek­ina­tor from a wifi con­nect­ed Arduino MKR1000 over OSC.
  • We cre­at­ed a pre­de­fined bread­board setup.
  • We devel­oped three exer­cis­es, one for each type of Wek­ina­tor out­put: regres­sion, clas­si­fi­ca­tion and dynam­ic time warping.

In contrast to the first version, we had two hours to run through the whole thing, instead of a day. So we had to cut some corners and doubled down on walking participants through a number of exercises, so that they would come out of it with some readily applicable skills.

We dubbed the work­shop ‘pro­to­typ­ing the use­less but­ler’, with thanks to Philip van Allen for the sug­ges­tion to frame the exer­cis­es around build­ing some­thing non-pro­duc­tive so that the focus was shift­ed to play and exploration.

All of the code, the cir­cuit dia­gram and slides are over on GitHub. But I’ll sum­marise things here.

  1. We spent a very short amount of time intro­duc­ing machine learn­ing. We used Google’s Teach­able Machine as an exam­ple and con­trast­ed reg­u­lar pro­gram­ming with using machine learn­ing algo­rithms to train mod­els. The point was to pro­vide folks with just enough con­cep­tu­al scaf­fold­ing so that the rest of the work­shop would make sense.
  2. We then introduced our ‘toolchain’, which consists of Wekinator, the Arduino MKR1000 module and the OSC protocol. The aim of this toolchain is to allow designers who work in the IoT space to get a feel for the material properties of machine learning through hands-on tinkering. We tried to create a toolchain with as few moving parts as possible, because each additional component would introduce another point of failure that might require debugging. The toolchain enables designers either to use machine learning to rapidly prototype interactive behaviour with minimal or no programming, or to prototype products that expose interactive machine learning features to end users. (For a speculative example of one such product, see Bjørn Karmann’s Objectifier.)
  3. Par­tic­i­pants were then asked to set up all the required parts on their own work­sta­tion. A list can be found on the Use­less But­ler GitHub page.
  4. We then proceeded to build the circuit. We provided all the components and showed a Fritzing diagram to help people along. The basic idea of this circuit, the eponymous useless butler, was to have a sufficiently rich set of inputs and outputs to play with, one that would suit all three types of Wekinator output. So we settled on a pair of photoresistors (LDRs) as inputs and an RGB LED as output.
  5. With the prerequisites installed and the circuit built, we were ready to walk through the examples. For regression, we mapped the continuous stream of readings from the two LDRs to three outputs, one each for the red, green and blue of the LED. For classification, we put the state of both LDRs into one of four categories, each switching the RGB LED to a specific colour (cyan, magenta, yellow or white). And finally, for dynamic time warping, we asked Wekinator to recognise one of three gestures and switch the RGB LED to one of three states (red, green or off). A sketch of how the output side of the classification exercise might look on the Arduino follows this list.
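
For illustration, here is a minimal sketch of that output side: listening for Wekinator’s /wek/outputs messages and switching the RGB LED to one of the four colours. Again, this assumes the CNMAT OSC library and Wekinator’s default output port (12000), with Wekinator configured to send its output to the board’s IP; pin numbers, colour values and wifi details are placeholders rather than the exact workshop code.

```cpp
// Listen for Wekinator's /wek/outputs messages and switch the RGB LED
// based on the predicted class (the classification exercise).
// Assumes the CNMAT OSC library and WiFi101; pins and credentials are placeholders.
#include <WiFi101.h>
#include <WiFiUdp.h>
#include <OSCMessage.h>

const char ssid[] = "your-network";   // placeholder wifi credentials
const char pass[] = "your-password";

const int redPin = 2, greenPin = 3, bluePin = 4;  // example PWM pins for the RGB LED
WiFiUDP Udp;

void setColour(int r, int g, int b) {
  analogWrite(redPin, r);
  analogWrite(greenPin, g);
  analogWrite(bluePin, b);
}

void handleOutputs(OSCMessage &msg) {
  // Wekinator's classifier sends the class index as a single float, starting at 1.
  int wekClass = (int)msg.getFloat(0);
  switch (wekClass) {
    case 1:  setColour(0, 255, 255);   break;  // cyan
    case 2:  setColour(255, 0, 255);   break;  // magenta
    case 3:  setColour(255, 255, 0);   break;  // yellow
    default: setColour(255, 255, 255); break;  // white
  }
}

void setup() {
  pinMode(redPin, OUTPUT);
  pinMode(greenPin, OUTPUT);
  pinMode(bluePin, OUTPUT);

  WiFi.begin(ssid, pass);
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);  // wait for the wifi connection
  }
  // Wekinator's default output port; point Wekinator's OSC output at this board's IP.
  Udp.begin(12000);
}

void loop() {
  OSCMessage incoming;
  int size = Udp.parsePacket();
  if (size > 0) {
    while (size--) {
      incoming.fill(Udp.read());  // read the packet byte by byte into the message
    }
    if (!incoming.hasError()) {
      incoming.dispatch("/wek/outputs", handleOutputs);
    }
  }
}
```

The regression exercise would work along the same lines, except the handler reads three continuous values from the message and writes them to the LED pins directly.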

When we reflected on the workshop afterwards, we agreed we now had a proven concept. Participants were able to get the toolchain up and running and could play around with iteratively training and evaluating their models until they behaved as intended.

However, there is still quite a bit of room for improvement. On a practical note, quite a bit of time was taken up by building the circuit, which isn’t the point of the workshop. One way of dealing with this is to bring pre-built circuits to the workshop. Doing so would enable us to get to the machine learning more quickly and would open up time and space to also engage with participants about the point of it all.

We’re keen on bring­ing this work­shop to more set­tings in future. If we do, I’m sure we’ll find the oppor­tu­ni­ty to improve on things once more and I will report back here.

Many thanks to Iskan­der and the rest of the ThingsCon team for invit­ing us to the conference.

ThingsCon Amsterdam 2017, photo by nunocruzstreet.com

‘Hybrid Writing for Conversational Interfaces’ workshop

On May 24 of this year, Niels ’t Hooft and I ran a workshop titled ‘Hybrid Writing for Conversational Interfaces’ at TU Delft. Our aim was twofold: to teach students about writing characters and dialogue, and to teach them how to prototype chat interfaces.

We spent a day with roughly thirty industrial design students, alternating between bits of theory, writing exercises and instruction on how to use Twine (our prototyping tool of choice), and closed out the day with a small project and a show and tell.

I was very pleased to see pro­to­types with quite a high lev­el of com­plex­i­ty and sophis­ti­ca­tion at the end of the day. And through­out, I could tell stu­dents were enjoy­ing them­selves writ­ing and build­ing inter­ac­tive conversations.

Here’s a rough out­line of how the work­shop was structured.

  1. After briefly intro­duc­ing our­selves, Niels pre­sent­ed a mini-lec­ture on inter­ac­tive fic­tion. A high­light for me was a two-by-two of the ways in which fic­tion and soft­ware can intersect.

Four types of software-fiction hybrids

  2. I then took over and did a show and tell of the absolute basics of using Twine: things like creating passages, linking them, creating branches, and testing and publishing your story.
  3. The first exercise after this was for students to take what they had just learned about Twine and try to create a very simple interactive story.
  4. After a coffee break, Niels presented his second mini-lecture, on the very basics of writing, with a particular focus on writing characters and dialogue. This included a handy cheatsheet of things to consider while writing.

A cheatsheet for writing dialog

  5. In our second exercise, students worked in pairs. They each created a character, which they then described to each other. They then planned out the structure of an encounter between these two characters, and finally they collaboratively wrote the dialogue for this encounter. They were required to stick to Hollywood formatting. Niels and I then did a reading of a few (to the great amusement of all present) to close out the morning section of the workshop.
  6. After lunch, Niels presented his third and final mini-lecture of the day, on conversational interfaces, relying heavily on the great work of our friend Alper in his book on the subject.
  7. I then took over for the second show and tell. Here we ramped up the challenge and introduced the Twine Texting Project, a framework for prototyping conversational interfaces in Twine. On GitHub, you can find the starter file I had prepared for this section.
  8. The third and final exercise of the day was for students to take what they had learned about writing dialogue and prototyping chat interfaces, and to build an interactive prototype of a conversational interface or a piece of interactive fiction in chat format. They could either build off of the dialogue they had created in the previous exercise, or start from scratch.
  9. We finished the day with demos, where we put each Twine story on the big screen and as a group chose which options to select. After each demo the creator would open up the Twine file and walk us through how they had built it. It was pretty cool to see how many students had put what they had learned to very creative uses.

Reflecting on the workshop afterwards, we felt the structure was nicely balanced between theory and practice. The difficulty level was such that students learned some new things they could incorporate into future projects, while still building on skills they had already acquired. The choice of Twine worked out well too, since it is highly accessible: non-technical students managed to create something interactive, and more advanced students could apply what they knew about code to produce more sophisticated prototypes.

For future workshops, we did feel we could improve the bridge between writing for interactive fiction and writing for the conversational interfaces of software products and services. This would require some adaptation of the mini-lectures and a slightly different emphasis in the exercises. The key would be to have students imagine existing products and services as characters, and then to write dialogue for interactions with them and prototype those. For a future iteration of the workshop, this would be worth exploring further.

Many thanks to Ianus Keller for invit­ing us to teach this work­shop at IDE Acad­e­my.