Designing Learning Experiences in a Post-ChatGPT World

Transcript of a talk delivered at LXDCON’25 on June 12, 2025.

My name is Kars. I am a postdoc at TU Delft. I research contestable AI—how to use design to ensure AI systems remain subject to societal control. I teach the responsible design of AI systems. In a previous life, I was a practicing designer of digital products and services. I will talk about designing learning experiences in a post-ChatGPT world.

Let’s start at this date.

This is when OpenAI released an early demo of ChatGPT. The chatbot quickly went viral on social media. Users shared examples of what it could do. Stories and samples included everything from travel planning to writing fables to coding computer programs. Within five days, the chatbot had attracted over one million users.

Fast forward to today, 2 years, 6 months, and 14 days later, we’ve seen a massive impact across domains, including on education.

For example, the article on the left talks about how AI cheating has become pervasive in higher education. It is fundamentally undermining the educational process itself. Students are using ChatGPT for nearly every assignment while educators struggle with ineffective detection methods and question whether traditional academic work has lost all meaning.

The one on the right talks about how students are accusing professors of being hypocritical. Teachers are using AI tools for things like course materials and grading while telling students they cannot use them.

What we’re looking at is a situation where academic integrity was already in question; on top of that, both students and faculty are rapidly adopting AI, and institutions aren’t really ready for it.

These transformations in higher education give me pause. What should we change about how we design learning experiences given this new reality?

So, just to clarify, when I mention “AI” in this talk, I’m specifically referring to generative AI, or GenAI, and even more specifically, to chatbots that are powered by large language models, like ChatGPT.

Throughout this talk I will use this example of a learning experience that makes use of GenAI. Sharad Goel, Professor at Harvard Kennedy School, developed an AI Slackbot named “StatGPT” that aims to enhance student learning through interactive engagement.

It was tested in a statistics course with positive feedback from students. They described it as supportive and easily accessible, available anytime for student use. There are plans to implement StatGPT in various other courses. They say it assists in active problem-solving and consider it an example of how AI can facilitate learning, rather than replace it.

The debate around GenAI and learning has become polarized. I see the challenge as trying to find a balance. On one side, there’s complete skepticism about AI, and on the other, there’s this blind acceptance of it. What I propose is that we need an approach I call Conscious Adaptation: moving forward with full awareness of what’s being transformed.

To build the case for this approach, I will be looking at two common positions in the debates around AI and education. I’ll be focusing on four pieces of writing.

Two of them are by Ethan Mollick, from his blog. He’s a professor at the University of Pennsylvania specializing in innovation and entrepreneurship, known for his work on the potential of AI to transform different fields.

The other two pieces are by Ian Bogost, published at The Atlantic. He’s a media studies scholar, author, and game designer who teaches at Washington University. He’s known for his sobering, realist critiques of the impact of technology on society.

These, to me, exemplify two strands of the debate around AI in education.

Ethan Mollick’s position, in essence, is that AI in education is an inevitable transformation that educators must embrace and redesign around, not fight.

You could say Mollick is an optimist. But he is also really clear-eyed about how much disruption is going on. He even refers to it as the “Homework Apocalypse.” He talks about some serious issues: there are failures in detection, students are not learning as well (with exam performance dropping by about 17%), and there are a lot of misunderstandings about AI on both sides—students and faculty.

But his perspective is more about adapting to a tough situation. He’s always focused on solutions, constantly asking, “What can we do about this?” He believes that with thoughtful human efforts, we can really influence the outcomes positively.

On the other hand, Ian Bogost’s view is that AI has created an unsolvable crisis that’s fundamentally breaking traditional education and leaving teachers demoralized.

Bogost, I would describe as a realist. He accepts the inevitability of AI, noting that the “arms race will continue” and that technology will often outpace official policies. He also highlights the negative impact on faculty morale, the dependency of students, and the chaos in institutions.

He’s not suggesting that we should ban AI or go back to a time before it existed. He sees AI as something that might be the final blow to a profession that’s already struggling with deeper issues. At the same time, he emphasizes the need for human agency by calling out the lack of reflection and action from institutions.

So, they both observe the same reality, but they look at it differently. Mollick sees it as an engineering challenge—one that’s complicated but can be tackled with smart design. On the other hand, Bogost views it as a social issue that uncovers deeper problems that can’t just be fixed with technology.

Mollick thinks it’s possible to rebuild after a sort of collapse, while Bogost questions if the institutions that are supposed to do that rebuilding are really fit for the job.

Returning to Harvard’s StatGPT, Mollick would likely celebrate it as an example of co-intelligence. Bogost would likely ask what the bot’s rollout comes at the expense of, or what deeper problems its deployment unveils.

Getting past the conflict between these two views isn’t just about figuring out the best technical methods or the right order of solutions. The real challenge lies in our ability as institutions to make real changes, and we need to be careful that focusing on solutions doesn’t distract us from the important discussions we need to have.

I see three strategies that work together to create an approach that addresses the conflict between these two perspectives in a way that I believe will be more effective.

First, institutional realism is about designing interventions assuming institutions will resist change, capture innovations, or abandon initiatives. Given this, we could focus on individual teacher practices, learner-level tools, and changes that don’t require systemic transformation. We could treat every implementation as a diagnostic probe revealing actual (vs. stated) institutional capacity.

Second, loss-conscious innovation means explicitly identifying, before implementing AI-enhanced practices, which human learning processes, relationships, or skills are being replaced. We could develop metrics that track preservation alongside progress. We could build “conservation” components into new approaches to protect irreplaceable educational values.

Third, and finally, we should recognize that Mollick-style solution-building and Bogost-style critical analysis serve different but essential roles. Practitioners need actionable guidance, while the broader field needs diagnostic consciousness. We should avoid a false synthesis and instead maintain both approaches as distinct strands of intellectual work that inform each other.

In short, striking a balance may not be the main focus; it’s more about taking practical actions while considering the overall context. Progress is important, but it’s also worth reflecting on what gets left behind. Conscious adaptation.

So, applying these strategies to Harvard’s chatbot, we could ask: (1) How can we create a feedback loop between an intervention like this and the things it uncovers about institutional limits, so that those can be addressed in the appropriate place? (2) How can we measure what value this bot adds for students and for teachers? What is it replacing, what is it adding, what is it making room for? (3) What critique of learning at Harvard is implied by this intervention?

What does all of this mean, finally, for LXD? This is an LXD conference, so I don’t need to spend a lot of time explaining what it is. But let’s just use this basic definition as a starting point. It’s about experiences, it’s about centering the learner, it’s about achieving learning outcomes, etc.

Comparing my conscious adaptation approach to what typifies LXD, I can see a number of alignments.

Both LXD and Conscious Adaptation prioritize authentic human engagement over efficiency. LXD through human-centered design, conscious adaptation through protecting meaningful intellectual effort from AI displacement.

LXD’s focus on holistic learning journeys aligns with both Mollick’s “effort is the point” and Bogost’s concern that AI shortcuts undermine the educational value embedded in struggle and synthesis.

LXD’s experimental, prototype-driven approach mirrors my “diagnostic pragmatism”—both treat interventions as learning opportunities that reveal what actually works rather than pursuing idealized solutions.

So, going back one final time to Harvard’s bot, an LXD practice aligned in this way would lead us to ask: (1) Is this leveraging GenAI to protect and promote genuine intellectual effort? (2) Are teachers and learners meaningfully engaged in the ongoing development of this technology? (3) Is this prototype properly embedded, so that its potential to create learning for the organization can be realized?

So, where does this leave us as learning experience designers? I see three practical imperatives for Conscious Adaptation.

First, we need to protect meaningful human effort while leveraging AI’s strengths. Remember that “the effort is the point” in learning. Rather than asking “can AI do this?”, we should ask “should it?” Harvard’s bot works because it scaffolds thinking rather than replacing it. We should use AI for feedback and iteration while preserving human work for synthesis and struggle.

Second, we must design for real institutions, not ideal ones. Institutions resist change, capture innovations, and abandon initiatives. We need to design assuming limited budgets, overworked staff, and competing priorities. Every implementation becomes a diagnostic probe that reveals what resistance actually tells us about institutional capacity.

Third, we have to recognize the limits of design. AI exposes deeper structural problems like grade obsession, teacher burnout, and test-driven curricula. You can’t design your way out of systemic issues, and sometimes the best move is recognizing when the problem isn’t experiential at all.

This is Conscious Adaptation—moving forward with eyes wide open.

Thanks.

On mapping AI value chains

At CSCW 2024, back in November of last year, we* ran a workshop titled “From Stem to Stern: Contestability Along AI Value Chains.” With it, we wanted to address a gap in contestable AI research. Current work focuses mainly on contesting specific AI decisions or outputs (for example, appealing a decision made by an automated content moderation system). But we should also look at contestability across the entire AI value chain—from raw material extraction to deployment and impact (think, for example, of data center activists opposing the construction of new hyperscale facilities). We aimed to explore how different stakeholders can contest AI systems at various points in this chain, considering issues like labor conditions, environmental impact, and data collection practices often overlooked in contestability discussions.

The workshop mixed presentations with hands-on activities. In the morning, researchers shared their work through short talks, both in person and online. The afternoon focused on mapping out where and how people can contest AI systems, from data collection to deployment, followed by detailed discussions of the practical challenges involved. We had both in-person and online participants, requiring careful coordination between facilitators. We wrapped up by synthesizing key insights and outlining future research directions.

I served as a remote facilitator for most of the day, but Mireia and I also prepared and ran the first group activity, in which we mapped a typical AI value chain. I figured I might as well share the canvas we used for that here. It’s not rocket science, but it held up pretty well, so maybe some other people will get some use out of it. The canvas was designed to offer a fair bit of scaffolding for thinking through which decision points along the chain are potentially value-laden.

AI value chain mapping canvas (licensed CC-BY 4.0 Mireia Yurrita & Kars Alfrink, 2024). Download PDF.

Here’s how the activity worked: we spent about 50 minutes on a structured mapping exercise in which participants identified potential contestation points along an AI value chain, using ChatGPT as an example case. The activity used a Miro board with a preliminary map showing different stages of AI development (infrastructure setup, data management, AI development, etc.). Participants first brainstormed individually for 10 minutes, adding value-laden decisions and noting stakeholders, harms, benefits, and values at stake. They then collaborated to reorganize and discuss the map for 15 minutes. The activity concluded with participants using dot voting (3 votes each) to identify the most impactful contestation sites, which were then clustered and named to feed into the next group activity.

The activity design drew from two main influences: typical value chain mapping methodologies (e.g., Mapping Actors along Value Chains, 2017), which usually emphasize tracking actors, flows, and contextual factors, and Wardley mapping (Wardley, 2022), which is characterized by the idea of a structured progression along an x-axis with an additional dimension on the y-axis.

The canvas design aimed to make AI system development more tangible by breaking it into clear phases (from infrastructure through governance) while considering visibility and materiality through the y-axis. We ultimately chose to use a familiar system (ChatGPT). This, combined with the activity’s structured approach, helped participants identify concrete opportunities for intervention and contestation along the AI value chain, which we could build on during the rest of the workshop.

I got a lot out of this workshop. Some of the key takeaways that emerged out of the activities and discussions include:

  • There’s a disconnect between legal and technical communities, from basic terminology differences to varying conceptions of key concepts like explainability, highlighting the need for translation work between disciplines.
  • We need to move beyond individual grievance models to consider collective contestation and upstream interventions in the AI supply chain.
  • We also need to shift from reactive contestation to proactive design approaches that build in contestability from the start.
  • By virtue of being hybrid, we were lucky enough to have participants from across the globe. This helped drive home to me the importance of including Global South perspectives and considering contestability beyond Western legal frameworks. We desperately need a more inclusive and globally-minded approach to AI governance.

Many thanks to all the workshop co-organizers for having me as part of the team and to Agathe and Yulu, in particular, for leading the effort.


* The full workshop team consisted of Agathe Balayn, Yulu Pi, David Gray Widder, Mireia Yurrita, Sohini Upadhyay, Naveena Karusala, Henrietta Lyons, Cagatay Turkay, Christelle Tessono, Blair Attard-Frost, Ujwal Gadiraju, and myself.

Participatory AI and ML engineering

In the first half of this year, I’ve presented several versions of a brief talk on participatory AI. I figured I would post an amalgam of these to the blog for future reference. (Previously, on the blog, I posted a brief lit review on the same topic; this talk builds on that.)

So, to start, the main point of this talk is that many participatory approaches to AI don’t engage deeply with the specifics of the technology. One such specific is the translation work engineers do to make a problem “learnable” by a machine (Kang, 2023). From this perspective, the main question to ask becomes, how does translation happen in our specific projects? Should citizens be involved in this translation work? If so, how to achieve this?

Before we dig into the state of participatory AI, let’s begin by clarifying why we might want to enable participation in the first place. A common motivation is a lack of democratic control over AI systems. (This is particularly concerning when AI systems are used for government policy execution. These are the systems I mostly look at in my own research.) And so the response is to bring the people into the development process, and to let them co-decide matters.

In these cases, participation can be understood as an enabler of democratic agency, i.e., a way for subjects to legitimate the use of AI systems (cf. Peter, 2020 in Rubel et al., 2021). Peter distinguishes two pathways: a normative one and a democratic one. Participation can be seen as an example of the democratic pathway to legitimation. A crucial detail Peter mentions here, which is often overlooked in participatory AI literature, is that normative constraints must limit the democratic pathway to avoid arbitrariness.

So, what is the state of participatory AI research and practice? I will look at each in turn next.

As mentioned, I previously posted on the state of participatory AI research, so I won’t repeat that in full here. (For the record, I reviewed Birhane et al. (2022), Bratteteig & Verne (2018), Delgado et al. (2023), Ehsan & Riedl (2020), Feffer et al. (2023), Gerdes (2022), Groves et al. (2023), Robertson et al. (2023), Sloane et al. (2020), and Zytko et al. (2022).) Elements that jump out include:

  • Superficial and unrepresentative involvement.
  • Piecemeal approaches that have minimal impact on decision-making.
  • Participants with a consultative role rather than that of active decision-makers.
  • A lack of bridge-builders between stakeholder perspectives.
  • Participation washing and exploitative community involvement.
  • Struggles with the dynamic nature of technology over time.
  • Discrepancies between the time scales for users to evaluate design ideas versus the pace at which systems are developed.
  • A demand for participation to enhance community knowledge and to actually empower them.

Taking a step back, if I were to evaluate the state of the scientific literature on participatory AI, it strikes me that many of these issues are not new to AI. They have been present in participatory design more broadly for some time already. Many of these issues are also not necessarily specific to AI. The ones I would call out include the issues related to AI system dynamism, time scales of participation versus development, and knowledge gaps between various actors in participatory processes (and, relatedly, the lack of bridge-builders).

So, what about practice? Let’s look at two reports that I feel are a good representation of the broader field: Framework for Meaningful Stakeholder Involvement by ECNL & SocietyInside, and Democratizing AI: Principles for Meaningful Public Participation by Data & Society.

Framework for Meaningful Stakeholder Involvement is aimed at businesses, organizations, and institutions that use AI. It focuses on human rights, ethical assessment, and compliance. It aims to be a tool for planning, delivering, and evaluating stakeholder engagement effectively, emphasizing three core elements: Shared Purpose, Trustworthy Process, and Visible Impact.

Democratizing AI frames public participation in AI development as a way to add legitimacy and accountability and to help prevent harmful impacts. It outlines risks associated with AI, including biased outcomes, opaque decision-making processes, and designers lacking real-world impact awareness. Causes for ineffective participation include unidirectional communication, socioeconomic barriers, superficial engagement, and ineffective third-party involvement. The report uses environmental law as a reference point and offers eight guidelines for meaningful public participation in AI.

Taking stock of these reports, we can say that the building blocks for the overall process are available to those seriously looking. The challenges facing participatory AI are, on the one hand, economic and political. On the other hand, they are related to the specifics of the technology at hand. For the remainder of this piece, let’s dig into the latter a bit more.

Let’s focus on translation work done by engineers during model development.

For this, I build on work by Kang (2023), which focuses on the qualitative analysis of how phenomena are translated into ML-compatible forms, paying specific attention to the ontological translations that occur in making a problem learnable. Translation in ML means transforming complex qualitative phenomena into quantifiable and computable forms. Multifaceted problems are converted into a “usable quantitative reference” or “ground truth.” This translation is not a mere representation of reality but a reformulation of a problem into mathematical terms, making it understandable and processable by ML algorithms. This transformation involves a significant amount of “ontological dissonance,” as it mediates and often simplifies the complexity of real-world phenomena into a taxonomy or set of classes for ML prediction. The process of translating is based on assumptions and standards that may alter the nature of the ML task and introduce new social and technical problems.
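To make this notion of translation tangible, here is a small, purely illustrative sketch in Python: free-form resident reports about road quality (the multifaceted phenomenon) are collapsed into a three-class “ground truth” a model could be trained on. The classes, thresholds, and credibility assumptions are all invented; the point is that each mapping decision is exactly the kind of value-laden choice Kang describes.

```python
# Illustrative sketch (hypothetical classes and thresholds): translating a
# multifaceted phenomenon -- resident reports about road quality -- into a
# "usable quantitative reference" that a supervised model can learn from.

from dataclasses import dataclass

@dataclass
class ResidentReport:
    text: str             # free-form description of the problem
    severity_rating: int  # 1-5 self-reported severity
    photo_attached: bool

# The taxonomy itself is a translation decision: everything must fit one class.
CLASSES = ["no_action", "monitor", "repair_urgent"]

def to_ground_truth(report: ResidentReport) -> str:
    """Collapse a rich, qualitative report into a single training label.

    Every branch below encodes an assumption (e.g., that unphotographed
    complaints are less credible) that is value-laden and contestable.
    """
    if report.severity_rating >= 4 and report.photo_attached:
        return "repair_urgent"
    if report.severity_rating >= 3:
        return "monitor"
    return "no_action"

# Whatever nuance does not survive this mapping is invisible to the model.
label = to_ground_truth(ResidentReport("deep pothole near school", 5, True))
print(label)  # repair_urgent
```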

So what? I propose we can use the notion of translation as a frame for ML engineering. Understanding ML model engineering as translation is a potentially useful way to analyze what happens at each step of the process: What gets selected for translation, how the translation is performed, and what the resulting translation consists of.

So, if we seek to make participatory AI engage more with the technical particularities of ML, we could begin by identifying translations that have happened or might happen in our projects. We could then ask to what extent these acts of translation are value-laden. For those that are, we could think about how to communicate these translations to a lay audience. A particular challenge I expect we will be faced with is what the meaningful level of abstraction for citizen participation during AI development is. We should also ask what the appropriate ‘vehicle’ for citizen participation will be. And we should seek to move beyond small-scale, one-off, often unrepresentative forms of direct participation.

Bibliography

  • Birhane, A., Isaac, W., Prabhakaran, V., Diaz, M., Elish, M. C., Gabriel, I., & Mohamed, S. (2022). Power to the People? Opportunities and Challenges for Participatory AI. Equity and Access in Algorithms, Mechanisms, and Optimization, 1–8. https://doi.org/10/grnj99
  • Bratteteig, T., & Verne, G. (2018). Does AI make PD obsolete?: Exploring challenges from artificial intelligence to participatory design. Proceedings of the 15th Participatory Design Conference: Short Papers, Situated Actions, Workshops and Tutorial – Volume 2, 1–5. https://doi.org/10/ghsn84
  • Delgado, F., Yang, S., Madaio, M., & Yang, Q. (2023). The Participatory Turn in AI Design: Theoretical Foundations and the Current State of Practice. Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, 1–23. https://doi.org/10/gs8kvm
  • Ehsan, U., & Riedl, M. O. (2020). Human-Centered Explainable AI: Towards a Reflective Sociotechnical Approach. In C. Stephanidis, M. Kurosu, H. Degen, & L. Reinerman-Jones (Eds.), HCI International 2020—Late Breaking Papers: Multimodality and Intelligence (pp. 449–466). Springer International Publishing. https://doi.org/10/gskmgf
  • Feffer, M., Skirpan, M., Lipton, Z., & Heidari, H. (2023). From Preference Elicitation to Participatory ML: A Critical Survey & Guidelines for Future Research. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 38–48. https://doi.org/10/gs8kvx
  • Gerdes, A. (2022). A participatory data-centric approach to AI Ethics by Design. Applied Artificial Intelligence, 36(1), 2009222. https://doi.org/10/gs8kt4
  • Groves, L., Peppin, A., Strait, A., & Brennan, J. (2023). Going public: The role of public participation approaches in commercial AI labs. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 1162–1173. https://doi.org/10/gs8kvs
  • Kang, E. B. (2023). Ground truth tracings (GTT): On the epistemic limits of machine learning. Big Data & Society, 10(1), 1–12. https://doi.org/10/gtfgvx
  • Peter, F. (2020). The Grounds of Political Legitimacy. Journal of the American Philosophical Association, 6(3), 372–390. https://doi.org/10/grqfhn
  • Robertson, S., Nguyen, T., Hu, C., Albiston, C., Nikzad, A., & Salehi, N. (2023). Expressiveness, Cost, and Collectivism: How the Design of Preference Languages Shapes Participation in Algorithmic Decision-Making. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–16. https://doi.org/10/gr6q2t
  • Rubel, A., Castro, C., & Pham, A. K. (2021). Algorithms and autonomy: The ethics of automated decision systems. Cambridge University Press.
  • Sloane, M., Moss, E., Awomolo, O., & Forlano, L. (2020). Participation is not a Design Fix for Machine Learning. arXiv:2007.02423 [Cs]. http://arxiv.org/abs/2007.02423
  • Zytko, D., J. Wisniewski, P., Guha, S., P. S. Baumer, E., & Lee, M. K. (2022). Participatory Design of AI Systems: Opportunities and Challenges Across Diverse Users, Relationships, and Application Domains. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, 1–4. https://doi.org/10/gs8kv6

Democratizing AI Through Continuous Adaptability: The Role of DevOps

Below are the abstract and slides for my contribution to the TILTing Perspectives 2024 panel “The mutual shaping of democratic practices & AI,” moderated by Merel Noorman.

Slides

Abstract

Contestability

This presentation delves into democratizing artificial intelligence (AI) systems through contestability. Contestability refers to the ability of AI systems to remain open and responsive to disputes throughout their lifecycle. It approaches AI systems as arenas where groups compete for power over designs and outcomes.

Autonomy, democratic agency, legitimation

We identify contestability as a critical system quality for respecting people’s autonomy, including their democratic agency: their ability to legitimate policies, among them policies enacted by AI systems.

For a decision to be legitimate, it must be democratically willed or rely on “normative authority.” The democratic pathway should be constrained by normative bounds to avoid arbitrariness. The appeal to authority should meet the “access constraint,” which ensures citizens can form beliefs about policies with a sufficient degree of agency (Peter, 2020 in Rubel et al., 2021).

Contestability is the quality that ensures mechanisms are in place for subjects to exercise their democratic agency. In the case of an appeal to normative authority, contestability mechanisms are how subjects and their representatives gain access to the information that will enable them to evaluate its justifiability. In this way, contestability satisfies the access constraint. In the case of democratic will, contestability-by-design practices are how system development is democratized. The autonomy account of legitimation adds the normative constraints that should bind this democratic pathway.

Himmelreich (2022) similarly argues that only a “thick” conception of democracy will address some of the current shortcomings of AI development. This is a pathway that not only allows for participation but also includes deliberation over justifications.

The agonistic arena

Elsewhere, we have proposed the Agonistic Arena as a metaphor for thinking about the democratization of AI systems (Alfrink et al., 2024). Contestable AI embodies the generative metaphor of the Arena. This metaphor characterizes public AI as a space where interlocutors embrace conflict as productive. Seen through the lens of the Arena, public AI problems stem from a need for opportunities for adversarial interaction between stakeholders.

This metaphorical framing suggests prescriptions for making the norms and procedures that shape the following more contentious and open to dispute:

  1. AI system design decisions on a global level, and
  2. human-AI system output decisions on a local level (i.e., individual decision outcomes), establishing new dialogical feedback loops between stakeholders that ensure continuous monitoring.

The Arena metaphor encourages a design ethos of revisability and reversibility so that AI systems embody the agonistic ideal of contingency.

Post-deployment malleability, feedback-ladenness

Unlike physical systems, AI technologies exhibit a unique malleability post-deployment.

For example, LLM chatbots optimize their performance based on a variety of feedback sources, including interactions with users, as well as feedback collected through crowd-sourced data work.

Because of this open-endedness, democratic control and oversight in the operations phase of the system’s lifecycle become a particular concern.

This is a concern because while AI systems are dynamic and feedback-laden (Gilbert et al., 2023), many of the existing oversight and control measures are static, one-off exercises that struggle to track systems as they evolve over time.

DevOps

The field of DevOps is pivotal in this context. DevOps focuses on system instrumentation for enhanced monitoring and control for continuous improvement. Typically, metrics for DevOps and their machine learning-specific MLOps offshoot emphasize technical performance and business objectives.

However, there is scope to expand these to include matters of public concern. The matters-of-concern perspective shifts the focus to issues such as fairness or discrimination, viewing them as challenges that cannot be resolved through universal methods with absolute certainty. Rather, it highlights how standards are locally negotiated within specific institutional contexts, emphasizing that such standards are never guaranteed (Lampland & Star, 2009; Geiger et al., 2023).

MLOps Metrics

In the context of machine learning systems, technical metrics focus on model accuracy. For example, a financial services company might use the area under the receiver operating characteristic curve (AUC-ROC) to continuously monitor and maintain the performance of its fraud detection model in production.
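As a minimal sketch of what such monitoring could look like in code (using scikit-learn; the scores, labels, and alerting threshold below are invented stand-ins for one production time window):

```python
# Minimal sketch: computing AUC-ROC on one batch of production predictions.
# Data is invented; in practice, true labels arrive later (e.g., confirmed
# fraud cases) and the metric is computed per time window.

from sklearn.metrics import roc_auc_score

# Model scores for one window of transactions and their eventual true labels.
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.1, 0.45, 0.8, 0.2, 0.35, 0.05, 0.6, 0.9, 0.15, 0.2]

auc = roc_auc_score(y_true, y_score)
print(f"AUC-ROC for this window: {auc:.2f}")

# An operations team might alert when the metric dips below an agreed floor.
ALERT_THRESHOLD = 0.85  # hypothetical service-level target
if auc < ALERT_THRESHOLD:
    print("Performance below target; investigate possible drift or data issues.")
```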

Business metrics focus on cost-benefit analyses. For example, a bank might use a cost-benefit matrix to balance the potential revenue from approving a loan against the risk of default, ensuring that the overall profitability of their loan portfolio is optimized.
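A bare-bones illustration of such a cost-benefit matrix (all monetary values and the default probability are invented; a real matrix would come from the bank’s own portfolio data):

```python
# Illustrative sketch of a cost-benefit matrix for loan decisions.
# All numbers are invented for the sake of the example.

import numpy as np

# Rows: decision (0 = approve, 1 = reject). Columns: outcome (repays, defaults).
# Entries are expected profit or loss per loan.
cost_benefit = np.array([
    [500.0, -5000.0],  # approve: interest earned vs. loss on default
    [0.0,       0.0],  # reject: no revenue, no loss
])

def expected_value(decision: int, p_default: float) -> float:
    """Expected profit of a decision given the model's default probability."""
    outcome_probs = np.array([1 - p_default, p_default])
    return float(cost_benefit[decision] @ outcome_probs)

p_default = 0.07  # model output for one applicant
best = max((0, 1), key=lambda d: expected_value(d, p_default))
print("approve" if best == 0 else "reject", expected_value(best, p_default))
```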

Drift

These metrics can be monitored over time to detect “drift” between a model and the world. Training sets are static. Reality is dynamic. It changes over time. Drift occurs when the nature of new input data diverges from the data a model was trained on. A change in performance metrics may be used to alert system operators, who can then investigate and decide on a course of action, e.g., retraining a model on updated data. This, in effect, creates a feedback loop between the system in use and its ongoing development.
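One common way to operationalize drift detection is to compare the distribution of an input feature at training time with its recent production distribution. A minimal sketch, assuming transaction amounts as the feature and using a two-sample Kolmogorov–Smirnov test; the data and alerting threshold are invented:

```python
# Minimal drift-detection sketch: compare an input feature's distribution at
# training time with its recent production distribution.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_amounts = rng.normal(loc=80, scale=20, size=5_000)    # amounts seen at training time
production_amounts = rng.normal(loc=95, scale=25, size=2_000)  # recent production traffic

result = ks_2samp(training_amounts, production_amounts)

DRIFT_P_VALUE = 0.01  # hypothetical alerting threshold
if result.pvalue < DRIFT_P_VALUE:
    print(f"Input drift detected (KS statistic {result.statistic:.3f}); "
          "investigate and consider retraining.")
else:
    print("No significant drift detected in this feature.")
```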

An expansion of these practices in the interest of contestability would require:

  1. setting different metrics,
  2. exposing these metrics to additional audiences, and
  3. establishing feedback loops with the processes that govern models and the systems they are embedded in.

Example 1: Camera Cars

Let’s say a city government uses a camera-equipped vehicle and a computer vision model to detect potholes in public roads. In addition to accuracy and a favorable cost-benefit ratio, citizens, and road users in particular, may care about the time between a pothole’s detection and its repair. Or they may care about the distribution of potholes across the city. Furthermore, when road maintenance appears to be degrading, this should be taken up with department leadership, the responsible alderperson, and council members.
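To make this concrete, here is a small sketch of what two such citizen-facing metrics could look like when computed from maintenance records; the records and district names are invented:

```python
# Sketch of two citizen-facing metrics for the pothole example: time from
# detection to repair, and distribution of detections across districts.
# All records are invented for illustration.

from datetime import date
from statistics import median
from collections import Counter

potholes = [
    {"district": "Noord", "detected": date(2024, 3, 1), "fixed": date(2024, 3, 9)},
    {"district": "Zuid",  "detected": date(2024, 3, 2), "fixed": date(2024, 3, 20)},
    {"district": "Noord", "detected": date(2024, 3, 5), "fixed": date(2024, 3, 12)},
    {"district": "West",  "detected": date(2024, 3, 6), "fixed": None},  # still open
]

repair_days = [(p["fixed"] - p["detected"]).days for p in potholes if p["fixed"]]
print("Median days to repair:", median(repair_days))
print("Detections per district:", Counter(p["district"] for p in potholes))
```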

Example 2: EV Charging

Or, let’s say the same city government uses an algorithmic system to optimize public electric vehicle (EV) charging stations for green energy use by adapting charging speeds to expected sun and wind. EV drivers may want to know how much energy has been shifted to greener time windows and how that share is trending. Without such visibility into a system’s actual goal achievement, citizens’ ability to legitimate its use suffers. As I have already mentioned, democratic agency, when enacted via the appeal to authority, depends on access to “normative facts” that underpin policies. And finally, professed system functionality must be demonstrated as well (Raji et al., 2022).
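A toy calculation of the kind of goal-achievement figure drivers might be shown (all numbers invented):

```python
# Toy calculation: share of charging energy delivered during "green" windows,
# i.e., when forecast solar and wind supply was high. All numbers are invented.

charging_sessions = [
    {"kwh": 12.0, "green_window": True},
    {"kwh": 30.0, "green_window": False},
    {"kwh": 18.0, "green_window": True},
    {"kwh": 22.0, "green_window": True},
]

green_kwh = sum(s["kwh"] for s in charging_sessions if s["green_window"])
total_kwh = sum(s["kwh"] for s in charging_sessions)

print(f"Share of energy charged in green windows: {green_kwh / total_kwh:.0%}")
```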

DevOps as sociotechnical leverage point for democratizing AI

These brief examples show that the DevOps approach is a potential sociotechnical leverage point. It offers pathways for democratizing AI system design, development, and operations.

DevOps can be adapted to further contestability. It creates new channels between human and machine actors. One of DevOps’s essential activities is monitoring (Smith, 2020), which presupposes fallibility, a necessary precondition for contestability. Finally, it requires and provides infrastructure for technical flexibility so that recovery from error is low-cost and continuous improvement becomes practically feasible.

The mutual shaping of democratic practices & AI

Zooming out further, let’s reflect on this panel’s overall theme, picking out three elements: legitimation, representation of marginalized groups, and dealing with conflict and contestation after implementation and during use.

Contestability is a lever for demanding justifications from operators, which is a necessary input for legitimation by subjects (Henin & Le Métayer, 2022). Contestability frames different actors’ stances as adversarial positions on a political field rather than “equally valid” perspectives (Scott, 2023). And finally, relations, monitoring, and revisability are all ways to give voice to and enable responsiveness to contestations (Genus & Stirling, 2018).

And again, all of these things can be furthered in the post-deployment phase by adapting the DevOps lens.

Bibliography

  • Alfrink, K., Keller, I., Kortuem, G., & Doorn, N. (2022). Contestable AI by Design: Towards a Framework. Minds and Machines, 33(4), 613–639. https://doi.org/10/gqnjcs
  • Alfrink, K., Keller, I., Yurrita Semperena, M., Bulygin, D., Kortuem, G., & Doorn, N. (2024). Envisioning Contestability Loops: Evaluating the Agonistic Arena as a Generative Metaphor for Public AI. She Ji: The Journal of Design, Economics, and Innovation, 10(1), 53–93. https://doi.org/10/gtzwft
  • Geiger, R. S., Tandon, U., Gakhokidze, A., Song, L., & Irani, L. (2023). Making Algorithms Public: Reimagining Auditing From Matters of Fact to Matters of Concern. International Journal of Communication, 18(0), Article 0.
  • Genus, A., & Stirling, A. (2018). Collingridge and the dilemma of control: Towards responsible and accountable innovation. Research Policy, 47(1), 61–69. https://doi.org/10/gcs7sn
  • Gilbert, T. K., Lambert, N., Dean, S., Zick, T., Snoswell, A., & Mehta, S. (2023). Reward Reports for Reinforcement Learning. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 84–130. https://doi.org/10/gs9cnh
  • Henin, C., & Le Métayer, D. (2022). Beyond explainability: Justifiability and contestability of algorithmic decision systems. AI & SOCIETY, 37(4), 1397–1410. https://doi.org/10/gmg8pf
  • Himmelreich, J. (2022). Against “Democratizing AI.” AI & SOCIETY. https://doi.org/10/gr95d5
  • Lampland, M., & Star, S. L. (Eds.). (2008). Standards and Their Stories: How Quantifying, Classifying, and Formalizing Practices Shape Everyday Life (1st edition). Cornell University Press.
  • Peter, F. (2020). The Grounds of Political Legitimacy. Journal of the American Philosophical Association, 6(3), 372–390. https://doi.org/10/grqfhn
  • Raji, I. D., Kumar, I. E., Horowitz, A., & Selbst, A. (2022). The Fallacy of AI Functionality. 2022 ACM Conference on Fairness, Accountability, and Transparency, 959–972. https://doi.org/10/gqfvf5
  • Rubel, A., Castro, C., & Pham, A. K. (2021). Algorithms and autonomy: The ethics of automated decision systems. Cambridge University Press.
  • Scott, D. (2023). Diversifying the Deliberative Turn: Toward an Agonistic RRI. Science, Technology, & Human Values, 48(2), 295–318. https://doi.org/10/gpk2pr
  • Smith, J. D. (2020). Operations anti-patterns, DevOps solutions. Manning Publications.
  • Treveil, M. (2020). Introducing MLOps: How to scale machine learning in the enterprise (First edition). O’Reilly.

AI pedagogy through a design lens

At a TU Delft spring symposium on AI education, Hosana and I ran a short workshop titled “AI pedagogy through a design lens.” In it, we identified some of the challenges facing AI teaching, particularly outside of computer science, and explored how design pedagogy, particularly the practices of studios and making, may help to address them. The AI & Society master elective I’ve been developing and teaching over the past five years served as a case study. The session was punctuated by brief brainstorming using an adapted version of the SQUID gamestorming technique. Below are the slides we used.

Participatory AI literature review

I’ve been thinking a lot about civic participation in machine learning systems development. In particular, I’m interested in involving non-experts in the potentially value-laden translation work engineers do when they turn specifications into models. Below is a summary of a selection of literature I found on the topic, which may serve as a jumping-off point for future research.

Abstract

The literature on participatory artificial intelligence (AI) reveals a complex landscape marked by challenges and evolving methodologies. Feffer et al. (2023) critique the reduction of participation to computational mechanisms that only approximate narrow moral values. They also note that engagements with stakeholders are often superficial and unrepresentative. Groves et al. (2023) identify significant barriers in commercial AI labs, including high costs, fragmented approaches, exploitation concerns, lack of transparency, and contextual complexities. These barriers lead to a piecemeal approach to participation with minimal impact on decision-making in AI labs. Delgado et al. (2023) observe that participatory AI involves stakeholders mostly in a consultative role without integrating them as active decision-makers throughout the AI design lifecycle.

Gerdes (2022) proposes a data-centric approach to AI ethics and underscores the need for interdisciplinary bridge builders to reconcile different stakeholder perspectives. Robertson et al. (2023) explore participatory algorithm design, emphasizing the need for preference languages that balance expressiveness, cost, and collectivism. Sloane et al. (2020) caution against “participation washing” and the potential for exploitative community involvement. Bratteteig & Verne (2018) highlight AI’s challenges to traditional participatory design (PD) methods, including unpredictable technological changes and a lack of user-oriented evaluation. Birhane et al. (2022) call for a clearer understanding of meaningful participation, advocating for a shift towards vibrant, continuous engagement that enhances community knowledge and empowerment. The literature suggests a pressing need for more effective, inclusive, and empowering participatory approaches in AI development.

Bibliography

  1. Birhane, A., Isaac, W., Prabhakaran, V., Diaz, M., Elish, M. C., Gabriel, I., & Mohamed, S. (2022). Power to the People? Opportunities and Challenges for Participatory AI. Equity and Access in Algorithms, Mechanisms, and Optimization, 1–8. https://doi.org/10/grnj99
  2. Bratteteig, T., & Verne, G. (2018). Does AI make PD obsolete?: Exploring challenges from artificial intelligence to participatory design. Proceedings of the 15th Participatory Design Conference: Short Papers, Situated Actions, Workshops and Tutorial – Volume 2, 1–5. https://doi.org/10/ghsn84
  3. Delgado, F., Yang, S., Madaio, M., & Yang, Q. (2023). The Participatory Turn in AI Design: Theoretical Foundations and the Current State of Practice. Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, 1–23. https://doi.org/10/gs8kvm
  4. Ehsan, U., & Riedl, M. O. (2020). Human-Centered Explainable AI: Towards a Reflective Sociotechnical Approach. In C. Stephanidis, M. Kurosu, H. Degen, & L. Reinerman-Jones (Eds.), HCI International 2020—Late Breaking Papers: Multimodality and Intelligence (pp. 449–466). Springer International Publishing. https://doi.org/10/gskmgf
  5. Feffer, M., Skirpan, M., Lipton, Z., & Heidari, H. (2023). From Preference Elicitation to Participatory ML: A Critical Survey & Guidelines for Future Research. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 38–48. https://doi.org/10/gs8kvx
  6. Gerdes, A. (2022). A participatory data-centric approach to AI Ethics by Design. Applied Artificial Intelligence, 36(1), 2009222. https://doi.org/10/gs8kt4
  7. Groves, L., Peppin, A., Strait, A., & Brennan, J. (2023). Going public: The role of public participation approaches in commercial AI labs. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 1162–1173. https://doi.org/10/gs8kvs
  8. Robertson, S., Nguyen, T., Hu, C., Albiston, C., Nikzad, A., & Salehi, N. (2023). Expressiveness, Cost, and Collectivism: How the Design of Preference Languages Shapes Participation in Algorithmic Decision-Making. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–16. https://doi.org/10/gr6q2t
  9. Sloane, M., Moss, E., Awomolo, O., & Forlano, L. (2020). Participation is not a Design Fix for Machine Learning. arXiv:2007.02423 [Cs]. http://arxiv.org/abs/2007.02423
  10. Zytko, D., J. Wisniewski, P., Guha, S., P. S. Baumer, E., & Lee, M. K. (2022). Participatory Design of AI Systems: Opportunities and Challenges Across Diverse Users, Relationships, and Application Domains. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, 1–4. https://doi.org/10/gs8kv6

PhD update – January 2022

It has been three years since I last wrote an update on my PhD. I guess another post is in order.

My PhD plan was formally green-lit in October 2019. I am now over three years into this thing. There are roughly two more years left on the clock. I update my plans on a rolling basis. By my latest estimation, I should be ready to request a date for my defense in May 2023.

Of course, the pandemic forced me to adjust course. I am lucky enough not to be locked into particular methods or cases that are fundamentally incompatible with our current predicament. But still, I had to change up my methods, and reconsider the sequencing of my planned studies.

The conference paper I mentioned in the previous update, using the MX3D bridge to explore smart cities’ logic of control and cityness, was rejected by DIS. I performed a rewrite, but then came to the conclusion it was kind of a false start. These kinds of things are all in the game, of course.

The second paper I wrote uses the Transparent Charging Station to investigate how notions of transparent AI differ between experts and citizens. It was finally accepted late last year and should see publication in AI & Society soon. It is titled Tensions in Transparent Urban AI: Designing A Smart Electric Vehicle Charge Point. This piece went through multiple major revisions and was previously rejected by DIS and CHI.

A third paper, Contestable AI by Design: Towards A Framework, which uses a systematic literature review of AI contestability to construct a preliminary design framework, is currently under review at a major philosophy of technology journal. Fingers crossed.

And currently, I am working on my fourth publication, tentatively titled Contestable Camera Cars: A Speculative Design Exploration of Public AI Systems Responsive to Value Change. It will be based on empirical work that uses speculative design to develop guidelines and examples for the aforementioned design framework, and to investigate civil servants’ views on the pathways towards contestable AI systems in public administration.

Once that one is done, I intend to do one more study, probably looking into monitoring and traceability as potential leverage points for contestability, after which I will turn my attention to completing my thesis.

Aside from my research, in 2021 I was allowed to develop and teach a master elective centered on my PhD topic, titled AI & Society. In it, students are equipped with technical knowledge of AI and tools for thinking about AI ethics. They apply these to a design studio project focused on conceptualizing a responsible AI-enabled service that addresses a social issue the city of Amsterdam might conceivably struggle with. Students also write a brief paper reflecting on and critiquing their group design work. You can see me on Vimeo do a brief video introduction for students who are considering the course. I will be running the course again this year, starting at the end of February.

I also mentored a number of brilliant master graduation students: Xueyao Wang (with Jacky Bourgeois as chair), Jooyoung Park and Loes Sloetjes (both with Roy Bendor as chair), and currently Fabian Geiser (with Euiyoung Kim as chair). Working with students is one of the best parts of being in academia.

All of the above would not have been possible without the great support from my supervisory team: Ianus Keller, Neelke Doorn and Gerd Kortuem. I should also give special mention to Thijs Turel at AMS Institute’s Responsible Sensing Lab, where most of my empirical work is situated.

If you want to dig a little deeper into some of this, I recently set up a website for my PhD project over at contestable.ai.

Design and machine learning – an annotated reading list

Earlier this year I coached Design for Interaction master students at Delft University of Technology in the course Research Methodology. The students organised three seminars for which I provided the claims and assigned reading. In the seminars they argued about my claims using the Toulmin Model of Argumentation. The readings served as sources for backing and evidence.

The claims and readings were all related to my nascent research project about machine learning. We delved into both designing for machine learning, and using machine learning as a design tool.

Below are the readings I assigned, with some notes on each, which should help you decide if you want to dive into them yourself.

Hebron, Patrick. 2016. Machine Learning for Designers. Sebastopol: O’Reilly.

The only non-academic piece in this list. This served the purpose of getting all students on the same page with regard to what machine learning is, its applications in interaction design, and common challenges encountered. I still can’t think of any other single resource that is as good a starting point for the subject as this one.

Fiebrink, Rebecca. 2016. “Machine Learning as Meta-Instrument: Human-Machine Partnerships Shaping Expressive Instrumental Creation.” In Musical Instruments in the 21st Century, 14:137–51. Singapore: Springer Singapore. doi:10.1007/978-981-10-2951-6_10.

Fiebrink’s Wekinator is groundbreaking, fun and inspiring so I had to include some of her writing in this list. This is mostly of interest for those looking into the use of machine learning for design and other creative and artistic endeavours. An important idea explored here is that tools that make use of (interactive, supervised) machine learning can be thought of as instruments. Using such a tool is like playing or performing, exploring a possibility space, engaging in a dialogue with the tool. For a tool to feel like an instrument requires a tight action-feedback loop.

Dove, Graham, Kim Halskov, Jodi Forlizzi, and John Zimmerman. 2017. UX Design Innovation: Challenges for Working with Machine Learning as a Design Material. The 2017 CHI Conference. New York, New York, USA: ACM. doi:10.1145/3025453.3025739.

A really good survey of how designers currently deal with machine learning. Key takeaways include that in most cases, the application of machine learning is still engineering-led as opposed to design-led, which hampers the creation of non-obvious machine learning applications. It also makes it hard for designers to consider ethical implications of design choices. A key reason for this is that at the moment, prototyping with machine learning is prohibitively cumbersome.

Fiebrink, Rebecca, Perry R Cook, and Dan Trueman. 2011. “Human Model Evaluation in Interactive Supervised Learning.” In, 147. New York, New York, USA: ACM Press. doi:10.1145/1978942.1978965.

The second Fiebrink piece in this list, which is more of a deep dive into how people use Wekinator. As with the chapter listed above this is required reading for those working on design tools which make use of interactive machine learning. An important finding here is that users of intelligent design tools might have very different criteria for evaluating the ‘correctness’ of a trained model than engineers do. Such criteria are likely subjective and evaluation requires first-hand use of the model in real time.

Bostrom, Nick, and Eliezer Yudkowsky. 2014. “The Ethics of Artificial Intelligence.” In The Cambridge Handbook of Artificial Intelligence, edited by Keith Frankish and William M Ramsey, 316–34. Cambridge: Cambridge University Press. doi:10.1017/CBO9781139046855.020.

Bostrom is known for his somewhat crazy but thought-provoking book on superintelligence, and although a large part of this chapter is about the ethics of general artificial intelligence (which at the very least is still some way off), the first section discusses the ethics of current “narrow” artificial intelligence. It makes for a good checklist of things designers should keep in mind when they create new applications of machine learning. Key insight: when a machine learning system takes on work with social dimensions—tasks previously performed by humans—the system inherits its social requirements.

Yang, Qian, John Zimmerman, Aaron Steinfeld, and Anthony Tomasic. 2016. Planning Adaptive Mobile Experiences When Wireframing. The 2016 ACM Conference. New York, New York, USA: ACM. doi:10.1145/2901790.2901858.

Finally, a feet-in-the-mud exploration of what it actually means to design for machine learning with the tools most commonly used by designers today: drawings and diagrams of various sorts. In this case the focus is on using machine learning to make an interface adaptive. It includes an interesting discussion of how to balance the use of implicit and explicit user inputs for adaptation, and how to deal with inference errors. Once again, the limitations of current sketching and prototyping tools are mentioned, and related to the need for designers to develop tacit knowledge about machine learning. Such tacit knowledge will only be gained when designers can work with machine learning in a hands-on manner.

Supplemental material

Floyd, Christiane. 1984. “A Systematic Look at Prototyping.” In Approaches to Prototyping, 1–18. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-69796-8_1.

I provided this to students so that they get some additional grounding in the various kinds of prototyping that are out there. It helps to prevent reductive notions of prototyping, and it makes for a nice complement to Buxton’s work on sketching.

Blevis, E, Y Lim, and E Stolterman. 2006. “Regarding Software as a Material of Design.”

Some of the papers refer to machine learning as a “design material” and this paper helps to understand what that idea means. Software is a material without qualities (it is extremely malleable, it can simulate nearly anything). Yet, it helps to consider it as a physical material in the metaphorical sense because we can then apply ways of design thinking and doing to software programming.

Design × AI coffee meetup

If you work in the field of design or artificial intelligence and are interested in exploring the opportunities at their intersection, consider yourself invited to an informal coffee meetup on February 15, 10am at Brix in Amsterdam.

Erik van der Pluijm and I have been carrying on a conversation about AI and design for a while now, and we felt it was time to expand the circle a bit. We are very curious who else out there shares our excitement.

Questions we are mulling over include: How does the design process change when creating intelligent products? And: How can teams collaborate with intelligent design tools to solve problems in new and interesting ways?

Anyway, lots to chew on.

No need to sign up or anything, just show up and we’ll see what happens.

High-skill robots, low-skill workers

Some notes on what I think I understand about technology and inequality.

Let’s start with an obvious big question: is technology destroying jobs faster than they can be replaced? In the long term, the evidence isn’t strong. Humans always appear to invent new things to do. There is no reason this time around should be any different.

But in the short term, technology has contributed to an evaporation of mid-skilled jobs. Parts of these jobs are automated entirely; parts can be done by fewer people because of the higher productivity gained from tech.

While productivity continues to grow, jobs are lagging behind. The year 2000 appears to have been a turning point. “Something” happened around that time, but no one knows exactly what.

My hunch is that we’ve seen the emergence of a new class of pseudo-monopolies, or oligopolies, and that this is compounded by a “winner takes all” dynamic that technology seems to produce.

Others have pointed to globalisation, but although this might be a contributing factor, the evidence does not support the idea that it is the major cause.

So what are we left with?

Historically, looking at previous technological upsets, it appears education makes a big difference. People negatively affected by technological progress should have access to good education so that they have options. In the US, access to high-quality education is not equally divided.

Apparently, family income is associated with educational achievement. So if your family is rich, you are more likely to become a high-skilled individual. And high-skilled individuals are privileged by the tech economy.

And if Piketty is right, we are approaching a reality in which money made from wealth rises faster than wages. So there is a feedback loop in place that only exacerbates the situation.

One more bullet: if you think trickle-down economics, that is, increasing the size of the pie, will help, you might be mistaken. It appears social mobility is helped more by decreasing inequality in the distribution of income growth.

So some preliminary conclusions: a progressive tax on wealth won’t solve the issue. The education system will require reform, too.

I think this is the central irony of the whole situation: we are working hard to teach machines how to learn. But we are neglecting to improve how people learn.