Democratizing AI Through Continuous Adaptability: The Role of DevOps

Below are the abstract and slides for my contribution to the TILTing Perspectives 2024 panel "The mutual shaping of democratic practices & AI," moderated by Merel Noorman.

Slides

Abstract

Contestability

This presentation delves into democratizing artificial intelligence (AI) systems through contestability. Contestability refers to the ability of AI systems to remain open and responsive to disputes throughout their lifecycle. This approach treats AI systems as arenas where groups compete for power over designs and outcomes.

Autonomy, democratic agency, legitimation

We identify contestability as a critical system quality for respecting people's autonomy, including their democratic agency: their ability to legitimate policies, among them policies enacted by AI systems.

For a decision to be legitimate, it must be democratically willed or rely on "normative authority." The democratic pathway should be constrained by normative bounds to avoid arbitrariness. The appeal to authority should meet the "access constraint," which ensures citizens can form beliefs about policies with a sufficient degree of agency (Peter, 2020, in Rubel et al., 2021).

Contestability is the quality that ensures mechanisms are in place for subjects to exercise their democratic agency. In the case of an appeal to normative authority, contestability mechanisms are how subjects and their representatives gain access to the information that will enable them to evaluate its justifiability. In this way, contestability satisfies the access constraint. In the case of democratic will, contestability-by-design practices are how system development is democratized. The autonomy account of legitimation adds the normative constraints that should bind this democratic pathway.

Himmelreich (2022) similarly argues that only a "thick" conception of democracy will address some of the current shortcomings of AI development: a pathway that not only allows for participation but also includes deliberation over justifications.

The agonistic arena

Elsewhere, we have proposed the Agonistic Arena as a metaphor for thinking about the democratization of AI systems (Alfrink et al., 2024). Contestable AI embodies the generative metaphor of the Arena. This metaphor characterizes public AI as a space where interlocutors embrace conflict as productive. Seen through the lens of the Arena, public AI problems stem from a lack of opportunities for adversarial interaction between stakeholders.

This metaphorical framing suggests prescriptions for making the norms and procedures that shape the following more contentious and open to dispute:

  1. AI system design decisions on a global level, and
  2. human-AI system output decisions on a local level (i.e., individual decision outcomes), establishing new dialogical feedback loops between stakeholders that ensure continuous monitoring.

The Arena metaphor encourages a design ethos of revisability and reversibility so that AI systems embody the agonistic ideal of contingency.

Post-deployment malleability, feedback-ladenness

Unlike physical systems, AI technologies exhibit a unique malleability post-deployment.

For example, LLM chatbots optimize their performance based on a variety of feedback sources, including interactions with users, as well as feedback collected through crowd-sourced data work.

Because of this open-endedness, democratic control and oversight in the operations phase of the system's lifecycle become a particular concern.

This is a concern because while AI systems are dynamic and feedback-laden (Gilbert et al., 2023), many of the existing oversight and control measures are static, one-off exercises that struggle to track systems as they evolve over time.

DevOps

The field of DevOps is pivotal in this context. DevOps focuses on instrumenting systems for enhanced monitoring and control in service of continuous improvement. Typically, metrics in DevOps and its machine-learning-specific offshoot, MLOps, emphasize technical performance and business objectives.

However, there is scope to expand these to include matters of public concern. The matters-of-concern perspective shifts the focus to issues such as fairness or discrimination, viewing them as challenges that cannot be resolved through universal methods with absolute certainty. Rather, it highlights how standards are locally negotiated within specific institutional contexts, emphasizing that such standards are never guaranteed (Lampland & Star, 2009; Geiger et al., 2023).

MLOps Metrics

In the context of machine learning systems, technical metrics focus on model accuracy. For example, a financial services company might use the area under the receiver operating characteristic curve (AUC-ROC) to continuously monitor and maintain the performance of its fraud detection model in production.
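To make this concrete, here is a minimal Python sketch of such a monitoring check, assuming ground-truth fraud labels eventually arrive for a recent window of scored transactions; the threshold and function name are illustrative, not taken from any particular system.

```python
# A minimal sketch of production AUC-ROC monitoring. Assumes labeled
# outcomes arrive for a recent window of scored transactions; the
# alert threshold is an illustrative target, not a standard.
import logging

from sklearn.metrics import roc_auc_score

AUC_ALERT_THRESHOLD = 0.85  # hypothetical service-level target


def check_model_health(y_true, y_scores):
    """Compute AUC-ROC over a recent window; warn operators on degradation."""
    auc = roc_auc_score(y_true, y_scores)
    if auc < AUC_ALERT_THRESHOLD:
        logging.warning("Fraud model AUC-ROC degraded to %.3f", auc)
    return auc
```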

Business metrics focus on cost-benefit analyses. For example, a bank might use a cost-benefit matrix to balance the potential revenue from approving a loan against the risk of default, ensuring that the overall profitability of its loan portfolio is optimized.
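The underlying arithmetic is simple expected value. A minimal sketch follows, with all payoffs and the default probability invented for illustration:

```python
# A minimal sketch of a cost-benefit matrix for loan decisions.
# All payoffs and the default probability are illustrative assumptions.
PAYOFF = {
    ("approve", "repaid"): 1_000,     # interest earned
    ("approve", "default"): -10_000,  # principal lost
    ("deny", "repaid"): 0,            # business foregone
    ("deny", "default"): 0,
}


def expected_value(action, p_default):
    """Expected payoff of an action given the model's default probability."""
    return (PAYOFF[(action, "default")] * p_default
            + PAYOFF[(action, "repaid")] * (1 - p_default))


p = 0.08  # model-estimated probability of default
decision = "approve" if expected_value("approve", p) > expected_value("deny", p) else "deny"
print(decision, expected_value("approve", p))  # -> approve 120.0
```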

Drift

These metrics can be monitored over time to detect "drift" between a model and the world. Training sets are static; reality is dynamic and changes over time. Drift occurs when the nature of new input data diverges from the data a model was trained on. A change in performance metrics may be used to alert system operators, who can then investigate and decide on a course of action, e.g., retraining a model on updated data. This, in effect, creates a feedback loop between the system in use and its ongoing development.
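There are many ways to operationalize drift detection; nothing here prescribes one. As one common illustration, a two-sample statistical test can compare a feature's training distribution against recent production data:

```python
# A minimal sketch of input-drift detection using a two-sample
# Kolmogorov-Smirnov test. The feature choice, window, and
# significance level are illustrative assumptions.
import logging

from scipy.stats import ks_2samp


def detect_drift(train_values, recent_values, alpha=0.01):
    """Flag a feature whose recent production distribution diverges from
    the training distribution; operators can then investigate and, e.g.,
    retrain, closing the feedback loop described above."""
    statistic, p_value = ks_2samp(train_values, recent_values)
    if p_value < alpha:
        logging.warning("Drift detected (KS=%.3f, p=%.4f)", statistic, p_value)
        return True
    return False
```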

An expansion of these practices in the interest of contestability would require:

  1. setting different metrics,
  2. exposing these metrics to additional audiences (see the sketch after this list), and
  3. establishing feedback loops with the processes that govern models and the systems they are embedded in.
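As a sketch of what items 1 and 2 might look like in practice, the snippet below publishes a hypothetical matter-of-concern metric through a Prometheus endpoint that a public dashboard could scrape; the metric name, value, and port are invented for illustration:

```python
# A minimal sketch of exposing a matter-of-concern metric to audiences
# beyond the operations team. Metric name and value are illustrative.
import time

from prometheus_client import Gauge, start_http_server

selection_rate_gap = Gauge(
    "selection_rate_gap",
    "Difference in positive-decision rates between demographic groups",
)

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        selection_rate_gap.set(0.04)  # a real system would recompute this
        time.sleep(60)
```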

Example 1: Camera Cars

Let's say a city government uses a camera-equipped vehicle and a computer vision model to detect potholes in public roads. In addition to accuracy and a favorable cost-benefit ratio, citizens, and road users in particular, may care about the time between a pothole's detection and its repair. Or they may care about the distribution of potholes across the city. Furthermore, when road maintenance appears to be degrading, this should be taken up with department leadership, the responsible alderperson, and council members.
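A minimal sketch of how such citizen-facing metrics might be computed, with the record format and values invented for illustration:

```python
# A minimal sketch of citizen-facing road-maintenance metrics, computed
# from hypothetical (district, detected_day, fixed_day) records.
from collections import defaultdict
from statistics import median


def maintenance_metrics(records):
    """Median days from detection to repair, plus open backlog per district."""
    days_to_fix = [fixed - detected
                   for _, detected, fixed in records if fixed is not None]
    backlog = defaultdict(int)
    for district, _, fixed in records:
        if fixed is None:
            backlog[district] += 1
    return median(days_to_fix), dict(backlog)


records = [("north", 0, 14), ("south", 2, 30), ("south", 5, None)]
print(maintenance_metrics(records))  # -> (21.0, {'south': 1})
```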

Example 2: EV Charging

Or, let's say the same city government uses an algorithmic system to optimize public electric vehicle (EV) charging stations for green energy use by adapting charging speeds to expected sun and wind. EV drivers may want to know how much energy has been shifted to greener time windows and how this trends over time. Without such visibility into a system's actual goal achievement, citizens' ability to legitimate its use suffers. As I have already mentioned, democratic agency, when enacted via the appeal to authority, depends on access to "normative facts" that underpin policies. And finally, professed system functionality must be demonstrated as well (Raji et al., 2022).
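As an illustration, such a goal-achievement metric could be as simple as the share of charging energy delivered during green windows, a proxy for energy shifted; the data shapes and green hours below are assumptions:

```python
# A minimal sketch: share of charging energy delivered in green windows,
# a proxy for how much charging was shifted toward green energy.
def green_energy_share(sessions, green_hours):
    """sessions: (hour_of_day, kwh) pairs actually delivered."""
    total = sum(kwh for _, kwh in sessions)
    green = sum(kwh for hour, kwh in sessions if hour in green_hours)
    return green / total if total else 0.0


sessions = [(13, 30.0), (22, 10.0)]  # midday solar peak vs. evening
print(green_energy_share(sessions, green_hours={11, 12, 13, 14}))  # -> 0.75
```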

DevOps as sociotechnical leverage point for democratizing AI

These brief examples show that the DevOps approach is a potential sociotechnical leverage point. It offers pathways for democratizing AI system design, development, and operations.

DevOps can be adapted to further contestability. It creates new channels between human and machine actors. One of the essential activities of DevOps is monitoring (Smith, 2020), which presupposes fallibility, a necessary precondition for contestability. Finally, it requires and provides infrastructure for technical flexibility, so that recovery from error is low-cost and continuous improvement becomes practically feasible.

The mutual shaping of democratic practices & AI

Zooming out further, let's reflect on this panel's overall theme, picking out three elements: legitimation, representation of marginalized groups, and dealing with conflict and contestation after implementation and during use.

Contestability is a lever for demanding justifications from operators, which is a necessary input for legitimation by subjects (Henin & Le Métayer, 2022). Contestability frames different actors' stances as adversarial positions on a political field rather than "equally valid" perspectives (Scott, 2023). And finally, relations, monitoring, and revisability are all ways to give voice to and enable responsiveness to contestations (Genus & Stirling, 2018).

And again, all of these things can be furthered in the post-deployment phase by adopting the DevOps lens.

Bibliography

  • Alfrink, K., Keller, I., Kortuem, G., & Doorn, N. (2022). Contestable AI by Design: Towards a Framework. Minds and Machines, 33(4), 613–639. https://doi.org/10/gqnjcs
  • Alfrink, K., Keller, I., Yurrita Semperena, M., Bulygin, D., Kortuem, G., & Doorn, N. (2024). Envisioning Contestability Loops: Evaluating the Agonistic Arena as a Generative Metaphor for Public AI. She Ji: The Journal of Design, Economics, and Innovation, 10(1), 53–93. https://doi.org/10/gtzwft
  • Geiger, R. S., Tandon, U., Gakhokidze, A., Song, L., & Irani, L. (2023). Making Algorithms Public: Reimagining Auditing From Matters of Fact to Matters of Concern. International Journal of Communication, 18(0), Article 0.
  • Genus, A., & Stirling, A. (2018). Collingridge and the dilemma of control: Towards responsible and accountable innovation. Research Policy, 47(1), 61–69. https://doi.org/10/gcs7sn
  • Gilbert, T. K., Lambert, N., Dean, S., Zick, T., Snoswell, A., & Mehta, S. (2023). Reward Reports for Reinforcement Learning. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 84–130. https://doi.org/10/gs9cnh
  • Henin, C., & Le Métayer, D. (2022). Beyond explainability: Justifiability and contestability of algorithmic decision systems. AI & SOCIETY, 37(4), 1397–1410. https://doi.org/10/gmg8pf
  • Himmelreich, J. (2022). Against "Democratizing AI." AI & SOCIETY. https://doi.org/10/gr95d5
  • Lampland, M., & Star, S. L. (Eds.). (2009). Standards and Their Stories: How Quantifying, Classifying, and Formalizing Practices Shape Everyday Life (1st edition). Cornell University Press.
  • Peter, F. (2020). The Grounds of Political Legitimacy. Journal of the American Philosophical Association, 6(3), 372–390. https://doi.org/10/grqfhn
  • Raji, I. D., Kumar, I. E., Horowitz, A., & Selbst, A. (2022). The Fallacy of AI Functionality. 2022 ACM Conference on Fairness, Accountability, and Transparency, 959–972. https://doi.org/10/gqfvf5
  • Rubel, A., Castro, C., & Pham, A. K. (2021). Algorithms and autonomy: The ethics of automated decision systems. Cambridge University Press.
  • Scott, D. (2023). Diversifying the Deliberative Turn: Toward an Agonistic RRI. Science, Technology, & Human Values, 48(2), 295–318. https://doi.org/10/gpk2pr
  • Smith, J. D. (2020). Operations anti-patterns, DevOps solutions. Manning Publications.
  • Treveil, M. (2020). Introducing MLOps: How to scale machine learning in the enterprise (First edition). O'Reilly.