
Why we need ‘explainable’ AI

Véronique Steyer
Associate Professor in Innovation Management at École Polytechnique (IP Paris)

Louis Vuarin
Postdoctoral fellow in the SES department of Telecom Paris (I3-CNRS, IP Paris)

Sihem Ben Mahmoud-Jouini
Associate Professor of Innovation Management at HEC and researcher at GREGHEC

The more AI algorithms spread, the more their opacity raises questions. In the Facebook Files affair, Frances Haugen, a specialist in algorithmic rankings, took internal documents with her when she left the company and sent them to the American regulatory agency and to Congress. In doing so, she claims to reveal an algorithmic system that has become so complex that it is beyond the control of the very engineers who developed it. The role of the social network in the propagation of dangerous content and fake news is worrying, but the inability of its engineers, who are among the best in the world in their field, to understand how their creation works raises questions of its own. The case once again puts the spotlight on the problems posed by the (in)explainability of the algorithms that proliferate in many industries.

Understanding “opaque algorithms”

The opacity of algorithms does not concern all AI. It is above all characteristic of so-called learning algorithms (machine learning), and particularly of deep learning algorithms such as neural networks. Given the growing importance of these algorithms in our societies, it is not surprising that concern for their explainability – our ability to understand how they work – is mobilising regulators and ethics committees.

How can we trust systems to which we are going to entrust our lives, as AI may soon be driving our vehicles or diagnosing serious illnesses? And, more broadly, how can we trust them with the future of our societies, shaped by the information that circulates, the political ideas that are propagated and the discrimination that lurks? How can we attribute responsibility for an accident or a harmful side effect if it is not possible to open the 'black box' of an AI algorithm?

Within the organisations that develop and use these opaque algorithms, this quest for explainability intertwines two issues, which it is useful to distinguish. On the one hand, it is a question of explaining AI so that we may understand it (i.e. interpretability), with the aim of improving the algorithm, perfecting its robustness and foreseeing its flaws. On the other hand, we are looking to explain AI in order to identify who is accountable for the functioning of the system to its many stakeholders, both external (regulators, users, partners) and internal to the organisation (managers, project leaders). This accountability is crucial so that responsibility for the effects – good or not so good – of these systems can be attributed to, and assumed by, the right person(s).

Opening the “black box”

A whole field of research is therefore developing today to try to meet the technical challenge of opening the 'black box' of these algorithms: Explainable Artificial Intelligence (XAI). But is it enough to make these systems transparent to meet the dual challenge of explainability, interpretability and accountability? In our discipline, management sciences, work is being done on the limits of transparency. We studied the way in which individuals within organisations take hold of the explanatory tools that have been developed, such as heat maps1 and performance curves2, which are often associated with algorithms, and we have sought to understand how organisations use them.

Indeed, these two types of tools seem to be in the process of being standardised in response to pressure from regulators to explain algorithms, as illustrated by article L4001-3 of the French public health code: "III… The designers of an algorithmic treatment… ensure that its operation is explainable for users" (law no. 2021-1017 of 2 August 2021). Our initial results highlight the tensions that arise within organisations around these tools. Let us take two typical examples.

Example #1: Explaining reasoning behind a particular decision

The first example illustrates the ambiguities of these tools when they are used by business experts who are not familiar with how they work. Figure 1, taken from Wehbe et al. (2021), shows heat maps used to explain the operation of an algorithm developed to detect Covid-19 from lung X-rays. The system performs very well in tests, with results comparable to, and often better than, those of a human radiologist. Studying these images, we note that in cases A, B and C the heat maps tell us that the AI based its verdict on an area corresponding to the lungs. In case D, however, the highlighted area is located around the patient's collarbone.

Figure 1: Heat maps – here from DeepCOVID-XR, an AI algorithm trained to detect COVID-19 on chest X-rays. Source: Wehbe, R. M., Sheng, J., Dutta, S., Chai, S., Dravid, A., Barutcu, S., … & Katsaggelos, A. K. (2021). DeepCOVID-XR: An Artificial Intelligence Algorithm to Detect COVID-19 on Chest Radiographs Trained and Tested on a Large US Clinical Dataset. Radiology, 299(1), E173.
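To give a sense of how heat maps like those in Figure 1 can be produced, here is a minimal, hypothetical sketch in Python. It uses a simple occlusion approach (masking patches of the image and measuring how much the model's predicted probability drops), which is only one of several techniques for building such maps; the `predict_fn` model function is an assumption standing in for any trained classifier, not the DeepCOVID-XR code itself.

```python
import numpy as np

def occlusion_heat_map(image, predict_fn, patch=16, stride=16, baseline=0.0):
    """Occlusion-based heat map: mask successive square patches of the image
    and record how much the model's predicted probability drops. Large drops
    mean the patch mattered for the decision."""
    h, w = image.shape
    reference = predict_fn(image)          # probability for the unmodified image
    heat = np.zeros((h, w))
    counts = np.zeros((h, w))
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = baseline
            drop = reference - predict_fn(occluded)
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    return heat / np.maximum(counts, 1)    # average probability drop per pixel

# Hypothetical usage: `classifier` stands in for any trained model returning
# the probability of the "COVID-19 positive" class for a single X-ray image.
# heat = occlusion_heat_map(xray, classifier)
```

The resulting array can then be overlaid on the X-ray: the brighter a region, the more the model's prediction depended on it, which is exactly the kind of reading made of cases A to D above.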

This observation highlights a problem we often found in our study of how these tools are used in organisations: designed by developers to spot inadequate AI reasoning, they are often used in practice to judge the validity of a particular case. Yet these heat maps are not designed for that purpose, and for an end user it is not the same thing to say "this AI's diagnosis seems accurate" as it is to say "the process by which it reaches its diagnosis seems valid".

Two problems then come to light. First, there is the question of training users, not only in AI, but also in the workings of these explanatory tools. More fundamentally, this example raises questions about the use of these tools in assigning responsibility for a decision: by disseminating them, there is a risk of leading users to take responsibility on the basis of tools that are being used for decisions other than their intended purpose. In other words, these tools, created for interpretability, risk being used in situations where accountability is at stake.

Example #2: Explaining or evaluating performance 

A second example illustrates the effects of the organisational games played around explanatory tools such as performance curves. A developer working in the Internet advertising sector told us that, in her field, if the performance of an algorithm did not exceed 95%, customers would not buy the solution. This standard had concrete effects on the innovation process: under pressure from the sales force, it led her company to commercialise, and to focus its R&D efforts on, a very efficient but not very explainable algorithm. In doing so, the company abandoned the development of another algorithm that was more explainable but not as powerful.

The problem was that the algorithm chosen for commercialisation was not capable of handling radical changes in consumer habits, such as those seen during Black Friday. This led the company to advise customers against running any marketing campaigns at that time, because it was difficult to admit that the algorithm's performance was falling. Here again, the tool is misused: instead of serving as one element among others in choosing the best algorithm for a given activity, these performance curves were used by buyers to justify the choice of a provider against an industry standard. In other words: for accountability.

The “man-machine-organisation” interaction

For us, researchers in management sciences interested in decision-making and innovation processes, in the role of the tools that support them, and in the way actors legitimise the choices they make within their organisations, these two examples show that it is not enough to look at these computational tools in isolation. They call for algorithms and their human, organisational and institutional contexts to be approached as assemblages. Holding such an assemblage accountable then requires not just seeing inside one of its components, the algorithm (a difficult and necessary task), but understanding how the assemblage functions as a system composed of machines, humans and organisation. It also calls for wariness of an overly rapid standardisation of explainability tools, which would produce man-machine-organisation systems of games in which the imperative of being accountable replaces that of interpreting, multiplying black boxes whose explainability is only a façade.

1 A heat map visually indicates which part of the image (which pixels) the algorithm relied on to prescribe a decision, and in particular to classify the image into a certain category (see Figure 1).
2 A performance curve is a 2D graph representing the mathematical accuracy of the prediction as a function of a significant parameter (such as the number of images provided to the AI to perform its analysis, or the time in milliseconds allowed to perform the calculations). This curve is then compared to points indicating human performance.
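As an illustration of what such a curve looks like, here is a minimal sketch in Python. The parameter values, accuracies and human baseline are invented for the example (they are not taken from any real system or from the study described above).

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic, hypothetical numbers: model accuracy as a function of an
# operating parameter, compared with a fixed human-performance reference.
images_per_case = np.array([1, 2, 4, 8, 16, 32])                  # parameter being varied
model_accuracy = np.array([0.71, 0.78, 0.84, 0.89, 0.92, 0.93])   # invented values
human_accuracy = 0.88                                             # invented reference point

plt.plot(images_per_case, model_accuracy, marker="o", label="AI model")
plt.axhline(human_accuracy, linestyle="--", label="Human expert")
plt.xlabel("Number of images provided per case")
plt.ylabel("Accuracy")
plt.title("Performance curve (synthetic illustration)")
plt.legend()
plt.show()
```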

Contributors

Véronique Steyer

Associate Professor in Innovation Management at École Polytechnique (IP Paris)

Véronique Steyer holds a Master's degree in Management from ESCP Europe, a PhD in Management Sciences from the University of Paris Ouest and a PhD from ESCP Europe. Véronique Steyer was a visiting researcher at the University of Warwick (UK), and also has experience as a management consultant. Her research focuses on how individual actions and interactions between actors shape organisational phenomena and, more broadly, contribute to shaping the society in which we live. Her work focuses on sensemaking in situations of change: transformations induced by the introduction of new technologies such as AI, perception of a crisis or a risk (pandemic, ecological transition).

Louis Vuarin

Postdoctoral fellow in the SES department of Telecom Paris (I3-CNRS, IP Paris)

A former risk manager and a graduate of the ESCP Master in Management, Louis Vuarin obtained his PhD in Management Sciences from ESCP in 2020. He was a visiting researcher at SCORE in Stockholm and a post-doctoral fellow at CRG (École Polytechnique). His current research focuses on epistemic processes within organisations, in particular on the question of the explainability of artificial intelligence (XAI) and its organisational dimension, as well as on phenomena of resistance to innovation in the context of 5G.

Sihem Ben Mahmoud-Jouini

Associate Professor of Innovation Management at HEC and researcher at GREGHEC

Sihem Ben Mahmoud-Jouini is the Scientific Director of the X-HEC Entrepreneurs MS, the Innovation Design Project (IDP) and Entrepreneurship & Innovation (Executive MBA). Her research focuses on the management of disruptive innovations in established companies and the organisation of the exploration of new fields. She is also interested in the role of design in innovation and organisational transformation. Her research has been conducted at Assistance Publique des Hôpitaux de Paris, Vinci, Thales, Valeo, Air Liquide, Essilor, Danone and Orange, where she held the Chair.
