What are the next challenges for AI?

Artificial intelligence: a tool for domination or emancipation?

with Lê Nguyên Hoang, PhD in Mathematics (Polytechnique Montréal), Co-Founder and Executive Director of Tournesol.app, Victor Berger, Post-doctoral researcher at CEA Saclay and Giada Pistilli, PhD student at Sorbonne University affiliated with the CNRS Science, Norms, Democracy laboratory
On January 17th, 2023 | 5 min reading time
Key takeaways
  • There are three ways to teach artificial intelligence (AI): supervised learning, unsupervised learning, and reinforcement learning.
  • Machine learning algorithms spot patterns, so the slightest bias hidden in a dataset can be exploited and amplified.
  • Generalising from past experience in AI can be problematic because algorithms use historical data to answer present problems.
  • AI is also a field in which a great deal of power is at stake: ethical issues, such as how data is used, can emerge.
  • Communities could take ownership of AI, using it as a true participatory tool for emancipation.

Before tackling the issue of AI bias, it is important to understand how a machine learning algorithm works, but also what it means. Victor Berger, a post-doctoral fellow in artificial intelligence and machine learning at CEA-List, explains, “the basic assumption of most machine learning algorithms is that we have data that is supposedly a statistical representation of the problem we want to solve.”

Three main ways of learning 

The simplest – technically speaking – and most common way to teach a machine learning AI is called supervised learning. “For example, if you have a database full of animal pictures, a supervised algorithm will already know that a picture represents a dog, a cat, a chicken, etc., and it will know that for this input it should give a specific response as output. A classic example of this type of algorithm is language translators,” explains Victor Berger.
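
To make this input-to-output pairing concrete, here is a minimal sketch in Python (using scikit-learn; the feature vectors are invented stand-ins for photos, since real image pipelines are far more involved):

```python
# A minimal supervised-learning sketch: every training input comes with
# the answer the algorithm is expected to reproduce.
from sklearn.linear_model import LogisticRegression

# Invented feature vectors standing in for animal photos.
X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.8]]
y = ["dog", "dog", "cat", "cat"]  # the known answers, supplied up front

model = LogisticRegression().fit(X, y)  # learn the input -> output mapping
print(model.predict([[0.85, 0.2]]))     # expected output: ['dog']
```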

The second category of algorithms, unsupervised learning, is generally used when we do not have the solution to a problem: “to continue with the example of animals, an unsupervised learning algorithm will be given a database with the same photos as the previous one, but without precise instructions on how it should react in output with respect to a given input. Its aim is generally to identify statistical patterns within the dataset it is given, for categorisation (or clustering) purposes.”
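
As a rough illustration of that difference, the sketch below feeds the same kind of invented feature vectors to a clustering algorithm (k-means, via scikit-learn) with no labels at all:

```python
# A minimal unsupervised-learning sketch: same data, but no answers given.
from sklearn.cluster import KMeans

X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.8]]
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels)  # e.g. [0 0 1 1]: two groups found, never named "dog" or "cat"
```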

The whole problem lies in the data sets used to supervise the algorithms.

The third category of algorithms is reinforcement learning: “In the first two categories, the way the algorithm is coded allows it to direct itself and know how to improve. This component is absent in reinforcement learning, where the algorithm just knows whether it has completed its task correctly or not. It has no instructions about which directions to take to become better. In the end, it is the environment and its reaction to the algorithm’s decision-making that will act as a guide,” Victor Berger explains.
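
A toy version of that guidance-by-environment idea might look like this (the “environment” and its reward probabilities are entirely invented):

```python
# A minimal reinforcement-learning sketch: the algorithm is never told the
# right answer, only the reward the environment returns after each attempt.
import random

q = {"left": 0.0, "right": 0.0}  # current estimate of each action's value

def environment(action):
    # Invented environment: "right" is rewarded 80% of the time, "left" 20%.
    return 1.0 if random.random() < (0.8 if action == "right" else 0.2) else 0.0

for _ in range(1000):
    # Explore occasionally; otherwise exploit the best estimate so far.
    action = random.choice(list(q)) if random.random() < 0.1 else max(q, key=q.get)
    reward = environment(action)             # the only feedback received
    q[action] += 0.1 * (reward - q[action])  # nudge the estimate towards it

print(q)  # "right" should end up with the higher estimated value
```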

In all three cases, the problem lies in the data sets used to supervise the algorithms. Victor Berger reminds us that “machine learning algorithms enable patterns to be identified. Therefore, the slightest bias hidden in a data set can bias the entire algorithm, which will find the biased pattern, then exploit and amplify it.”

Generalisation of data

For Lê Nguyên Hoang, a doctor of mathematics and co-founder of Tournesol who popularises the subject of artificial intelligence, the assumption that the past can be generalised is omnipresent in the field of machine learning: “Questions relating to the quality of data are largely undervalued. Whether in the research world or in industry, it is the design of algorithms that takes centre stage. But very few people ask themselves whether generalising the past – by training algorithms with historical databases that we don’t look at critically – is really a viable project for society.”

To better understand how this can manifest itself, Victor Berger refers to an anecdote circulating in the machine learning community: “To avoid gender bias, a company using AI to sort CVs excluded information such as names and photos. But they realised that the algorithm had retained football as a relevant criterion.” As careful as the company was, it provided its historical data without anticipating a pattern: the CVs most often recruited in the past – those of men – were more likely to mention an interest in football. Far from combating the gender bias, the algorithm nurtured it. There are two solutions for dealing with this type of problem: “either humans are tasked with building higher-quality databases – but this requires a colossal amount of work – or algorithms are tasked with eliminating the biases already identified,” explains Victor Berger.
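
A hypothetical reconstruction of that anecdote in code (all the numbers are invented) shows how a proxy variable can smuggle the bias back in even after gender is removed from the data:

```python
# Invented toy data: hiring correlated with gender in the past, and the
# hobby "football" correlated with gender. Gender itself is withheld.
import random
from sklearn.linear_model import LogisticRegression

random.seed(0)
X, y = [], []
for _ in range(1000):
    is_man = random.random() < 0.5
    football = random.random() < (0.6 if is_man else 0.1)
    hired = random.random() < (0.7 if is_man else 0.3)  # biased history
    X.append([int(football)])  # the model never sees gender directly
    y.append(int(hired))

model = LogisticRegression().fit(X, y)
print(model.coef_)  # positive weight: "football" inherits the old gender bias
```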

But this does not solve all problems. “If we take the example of content moderation, the labelling of data depends on the concept of freedom of expression that we defend, on what we consider to be or not to be a call to hatred or dangerous false information. These are questions that do not have clear answers and on which there will be disagreement. So, if the problem is not just a technical one, the same goes for the solutions,” says Lê Nguyên Hoang.

Feedback loops

There are also questions about the feedback loops which algorithms can cause. “What you have to bear in mind is that a machine learning algorithm is always prescriptive, because its aim is to achieve a specific objective: maximising presence on a platform, profit, click-through rate, etc.,” points out Lê Nguyên Hoang.

Imagine an algorithm used by a community’s police force to predict in which neighbourhood there will be the most crime and aggression. Victor Berger argues that “what this algorithm is going to do is make a prediction based on historical police data that identifies the neighbourhoods where the most people have been arrested.” Here again, we fall back on the same flaw: the risk of generalising – or even amplifying – the past. Indeed, this prediction is not only descriptive; it also leads to decision-making: increasing the number of police officers, stepping up video surveillance, etc. Decisions that may reinforce the policing of a particular area of the city, and with it an already tense climate.
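
A toy simulation (with invented numbers) shows how such a loop can widen a small initial skew in the records, even when two neighbourhoods are identical in reality:

```python
# Two neighbourhoods with identical real crime, but slightly skewed records.
true_crime = {"A": 10, "B": 10}
recorded = {"A": 12, "B": 8}  # historical data already leans towards A

for year in range(5):
    target = max(recorded, key=recorded.get)  # "prediction": patrol the top area
    for hood in true_crime:
        # More patrols in the targeted area means more incidents get recorded.
        detection = 0.9 if hood == target else 0.4
        recorded[hood] += int(true_crime[hood] * detection)
    print(year, recorded)  # the initial skew grows every year
```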

The phenomena of radicalisation, sectarian movements and conspiracy circles can be amplified.

Similarly, on social media and entertainment platforms, recommendation algorithms are based on the user’s previous choices. Their objective is generally to keep the user’s attention for as long as possible. As a result, the phenomena of radicalisation, sectarian movements and conspiracy theories can be amplified. Lê Nguyên Hoang is working on solving this problem with the help of an algorithm, called Tournesol, whose database is built up in a collaborative manner¹.
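
That attention-maximising dynamic can be sketched with a toy recommender (the retention probabilities are invented): a small engagement edge for divisive content tends to translate into most of the exposure.

```python
# Invented retention probabilities: the divisive video holds attention
# slightly better, and an engagement-maximising recommender notices.
import random
random.seed(1)

p_watch = {"measured analysis": 0.50, "divisive rant": 0.55}
estimate = {k: 0.0 for k in p_watch}  # running estimate of engagement
shown = {k: 0 for k in p_watch}

for _ in range(5000):
    # Mostly recommend whatever currently retains attention best.
    k = random.choice(list(p_watch)) if random.random() < 0.05 else max(estimate, key=estimate.get)
    reward = 1.0 if random.random() < p_watch[k] else 0.0
    shown[k] += 1
    estimate[k] += (reward - estimate[k]) / shown[k]  # running average

print(shown)  # the small engagement edge captures most recommendation slots
```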

Questions of power 

Artificial intelligence is therefore not just a field of scientific study or a technological field of application. There is also a great deal of power at stake. “It is very important to analyse and list the various social and ethical problems that can arise from these algorithms, from their training to their design and deployment,” warns Giada Pistilli, a philosophy researcher and senior ethicist at Hugging Face.

What exactly are these problems? The philosophy researcher explains that they can be found at all levels of the AI development chain. “There can be ethical problems that emerge as soon as a model is trained because of the data issue: can the data lead to stereotyping? What are the consequences of the absence of certain data? Has the data used, such as private images and intellectual property, been subject to consent before being used as a training dataset for the model?”

But this is far from being the only problematic link in the chain. “During development and deployment, questions of governance arise. Who owns the model, who designs it, and for what purpose? There is also the question of whether certain models are needed at all in the light of climate change: running these models consumes a great deal of energy. This also highlights that only powerful companies have the resources to use them,” warns the researcher.

We can make AI a real empowerment tool that communities could take ownership of.

Fortunately, the picture is not entirely bleak. Artificial intelligence can also be used as a tool for empowerment. Giada Pistilli is a member of BigScience, a collaborative project involving thousands of academics that aims to develop an open-access language model. According to her, such projects can make AI robustly beneficial. “By developing AI that is specialised in a single task, it can be made more auditable, participatory, and tailored to the community that will use it. By educating users about these new technologies and integrating them into the project of building databases, we can make AI a real empowerment tool that communities can take ownership of.”

Will we be able to rise to these multiple challenges? The question remains.

Julien Hernandez
¹ https://www.futura-sciences.com/tech/actualites/intelligence-artificielle-tournesol-algorithme-utilite-publique-besoin-vous-87301/
