
What are the markers of trust for generative AI?

Christophe Gaie
Head of the Engineering and Digital Innovation Division at the Prime Minister's Office
Jean Langlois-Berthelot
Doctor of Applied Mathematics and Head of Division in the French Army
Key takeaways
  • Generative AI can transform a complex mass of data into fluid, intelligible text in just a few clicks.
  • However, the AI’s interpretation depends on its algorithmic model.
  • A form of “algorithmic audit” would make it possible to verify the entire chain of calculations, from the raw data to the final output.
  • Safeguards for trust are needed, such as traceability of algorithmic choices, model stress tests, and a guaranteed minimum level of explainability.
  • Training is also essential for digital players, particularly to develop the ability to formulate demanding and critical questions about AI models.

On 30th November 2022, with the launch of ChatGPT to the general public1, generative AI left the laboratory and entered meeting rooms, financial services, hospitals, schools, and more. The main advantage of this technology is well known – with just a few clicks, it can transform a mass of data into fluid, intelligible text. Today, with this tool, a financial director can obtain an automatic commentary on their margins in a matter of seconds, a doctor can obtain a report based on examinations, and a student can generate an essay from a simple prompt.

This convenience and ease of use are a game-changer. Where business intelligence mainly produced figures and graphs, generative models add a layer of interpretation. They prioritise signals, offer explanations and sometimes suggest forecasts. However, a clear narrative creates an impression of self-evidence: the conclusion seems robust because it is well formulated, even though it rests on just one model among many2.

The risk lies not in the use of AI, but in the excessive credibility given to texts whose conditions of production we often do not know. In other words, can we decide on an investment of several million pounds, or make a medical diagnosis, based on the recommendations and interpretations of generative AI?

Re-examining the trust given

The trust placed in a numerical response usually rests on two conditions: the quality of the source data and the transparency of the calculation method. In the case of a textual response such as that produced by generative AI, however, a third layer is added: the interpretation of the model3.

Indeed, the model decides what to highlight, discards certain elements and implicitly combines variables. The final product is an automated narrative that bears the mark of invisible statistical and linguistic choices. These choices may stem from the frequency of the data used to build the model, from problem-solving methods, or from any other cause. To ensure confidence in the answer given, these steps should be auditable, i.e. made visible to the user, who can then verify them.

This kind of verification already exists in comparable situations. Showing one's working is a common requirement when teaching mathematics, as it allows the teacher to check that the student has understood the steps of the reasoning. Similarly, in finance, audits verify compliance with accounting rules and guarantee that the published figures correspond to a measurable reality.

It is therefore necessary to imagine a form of “algorithmic audit”: no longer just verifying the data but controlling the entire chain that leads from the raw data flow to the final narrative. Take the example of a hospital where generative AI summarises patient records. If it systematically omits certain clinical parameters deemed rare, it produces attractive but incomplete reports. The audit must therefore test the robustness of the model, assess its ability to account for atypical cases and verify the traceability of sources. Similarly, an automatic energy report that ignores abnormal consumption peaks can give a false impression of stability. Here again, the audit must ensure that anomalies are taken into account.
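
To make the idea concrete, the check below is a minimal sketch of one such audit step, written in Python with hypothetical inputs (a summaries dictionary of AI-generated texts and a required_parameters list): it simply flags summaries that never mention a required clinical parameter. A real audit would rely on clinical language processing rather than plain string matching, but the principle is the same: test systematically for omissions instead of trusting the fluency of the text.

    # Hypothetical audit check: flag AI-generated summaries that omit
    # required clinical parameters. String matching is for illustration only.
    def audit_summary_coverage(summaries: dict[str, str],
                               required_parameters: list[str]) -> dict[str, list[str]]:
        """Return, for each record, the required parameters absent from its summary."""
        gaps: dict[str, list[str]] = {}
        for record_id, summary in summaries.items():
            text = summary.lower()
            missing = [p for p in required_parameters if p.lower() not in text]
            if missing:
                gaps[record_id] = missing
        return gaps

    # Example: the rare parameter "creatinine clearance" is silently dropped.
    print(audit_summary_coverage(
        {"patient-042": "Stable blood pressure, normal ECG."},
        ["blood pressure", "ECG", "creatinine clearance"],
    ))
    # -> {'patient-042': ['creatinine clearance']}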

Technical protocols to be optimised and deployed more widely

An engineering of trust cannot rely solely on declarations of principle. It must be translated into specific protocols. A number of approaches are already emerging:

  • Traceability of algorithmic choices: each indicator must be linked to the source data and the processing applied. This involves documenting transformations, much as a supply chain is documented today. Circuit tracing methods can provide traceability that is understandable to humans4. Traceability then becomes an educational tool as well as a control mechanism (a minimal sketch of such a record is given after this list).
  • Model stress tests: exposing the AI to unusual scenarios to measure its ability to reflect uncertainty rather than smooth it out. For example, it is very useful to use a sample that does not follow the usual distribution, in order to check whether the deep learning model generalises, independently of the test set provided5. This could involve providing a set of lung X-rays from smokers only, which makes it possible to verify that the AI does not generate an excess of ‘false negatives’ in order to revert to a statistical average (see the second sketch after this list).
  • Guaranteed minimum explainability: without revealing the algorithmic secrets of the companies providing AI solutions, they could be asked to provide at least a summary of the main variables their models used to reach a conclusion. This explainability could be subject either to ISO-type certification of AI quality or to validation by a regulatory body (preferably an existing one, so as not to multiply the number of authorities).
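
As a purely illustrative example of the first point, the record below is a minimal sketch, assuming a hypothetical reporting pipeline, of how each indicator in a generated report could be linked to its source data, the transformations applied, and the model version that produced the narrative.

    # Hypothetical traceability record: one entry per indicator appearing in a
    # generated report, documenting its provenance like a link in a supply chain.
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class IndicatorTrace:
        indicator: str              # e.g. "gross margin, Q3"
        source_datasets: list[str]  # raw data the figure is derived from
        transformations: list[str]  # processing steps applied, in order
        model_version: str          # model that produced the narrative
        generated_at: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat())

    trace = IndicatorTrace(
        indicator="gross margin, Q3",
        source_datasets=["erp_sales_2024Q3.csv", "erp_costs_2024Q3.csv"],
        transformations=["currency normalisation", "outlier removal",
                         "aggregation by product line"],
        model_version="report-generator v2.1",
    )
    print(trace)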
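
In the same spirit, the second sketch illustrates the stress-test idea under simple assumptions: a hypothetical binary classifier exposing a predict() method is evaluated on a deliberately skewed sample (for instance, smokers only), and the check verifies that its false-negative rate does not drift far beyond what is observed on the balanced test set.

    # Hypothetical stress test: compare the false-negative rate on a balanced
    # test set with the rate on a deliberately skewed sample (e.g. smokers only).
    # `model` is assumed to expose predict(inputs) -> list of 0/1 predictions.
    def false_negative_rate(model, inputs, labels) -> float:
        predictions = model.predict(inputs)
        false_negatives = sum(1 for p, y in zip(predictions, labels)
                              if y == 1 and p == 0)
        positives = sum(1 for y in labels if y == 1)
        return false_negatives / positives if positives else 0.0

    def passes_stress_test(model, balanced_set, skewed_set, tolerance=0.05) -> bool:
        """Fail if the skewed sample inflates the false-negative rate beyond the tolerance."""
        fnr_balanced = false_negative_rate(model, *balanced_set)
        fnr_skewed = false_negative_rate(model, *skewed_set)
        return (fnr_skewed - fnr_balanced) <= tolerance

Note that this only measures drift relative to the balanced baseline; absolute error rates and calibration would be audited separately.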

These methods will not lift the confidentiality attached to the specific, differentiating parameters of the companies that develop large language models, but they will reduce the risk of blindness and unjustified confidence. The issue is not to make AI completely transparent, but to create sufficient safeguards to maintain trust.

An organisational culture in need of transformation

Beyond the technical dimension, it is necessary to promote a major cultural shift. For decades, organisations have been accustomed to viewing figures as certainties. Dashboards are often perceived as indisputable truths. With generative AI and its extension to textual and subjective outputs, this stance is becoming untenable.

Decision-makers, as well as all digital stakeholders, must learn to read an automatic report as a statistical response based on known or unknown assumptions, and above all not as a definitive conclusion. This means training users of AI solutions to formulate demanding requests (asking for the AI's ‘reasoning’ process) and to read responses critically: identifying margins of error, questioning omissions, and asking for alternative scenarios. In other words, it means reintroducing uncertainty into the very heart of the decision-making process.

The European Union has begun to lay the groundwork with the AI Act, which classifies the use of AI in finance and public governance as ‘high risk’. This regulation imposes an obligation of transparency and auditability. But the law will not be enough if organisations do not cultivate active vigilance. Generative AI must be controlled not only by standards, but also by a daily practice of critical reading.

Moving towards a measure of vigilance

Generative AI is neither a mirage nor a panacea. It speeds up access to information and provides clarity on volumes of data that are unmanageable for humans, but it also transforms our relationship with decision-making. Where we used to see numbers, we now read stories.

The challenge is therefore not to turn back the clock, but to invent a new engineering of trust. Traceability of calculations, stress tests, minimal explainability: these are all technical building blocks that need to be put in place, bearing in mind that an AI model is likely to be the target of multiple cyberattacks6,7.

But the key lies in adopting a new organisational culture: accepting that uncertainty is a given and not a failure of the system. Only then can generative AI become a reliable tool to support human decision-making, rather than a producer of illusory certainties.

1. Mesko B. (2023). The ChatGPT (Generative Artificial Intelligence) Revolution Has Made Artificial Intelligence Approachable for Medical Professionals. J Med Internet Res, 25:e48392. https://doi.org/10.2196/48392
2. Bandi, A., Adapa, P. V. S. R., & Kuchi, Y. E. V. P. K. (2023). The Power of Generative AI: A Review of Requirements, Models, Input–Output Formats, Evaluation Metrics, and Challenges. Future Internet, 15(8), 260. https://doi.org/10.3390/fi15080260
3. Hoffman, David A., & Arbel, Yonathan (2024). Generative Interpretation. Articles, 417. https://scholarship.law.upenn.edu/faculty_articles/417
4. Nouira, S. (2025, March 31). Comment fonctionne vraiment une IA ? Les chercheurs d'Anthropic ont enfin un début de réponse. Les Numériques. https://www.lesnumeriques.com/intelligence-artificielle/comment-fonctionne-vraiment-une-ia-les-chercheurs-d-anthropic-ont-enfin-un-debut-de-reponse-n234978.html
5. Eche, T., Schwartz, L. H., Mokrane, F., & Dercle, L. (2021). Toward Generalizability in the Deployment of Artificial Intelligence in Radiology: Role of Computation Stress Testing to Overcome Underspecification. Radiology: Artificial Intelligence, 3(6). https://doi.org/10.1148/ryai.2021210097
6. Zhang, C., Jin, M., Shu, D., Wang, T., Liu, D., & Jin, X. (2024). Target-Driven Attack for Large Language Models. Frontiers in Artificial Intelligence and Applications. https://doi.org/10.3233/faia240685
7. Esmradi, A., Yip, D. W., & Chan, C. F. (2024). A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models. In: Wang, G., Wang, H., Min, G., Georgalas, N., Meng, W. (eds) Ubiquitous Security. UbiSec 2023. Communications in Computer and Information Science, vol 2034. Springer, Singapore. https://doi.org/10.1007/978-981-97-1274-8_6
