ChatGPT, Midjourney: everything you need to know about generative AI

Generative AI is all the talk these days, and ChatGPT has arrived in our societies like a veritable revolution. Given the wide-ranging applications of these tools, it is hardly surprising that their arrival is fuelling so much debate. But do we really know how this AI works?

A generative AI can produce written, visual, or audio content after being trained on existing content. Given instructions as input, the AI creates output that corresponds to those instructions. “Here, we’re looking to generate original content,” explains Éric Moulines, professor of statistical learning at École Polytechnique (IP Paris). “This original content will be obtained by generalising the information seen during learning”.

There are currently two main types of generative AI model: GPTs (Generative Pre-trained Transformers), such as ChatGPT, and diffusion models. “By giving it text as input, the AI will be able to understand the context through a mechanism called attention,” adds Hatim Bourfoune, AI research engineer at IDRIS (CNRS). “Its output will therefore be a list of all the words in the dictionary that it knows [learned during training phase] on which it will have placed a probability”. Depending on the data it has been trained on, the tool can be adapted to various functions.
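The idea of "a list of all the words on which it will have placed a probability" can be sketched in a few lines. This is only an illustration: the four-word vocabulary and the raw scores are invented for the example, and a real model computes its scores with a neural network over tens of thousands of tokens. The softmax function used here is the standard way of turning raw scores into probabilities.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtracted for numerical stability
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

# Hypothetical scores for the word that follows "The cat sat on the".
scores = {"mat": 4.0, "sofa": 3.0, "roof": 2.0, "banana": -1.0}

probabilities = softmax(scores)
ranked = sorted(probabilities, key=probabilities.get, reverse=True)
for word in ranked:
    print(f"{word}: {probabilities[word]:.3f}")
```

Every word in the vocabulary receives a non-zero probability, and the model then picks its next word from this list, either the most likely one or a random draw weighted by the probabilities.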

Bloom, for example, the AI developed by the team of which Hatim Bourfoune is a member at IDRIS, is a tool that helps researchers to express themselves in several languages. “The primary aim of the Bloom model,” adds Pierre Cornette, also a member of the IDRIS team, “is to learn a language. To do this, we give it a whole bunch of texts to ingest, asking it to predict the next word in the given text, and we correct it if it gets it wrong”.
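The "predict the next word, and correct it if it gets it wrong" loop can be illustrated with a deliberately tiny stand-in. The sketch below is not how Bloom is trained: a real model adjusts billions of neural-network weights by gradient descent, whereas this example keeps a simple table of scores per word pair and nudges it after each mistake. The training sentence and the function names are assumptions made for the illustration.

```python
from collections import defaultdict

def predict(scores, vocabulary, previous):
    """Return the word with the highest score after `previous`."""
    return max(vocabulary, key=lambda word: scores[previous][word])

def train(text, epochs=3):
    """Learn next-word scores by correcting the model whenever it guesses wrong."""
    words = text.split()
    vocabulary = sorted(set(words))
    # scores[previous][candidate]: how strongly `candidate` is expected after `previous`
    scores = defaultdict(lambda: defaultdict(float))
    for _ in range(epochs):
        for previous, actual in zip(words, words[1:]):
            guess = predict(scores, vocabulary, previous)
            if guess != actual:
                scores[previous][actual] += 1.0  # reinforce the right answer
                scores[previous][guess] -= 1.0   # penalise the wrong guess
    return scores, vocabulary

scores, vocabulary = train("the cat sat on a mat")
print(predict(scores, vocabulary, "the"))
print(predict(scores, vocabulary, "cat"))
```

After training, the table has learned which word followed which in the text it saw, and nothing else. It knows only its training data, which is also why the data a model ingests shapes everything it can later produce.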

A recent, still immature technology

“The first generative AI models are not even 10 years old,” explains Éric Moulines. “The first revolution in this field was the arrival of transformers – a technology perfecting this attention mechanism – in 2017. Four years later, we already have commercial products. So, there has been considerable acceleration, much faster than on any other Deep Learning model.” Models like ChatGPT are therefore still very new, and there are still many things that can, or must, be improved.

The reliability of the answers given remains an open question: “ChatGPT is not familiar with the notion of reliability”, admits the professor. “This type of AI is incapable of assessing the veracity of the answers it gives.” This leaves room for an easily observable phenomenon known as ‘hallucinations’. “It is possible [for ChatGPT] to generate content that seems plausible, but is strictly false,” he adds. “It uses completely probabilistic reasoning to generate sequences of words. Depending on the context, it will generate strings of words that seem the most likely.”
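A toy example shows how purely probabilistic word chaining can yield a fluent but false sentence. The sketch below is a simple word-pair (bigram) counter, vastly cruder than ChatGPT, and the two training sentences are invented for the illustration. It only counts which word follows which, then scores a sentence by multiplying those frequencies.

```python
from collections import Counter, defaultdict

corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# Count which word follows which in the training sentences.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split() + ["<end>"]
    for previous, nxt in zip(words, words[1:]):
        following[previous][nxt] += 1

def sentence_probability(sentence):
    """Probability of the words after the first one, under the word-pair counts."""
    words = sentence.split() + ["<end>"]
    probability = 1.0
    for previous, nxt in zip(words, words[1:]):
        total = sum(following[previous].values())
        if total == 0:  # word never seen during training
            return 0.0
        probability *= following[previous][nxt] / total
    return probability

true_sentence = "paris is the capital of france"
false_sentence = "paris is the capital of italy"
print(sentence_probability(true_sentence))
print(sentence_probability(false_sentence))
```

Both sentences receive the same probability, even though the second never appeared in the training text and is false. The model has no notion of truth, only of which word sequences are likely.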

Apart from ChatGPT’s ability to invent book titles, other limitations should be borne in mind when using it. By applying Deep Learning methods, these AIs go through a training phase during which they ingest a large quantity of existing texts. In this way, they incorporate the biases of this database into their learning. Geopolitical questions are a good example of this. “If you ask it geopolitical questions, ChatGPT will essentially reflect the Western world,” says Éric Moulines. “If we show the answers given to a Chinese person, he will certainly not agree with what is said about the sovereignty of such and such a country over a given territory.”

A range of applications

Each model will therefore be able to generate content according to the database it has been trained on. This is perhaps where the magic of this technology lies, because, knowing this, a myriad of applications can be created. “A good analogy for this technology would be that of an engine,” says Pierre Cornette. “You could have a very powerful engine, but it can be used for either a tractor or a racing car.” For example, ChatGPT is a racing car, and its engine is GPT‑4. “The advantage is that the technologies are concentrated in what is the engine,” he continues, “and you don’t need to understand how it works to use the race car.”

Bloom is an example of another use for this type of model: “A year ago, Bloom was one of the few models that was completely open to research,” insists Hatim Bourfoune. In other words, anyone could download the model and use it for their own research. Trained with a database of various scientific articles in many languages, this model can be extremely useful for scientific research. Pierre Cornette adds: “There is also another project, BigCode, run by the same people, which promotes a model specialising in computer code. We ask it for a function, simply describing its action, and it can write it for us in the desired language.”

The popularity of ChatGPT shows just how important it is for the general public. Bing has also integrated it into its search engine with a view to competing with Google. This integration makes it possible to counter one of the limitations of this technology: the reliability of the answers given. By citing the sources used to compile its response, the search engine makes that response easier to understand and verify. Even more recently, Adobe has integrated a generative AI model into various software applications in its suite (such as Photoshop and Illustrator), revealing yet another impressive application of this technology.

“An exciting future”

All this can only mean an exciting future for this innovation. However, the range of applications raises questions about its possible uses. “As with all tools, there can be malicious uses,” admits Hatim Bourfoune. “That’s why companies like OpenAI put up different security barriers.” Today, many of the questions put to ChatGPT remain unanswered, because the AI believes that they violate its content policy.

Even so, this technology is still in its infancy. “That’s the principle of research – we’re still at ground zero,” says Éric Moulines. “It’s amazing that it even works.” There are still many gaps to be filled, particularly from a legal point of view. As explained above, the content generated by these tools is built from an existing database. The AI will therefore “copy” existing texts or works without citing their original author. “This poses a major problem,” he continues, “because the rights holders of the content used to generate these new images [or texts] are not respected.”

Despite its various limitations, the potential remains enormous: “What excites me … is that the progress to be made is enormous,” adds the professor. “But the trend and the derivatives are enormous. It’s happening very quickly and there’s very exciting competition in these subjects.” Speaking of derivatives, Bloom illustrates this perfectly. Useful for research, it is also a linguistic tool that could make it possible to save dead languages, but also to translate scientific texts into lesser-spoken languages to facilitate the dissemination of research.

However, its “exciting” future may be hampered by its considerable carbon impact. “These models require a lot of memory, because they need to store a huge amount of data,” explains Éric Moulines. “Today, we estimate that OpenAI consumes as much electricity as the grid in a country like Belgium.” This is the problem that will surely be the most complicated to solve.

Pablo Andres
