01/09/2025
DIGIVATIONS will explore these topics this summer at DIGIVATIONS INSTITUTE: Technology
Elon Musk says all human data for AI training ‘exhausted’
Tech entrepreneur suggests move to self-learning synthetic data created by artificial intelligence models
10:15 EST Thursday, 09 January 2025
Artificial intelligence companies have run out of data for training their models and have “exhausted” the sum of human knowledge, Elon Musk has said.
The world’s richest person suggested technology firms would have to turn to “synthetic” data – or material created by AI models – to build and fine-tune new systems, a process already taking place with the fast-developing technology.
“The cumulative sum of human knowledge has been exhausted in AI training. That happened basically last year,” said Musk in an interview livestreamed on his social media platform, X.
AI models such as the GPT-4o model powering the ChatGPT chatbot are “trained” on a vast array of data taken from the internet, where they in effect learn to spot patterns in that information – allowing them to predict, for instance, the next word in a sentence.
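As a rough illustration of what "predicting the next word" means, the sketch below counts which word most often follows each word in a tiny text. It is a toy bigram counter and not how GPT-4o or any production model works. The function names and the sample sentence are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Illustrative training text (an assumption, not taken from the article).
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)
print(predict_next_word(model, "the"))
```

The more text such a counter sees, the better its guesses become, which is one way to picture why a shortage of fresh training data matters.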
Musk said the “only way” to counter the lack of source material for training new models was to move to synthetic data created by AI.
Referring to the exhaustion of data troves, he said: “The only way to then supplement that is with synthetic data where … it will sort of write an essay or come up with a thesis and then will grade itself and … go through this process of self-learning.”
Meta, the owner of Facebook and Instagram, has used synthetic data to fine-tune its biggest Llama AI model, while Microsoft has also used AI-made content for its Phi-4 model. Google and OpenAI, the company behind ChatGPT, have also used synthetic data in their AI work.
However, Musk also warned that AI models’ habit of generating “hallucinations” – a term for inaccurate or nonsensical output – was a danger for the synthetic data process.
He told the livestreamed interview with Mark Penn, the chair of the advertising group Stagwell, that hallucinations had made the process of using artificial material “challenging” because “how do you know if it … hallucinated the answer or it’s a real answer”.
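The generate-then-grade loop Musk describes, and the hallucination problem that comes with it, can be sketched in miniature. The example below assumes a hypothetical generator that writes simple sums and gets roughly one in five wrong, plus a grader that recomputes each sum and discards the bad ones. All names, the error rate and the choice of arithmetic are assumptions for illustration. Arithmetic is easy to check, whereas an essay or a thesis is not, and that gap is the difficulty raised in the interview.

```python
import random


def noisy_generator(rng):
    """Hypothetical stand-in for a model: writes a sum and sometimes gets it wrong."""
    a, b = rng.randint(1, 99), rng.randint(1, 99)
    answer = a + b
    if rng.random() < 0.2:  # simulate a "hallucinated" answer
        answer += rng.choice([-2, -1, 1, 2])
    return {"question": f"{a} + {b}", "a": a, "b": b, "answer": answer}


def grade(sample):
    """Independent check: recompute the sum instead of trusting the generator."""
    return sample["a"] + sample["b"] == sample["answer"]


def build_synthetic_dataset(n, seed=0):
    """Generate n candidate samples and keep only those that pass the grader."""
    rng = random.Random(seed)
    candidates = [noisy_generator(rng) for _ in range(n)]
    return [sample for sample in candidates if grade(sample)]


dataset = build_synthetic_dataset(200)
print(f"kept {len(dataset)} of 200 generated samples")
```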
High-quality data, and control over it, is one of the legal battlegrounds in the AI boom. OpenAI admitted last year it would be impossible to create tools such as ChatGPT without access to copyrighted material, while the creative industries and publishers are demanding compensation for use of their output in the model training process.
Elon Musk says all human data for AI training ‘exhausted’ — Guardian US Tech entrepreneur suggests move to self-learning synthetic data created by artificial intelligence models