
The Bengaluru Live | Bengaluru News, Breaking Updates


What is ‘model collapse’? An expert explains the rumours about an impending AI doom

19 August 2024 5 minutes read

Queensland: Artificial intelligence (AI) prophets and newsmongers are forecasting the end of the generative AI hype, with talk of an impending catastrophic “model collapse”.

But how realistic are these predictions? And what is model collapse anyway? Discussed in 2023, but popularised more recently, “model collapse” refers to a hypothetical scenario where future AI systems get progressively dumber due to the increase of AI-generated data on the internet.

The need for data

Modern AI systems are built using machine learning. Programmers set up the underlying mathematical structure, but the actual “intelligence” comes from training the system to mimic patterns in data.

But not just any data. The current crop of generative AI systems needs high quality data, and lots of it.

To source this data, big tech companies such as OpenAI, Google, Meta and Nvidia continually scour the internet, scooping up terabytes of content to feed the machines. But since the advent of widely available and useful generative AI systems in 2022, people are increasingly uploading and sharing content that is made, in part or whole, by AI.

In 2023, researchers started wondering if they could get away with relying only on AI-created data for training, instead of human-generated data.

There are huge incentives to make this work. In addition to proliferating on the internet, AI-made content is much cheaper to source than human data. Nor is it ethically or legally questionable to collect en masse.

However, researchers found that without high-quality human data, AI systems trained on AI-made data get dumber and dumber as each model learns from the previous one. It’s like a digital version of the problem of inbreeding.

This “regurgitive training” seems to lead to a reduction in the quality and diversity of model behaviour. Quality here roughly means some combination of being helpful, harmless and honest. Diversity refers to the variation in responses, and which people’s cultural and social perspectives are represented in the AI outputs.
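
The loss of diversity can be illustrated with a deliberately crude toy, sketched below in Python. It is not the method used in the research. The "model" here is only a stand-in that memorises its training set and produces a new dataset of the same size by sampling from it with replacement, and the function name `regurgitive_training` is made up for this illustration. Because each generation can only repeat what the previous one produced, the number of distinct items can never go up.

```python
import random


def regurgitive_training(human_data, generations, seed=0):
    """Toy illustration only: each 'model' memorises its training set and
    emits a same-sized dataset by sampling from it with replacement."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    data = list(human_data)
    distinct_counts = [len(set(data))]  # diversity of the original human data
    for _ in range(generations):
        # The next model trains only on the previous model's output.
        data = rng.choices(data, k=len(data))
        distinct_counts.append(len(set(data)))
    return distinct_counts


# 100 distinct "human" items, then 30 generations of models trained on models.
counts = regurgitive_training(range(100), generations=30)
print(counts)
```

Real generative models are far more complex than this, but the sketch shows the basic mechanism: anything that one generation fails to reproduce is unavailable to every generation after it.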

In short: by using AI systems so much, we could be polluting the very data source we need to make them useful in the first place.

Avoiding collapse

Can’t big tech just filter out AI-generated content? Not really. Tech companies already spend a lot of time and money cleaning and filtering the data they scrape, with one industry insider recently sharing they sometimes discard as much as 90 per cent of the data they initially collect for training models.

These efforts might get more demanding as the need to specifically remove AI-generated content increases. But more importantly, in the long term it will actually get harder and harder to distinguish AI content. This will make the filtering and removal of synthetic data a game of diminishing (financial) returns.

Ultimately, the research so far shows we just can’t completely do away with human data. After all, it’s where the “I” in AI is coming from.

Are we headed for a catastrophe?

There are hints developers are already having to work harder to source high-quality data. For instance, the documentation accompanying the GPT-4 release credited an unprecedented number of staff involved in the data-related parts of the project.

We may also be running out of new human data. Some estimates say the pool of human-generated text data might be tapped out as soon as 2026.

This is likely why OpenAI and others are racing to shore up exclusive partnerships with industry behemoths such as Shutterstock, Associated Press and NewsCorp. They own large proprietary collections of human data that aren’t readily available on the public internet.

However, the prospects of catastrophic model collapse might be overstated. Most research so far looks at cases where synthetic data replaces human data. In practice, human and AI data are likely to accumulate in parallel, which reduces the likelihood of collapse.
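
The difference between replacing and accumulating data can be sketched with a small Python toy. This is an illustration under strong simplifying assumptions, not a reproduction of any study: the "model" simply samples with replacement from its training pool, and the function name `training_pool_diversity` and its `accumulate` flag are hypothetical. With `accumulate=False`, each generation trains only on the previous generation's output. With `accumulate=True`, the original human data stays in the pool and synthetic data is added alongside it.

```python
import random


def training_pool_diversity(human_data, generations, accumulate, seed=0):
    """Toy illustration only: track how many distinct items remain in the
    pool that each successive 'model' is trained on."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    human = list(human_data)
    pool = list(human)
    diversity = [len(set(pool))]
    for _ in range(generations):
        # The current model emits synthetic data by resampling its pool.
        synthetic = rng.choices(pool, k=len(human))
        if accumulate:
            pool = pool + synthetic  # human data is kept, synthetic is added
        else:
            pool = synthetic  # synthetic data replaces everything before it
        diversity.append(len(set(pool)))
    return diversity


replaced = training_pool_diversity(range(100), 30, accumulate=False)
accumulated = training_pool_diversity(range(100), 30, accumulate=True)
print("replace:   ", replaced[-1], "distinct items left in the pool")
print("accumulate:", accumulated[-1], "distinct items left in the pool")
```

In this toy the accumulating pool never loses an item, simply because the human data is never thrown away. Real systems are messier, but the contrast mirrors why accumulation is expected to make collapse less likely.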

The most likely future scenario will also see an ecosystem of somewhat diverse generative AI platforms being used to create and publish content, rather than one monolithic model. This also increases robustness against collapse.

It’s a good reason for regulators to promote healthy competition by limiting monopolies in the AI sector, and to fund public interest technology development.

The real concerns

There are also more subtle risks from too much AI-made content.

A flood of synthetic content might not pose an existential threat to the progress of AI development, but it does threaten the digital public good of the (human) internet.

For instance, researchers found a 16 per cent drop in activity on the coding website StackOverflow one year after the release of ChatGPT. This suggests AI assistance may already be reducing person-to-person interactions in some online communities.

Hyperproduction from AI-powered content farms is also making it harder to find content that isn’t clickbait stuffed with advertisements.

It’s becoming impossible to reliably distinguish between human-generated and AI-generated content. One method to remedy this would be watermarking or labelling AI-generated content, as I and many others have recently highlighted, and as reflected in recent Australian government interim legislation.

There’s another risk, too. As AI-generated content becomes systematically homogeneous, we risk losing socio-cultural diversity and some groups of people could even experience cultural erasure. We urgently need cross-disciplinary research on the social and cultural challenges posed by AI systems.

Human interactions and human data are important, and we should protect them. For our own sakes, and perhaps also to guard against the risk of a future model collapse. (The Conversation)

© Copyright 2025 The Bengaluru Live. All rights reserved.