About one-third of new websites on the internet are AI-generated. Has an era arrived in which lies are hard to spot unless you already know how to see through them?



Researchers analyzing data from the Internet Archive found that, as of mid-2025, approximately one-third of newly created websites were classified as containing AI-generated or AI-assisted text. The study was conducted by researchers from Stanford University, Imperial College London, and the Internet Archive, and the findings were published in a paper titled 'The Impact of AI-Generated Text on the Internet.'

The Impact of AI-Generated Text on the Internet

https://ai-on-the-internet.github.io/



Study Finds A Third of New Websites are AI-Generated
https://www.404media.co/study-finds-a-third-of-new-websites-are-ai-generated/


According to the research team, by mid-2025, approximately 35% of newly published websites were identified as containing AI-generated or AI-assisted text. Because such sites were almost nonexistent before ChatGPT was released to the public at the end of 2022, this indicates that AI-generated and AI-assisted text has spread rapidly across the web in roughly three years.



The research is based on the idea known as the 'dead internet theory,' which suggests that much of the conversation and many of the articles on the internet are created not by humans, but by bots or automatically generated content. Inspired by the dead internet theory, the research team investigated how much text has been replaced by AI-generated or AI-assisted text due to the proliferation of ChatGPT and competing generative AI services, and what impact this is having on text on the web as a whole.

The study used a sample of websites published between August 2022 and May 2025, a period of 33 months. The research team extracted the web page text by obtaining HTML from the oldest snapshots stored in the Wayback Machine. They then compared several AI detection tools and used 'Pangram v3,' which showed the most stable results, to determine whether the text was AI-generated or AI-assisted.
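As a rough illustration of the snapshot-retrieval step described above, the following minimal sketch looks up the earliest capture of a site through the Wayback Machine's public CDX endpoint. It is not the research team's code: the function names, the status-code filter, and the choice of the raw-content `id_` URL form are assumptions made for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine CDX endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(site_url: str) -> str:
    """Build a CDX query URL asking for the earliest capture of site_url."""
    params = {
        "url": site_url,
        "output": "json",
        # CDX results are ordered oldest first, so limit=1 is the earliest capture.
        "limit": 1,
        # Assumption for this sketch: only keep captures that returned HTTP 200.
        "filter": "statuscode:200",
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def oldest_snapshot_url(cdx_json: str):
    """Turn a CDX JSON response into a snapshot URL, or None if there is no capture."""
    rows = json.loads(cdx_json)
    if len(rows) < 2:  # the first row is the header; no second row means no capture
        return None
    record = dict(zip(rows[0], rows[1]))
    # The "id_" suffix requests the archived page without Wayback's rewriting.
    return f"https://web.archive.org/web/{record['timestamp']}id_/{record['original']}"


def fetch_oldest_html(site_url: str):
    """Hypothetical helper: download the HTML of the earliest capture (needs network)."""
    with urlopen(build_cdx_query(site_url), timeout=30) as resp:
        snapshot = oldest_snapshot_url(resp.read().decode("utf-8"))
    if snapshot is None:
        return None
    with urlopen(snapshot, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

In a pipeline like the one the article describes, the text extracted from this HTML would then be passed to an AI-text detector. That step is omitted here because the article does not describe the detector's interface.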

The research team examined six common concerns regarding AI-generated and AI-assisted text. Specifically, these concerns included: 'Will the range of perspectives and content narrow?', 'Will misinformation increase?', 'Will the writing become overly cheerful and bland?', 'Will source links decrease?', 'Will there be an increase in texts with little information?', and 'Will individual writing styles disappear, leaving only uniform texts?'

The analysis clearly supported only two of these concerns: 'a narrowing of perspectives and content' and 'overly cheerful and bland writing.' The study found that as AI-generated and AI-assisted text increases, web texts tend to become semantically similar, and there is a general trend towards more positive expressions. On the other hand, no clear impact was observed on the number of misinformation sources, the density of external links, information density, or stylistic individuality.



Researcher Jonas Dreshal said the most surprising finding was the failure to confirm the hypothesis that 'AI-generated and AI-assisted text increases verifiable misinformation.' However, it remains possible that claims that are difficult to verify with existing fact-checking methods have increased, or that the internet was never a place faithful to facts in the first place.

The research team plans to collaborate with the Internet Archive to create a tool that will continuously measure the impact of AI-generated and AI-assisted text on the web. By analyzing the impact by category and language, they aim to investigate in detail which types of websites are most strongly affected by AI-generated and AI-assisted text.

in AI, Posted by log1d_ts