NextFin News - A collaborative study by researchers from Stanford University, Imperial College London, and the Internet Archive has revealed that approximately 35% of websites created between late 2022 and mid-2025 are either fully AI-generated or significantly AI-assisted. The findings, published in a paper titled "The Impact of AI-Generated Text on the Internet," provide the first large-scale empirical evidence for the "Dead Internet Theory," which posits that the web is increasingly dominated by bot-generated content rather than human interaction.
The research team, led by data scientists including Max Spero, used the Pangram v3 detection tool to analyze a representative sample of URLs archived by the Wayback Machine. The data shows a steep climb in synthetic content, from near-zero levels before the launch of ChatGPT in November 2022 to more than one-third of newly created websites by mid-2025. While the study confirms a massive influx of AI text, it also notes a shift in the "emotional climate" of the web, finding that AI-generated sites tend to exhibit a more "aggressively positive" sentiment than human-authored content.
Max Spero, a researcher associated with the project, has previously focused on the intersection of machine learning and digital forensics. His work often highlights the structural risks of automated content systems, though he maintains that the technology itself is a neutral tool whose impact depends on deployment. Spero’s perspective leans toward caution regarding the "semantic degradation" of the web, a view that is gaining traction among digital archivists but remains a point of debate among Silicon Valley optimists who view AI as a productivity multiplier for web development.
The study’s conclusions are not universally accepted among search engine giants or digital marketing agencies. While the 35% figure is striking, some industry analysts argue that the "AI-assisted" category may capture legitimate human-led workflows that use AI only for grammar correction or structural outlines. Critics of the Dead Internet Theory, including some in the SEO (Search Engine Optimization) community, suggest that as long as content is useful to the reader, the origin of the text matters less than its accuracy. On that metric, the study found no statistically significant decline so far, despite the volume of synthetic output.
The economic implications of this shift are already surfacing in the digital advertising market. If a third of new websites are synthetic, the traditional metrics of "human traffic" and "engagement" become increasingly difficult to verify, potentially devaluing ad inventory on newer domains. The researchers observed that while factual accuracy has held steady for now, the "semantic diversity" of the internet is narrowing, as AI models tend to converge on similar linguistic patterns and "cheery" tones, effectively creating a feedback loop of homogenized information.
The reliability of these findings hinges on the continued accuracy of detection tools like Pangram v3, which must evolve as large language models become more sophisticated at mimicking human idiosyncrasies. If AI models begin to successfully bypass these detectors, the true proportion of synthetic websites could be significantly higher than the reported 35%. Conversely, if search engines like Google successfully de-index low-quality "AI farms," the incentive to create such sites may diminish, potentially reversing the trend observed in the study.
