N A B I L . O R G
Technology - August 12, 2025

Reddit Limits Access to Internet Archive Due to AI Companies Scraping User Data

Reddit has restricted the Internet Archive’s access to its platform after discovering that several AI companies were using the Wayback Machine to obtain Reddit user data without paying for it, according to multiple reports.

The Internet Archive, a non-profit digital library dedicated to preserving web content and promoting universal access to knowledge, came under scrutiny after it emerged that some AI firms had violated Reddit’s platform policies by scraping Reddit data from the Wayback Machine.

Reddit officials declined to name the offending companies but said the company was taking steps to keep the Wayback Machine from enabling such activity. Effective immediately, the digital library will no longer be permitted to crawl post detail pages, user comments, or profiles. It will only be able to archive Reddit’s homepage, so its snapshots will show little more than which posts were most popular on a given day.

Reddit has informed the Internet Archive of these restrictions and intends to maintain them until the digital library can demonstrate its ability to adhere to platform policies, particularly in relation to user privacy and content deletion.

In recent years, Reddit has shown a willingness to monetize its user data, as evidenced by its $60 million-a-year licensing agreement with Google and a similar arrangement with OpenAI. It has also moved against companies that take the data without a deal: the company recently took legal action against Anthropic, alleging that its bots accessed the platform without authorization more than 100,000 times.

The Internet Archive remains optimistic about resolving the issue amicably. Mark Graham, director of the Wayback Machine, expressed hope that dialogue between the two parties will continue.