Tumblr and WordPress prepare to sell user data for AI training

Tumblr and WordPress.com, both owned by Automattic, are reportedly preparing to sell user data to AI companies Midjourney and OpenAI, a disclosure that has sparked a privacy controversy.

While the exact nature of the data and the deal’s specifics remain unclear, internal documents and communications reviewed by 404 Media indicate that these transactions are imminent.

According to an internal post by Cyle Gage, a product manager at Tumblr, a query designed to prepare data for OpenAI and Midjourney unintentionally compiled many user posts that it wasn’t supposed to. The compiled data reportedly included private posts on public blogs, deleted or suspended blogs, unanswered asks, private answers, posts from blogs of premium partners, including Apple, and potentially explicit or mature content.

Gage’s post suggests that engineers are actively working to rectify this issue by compiling a list of post IDs that should not have been included.

The parent company, Automattic, plans to introduce a new setting on Wednesday that allows users to opt out of data sharing with third parties, including AI companies. This move comes in response to the growing concerns over user data privacy.

The company aims to block crawlers from accessing content for users who opt out and intends to inform partners about users who newly opt out, requesting the removal of their content from past sources and future training.

The company released a statement titled ‘Protecting user choice’, declaring, “Like other tech companies, we’re closely following these advancements, including how to work with AI companies in a way that respects our users’ preferences.”

The statement highlights Automattic’s collaboration with select AI companies, aligning their plans with community values of attribution, opt-outs, and control. However, the statement does not clarify whether self-hosted WordPress blogs using Automattic plugins, like Jetpack, are subject to AI-scraping deals.

Another internal document, dated February 23, records an employee’s query about notifying existing data partners when a user opts out of data sharing. Responding to the query, Automattic’s head of AI, Andrew Spittle, said the company intends to advocate for excluding past content based on users’ current preferences and to seek its deletion and removal from future training runs.

The rumours about Tumblr’s deal with the AI companies began circulating when a Tumblr employee wrote a post about the deal.

Several companies, including Shutterstock and Reddit, have finalised deals with AI companies. Currently, no law prohibits companies from signing such deals, and employees and netizens remain unsure about the issue.

Kumar Hemant

Deputy Editor at Candid.Technology. Hemant writes at the intersection of tech and culture and has a keen interest in science, social issues and international relations. You can contact him here: kumarhemant@pm.me