X’s data policy to train its AI seems like a step in the wrong direction

Photo: Bluecat_stock /

X, formerly Twitter, recently updated its privacy policy, informing users that it plans to use the information collected from users and other publicly available data to help train its machine learning and AI models. 

The news comes shortly after another update to X’s privacy policy informing users that it would now collect biometric data and users’ job and education history. 

Using publicly available data to train AI models isn’t anything new, with the most notable example being ChatGPT creator OpenAI doing the same for its GPT-3.5 and GPT-4 AI models. However, learning from past mistakes, OpenAI did license at least a portion of its datasets to avoid copyright infringements down the line. 

While publicly available data is free for everyone to access and use, training AI and ML models on it poses a few challenges that can get tricky to solve. With X already going through a barrage of changes in the year since Elon Musk’s controversial $44 billion takeover, collecting information and using it to train AI models might not be the best way forward. 

What does X’s new privacy policy say?

It all started with X changing its privacy policy to inform users that it’s expanding the amount of data it collects on them to include “biometric information” and “employment history”, as first spotted by Bloomberg.

Here’s what the updated privacy policy had to say about biometric information. Note how the policy doesn’t include details on what kind of biometric information X will collect or how.

Based on your consent, we may collect and use your biometric information for safety, security, and identification purposes.

The policy also doesn’t say why X wants this data, but one possibility is passwordless sign-ins. The platform plans to roll out passkey support, letting users sign into their accounts with their device’s fingerprint reader, facial recognition, or a PIN code. The biometric clause has already raised privacy concerns among users.

X will soon start collecting your biometric information. | Photo: Trismegist san /

And here’s what X collects about your job applications, recommendations and employment history in general:

We may collect and use your personal information (such as your employment history, educational history, employment preferences, skills and abilities, job search activity and engagement, and so on) to recommend potential jobs for you, to share with potential employers when you apply for a job, to enable employers to find potential candidates, and to show you more relevant advertising.

Shortly after, Alex Ivanovs from Stackdiary found another policy change in section 2.1, which states:

We may use the information we collect and publicly available information to help train our machine learning or artificial intelligence models for the purposes outlined in this policy.

He further pointed out a tweet (now called a post) in which Musk tells journalists that if they want “more freedom to write and a higher income”, they should publish directly on X. This is essentially a call for creators to publish helpful information exclusively on X so it can be further used to train its (and its subsidiaries’) models. 

This new policy, called the “X Privacy Policy”, goes into effect on September 29, 2023. Until then, the existing policy called the “Current Privacy Policy” will remain in place. Both policies are available on the platform’s website now, allowing users to review and compare changes.

Will other Musk-headed companies benefit from X’s dataset?

Most likely, yes. 

Musk’s latest startup, an AI company called xAI, will use X’s data for training its “maximally curious” AI systems and products — something the multi-billionaire already confirmed in a Twitter Space when sharing more information about the upcoming venture. 

Interestingly, xAI will also be collaborating with Tesla when it comes to both hardware and AI-related software. This isn’t anything new as several other Musk-headed companies, including Tesla, SpaceX and The Boring Company, have been in business with each other in the past, with some of these transactions coming to light in Tesla’s 2020 filing with the US Securities and Exchange Commission. 

Musk has also accused “every AI organisation on Earth” of using Twitter’s data for training and “in all cases illegally.” He hasn’t cited any evidence for his claims, but X has since been implementing rate limits to prevent the platform from “being scraped like crazy.”

All of this was more or less used as justification for xAI also using the micro-blogging (soon to be the “everything app”) platform’s data for training. Musk has been careful to state that the company will only use public tweets and nothing private “just like everyone else has.”

As the pieces fall into place, you’ll notice a clear link between all companies under Musk’s purview. X will use its users’ data for training AI models, and so will xAI — which is already in cahoots with Tesla. By extension, it’s not very difficult for any of Musk’s companies to have access to training data and likely other, more sensitive information collected by X. 

We’re not saying Musk has turned X into a training data farm for his AI enterprises, but he seems to have found a convenient source. None of this sits well with the fact that Musk was one of the leaders calling for a halt to AI development past GPT-4.

Should you be worried?

Yes. As an everyday user of the platform, you’re handing over biometric data and other sensitive information that will likely be used as cannon fodder for Musk’s other AI enterprises. Beyond the content you post and your interactions with other users’ content, such as comments, shares and likes, the new privacy policy states that X also collects:

How you interact with others on the platform, such as people you follow and people who follow you, metadata related to Encrypted Messages, and when you use Direct Messages, including the contents of the messages, the recipients, and date and time of messages.

Collecting some data is unavoidable, but will this metadata be used to train X’s AI models? Well, Musk says no. 

There is no mention of what AI models X will train using the data it harvests. With the definition of metadata being somewhat vague in the privacy policy, the company needs to elaborate on the type of metadata it’ll collect to put sceptics’ minds at ease. 

Meanwhile, on Friday, X updated its terms of service to ban scraping or crawling data to prevent AI models from training on the data.

Ivanovs also highlighted that, according to the updated terms, there’s a chance that users’ access to certain content might get limited or even cut off. Users might also find it harder to get their content seen by a wider audience. 

If this sounds like a shadow ban, you’re right. It’s basically the same concept. However, with AI and automation coming into the picture, it will get much more complex and prevalent. 

As mentioned, this new privacy policy, called the “X Privacy Policy”, goes into effect on September 29, 2023. It’s nice of X to have both the “current” and the new privacy policies available side by side for comparison, but if you’re willing to comb through the rather long and tedious read, there are many worrying things to find. 

Musk has done pretty much what he wanted to do with the platform, pushing changes, slashing jobs, and even changing the platform’s name and logo with no regard for opposition. With the company now training AI models on your posts, sharing your data with more third parties and collecting more information in the first place, it’s time to be more careful than ever about what data you put in X’s hands. 


Yadullah Abidi

Yadullah is a Computer Science graduate who writes/edits/shoots/codes all things cybersecurity, gaming, and tech hardware. When he's not, he streams himself racing virtual cars. He's been writing and reporting on tech and cybersecurity with websites like Candid.Technology and MakeUseOf since 2018. You can contact him here: [email protected].