
Microsoft investigates DeepSeek’s alleged OpenAI data misuse


Microsoft is probing a suspected case of unauthorised OpenAI data extraction linked to the new kid on the AI block, DeepSeek, a China-based AI startup. Ironically, OpenAI itself has been accused of copyright violations by various media outlets in the United States and other countries, including India and France.

Microsoft’s security team detected what they believe to be individuals connected to DeepSeek exfiltrating large amounts of data via OpenAI’s application programming interface (API) last fall.

The API, which software developers can license to integrate OpenAI’s proprietary AI models into their applications, appears to have been used in a way that could violate OpenAI’s terms of service.
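
For context, developers typically reach these models through OpenAI’s official SDKs. The snippet below is a minimal illustrative sketch of that kind of integration, assuming the current Python SDK; the model name and prompt are placeholders, not details from the investigation.

```python
# Illustrative only: a minimal example of the kind of API integration
# described above, using OpenAI's official Python SDK.
# The model name and prompt are placeholders, not details tied to
# the reported incident.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "user", "content": "Summarise today's AI news in one sentence."}
    ],
)

print(response.choices[0].message.content)
```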

According to Bloomberg, investigators are considering whether DeepSeek employed methods to bypass OpenAI’s usage restrictions, thereby accessing more data than permitted.

Microsoft, OpenAI’s largest investor and strategic technology partner, flagged the issue to OpenAI, raising concerns over potential data misuse. Neither company has officially commented on the specifics of the investigation. DeepSeek and its affiliated hedge fund, High-Flyer, have also not responded to inquiries.

The controversy gains significance in light of DeepSeek’s recent release of R1, an open-source AI model that claims to rival and, in some cases, outperform models developed by U.S. tech giants such as OpenAI, Google, and Meta.

DeepSeek challenged the dominance of US companies in the artificial intelligence software sector.

DeepSeek asserts that R1 was developed at a fraction of the cost of its Western counterparts while achieving superior performance on benchmarks assessing mathematical reasoning and general knowledge.

Adding to the controversy, David Sacks, President Donald Trump’s AI czar, stated on Fox News that there is “substantial evidence” that DeepSeek leveraged OpenAI’s outputs to develop its technology. Sacks referred to a process known as distillation, wherein an AI model is trained using the outputs of another to replicate its capabilities. However, he did not provide specific details to substantiate these claims.
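
In broad terms, distillation uses one model’s outputs as training targets for a second, usually smaller, model. The sketch below illustrates that general idea with PyTorch, using a toy student network and random stand-in “teacher” outputs; it is an assumption-laden illustration, not a description of how DeepSeek or OpenAI actually build their models.

```python
# Illustrative sketch of knowledge distillation in general terms,
# using PyTorch. The "teacher" logits here are random stand-ins;
# this is not a description of DeepSeek's or OpenAI's systems.
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Linear(16, 4)                  # toy "student" model
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0

inputs = torch.randn(32, 16)                # stand-in input batch
with torch.no_grad():
    teacher_logits = torch.randn(32, 4)     # stand-in for a larger model's outputs

# Train the student to match the teacher's softened output distribution.
for _ in range(100):
    student_logits = student(inputs)
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final distillation loss: {loss.item():.4f}")
```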

In response to Sacks’ allegations, OpenAI did not directly address the accusations but acknowledged the broader threat posed by AI model distillation.

“As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe as we go forward that it is critically important that we are working closely with the U.S. government to best protect the most capable models from efforts by adversaries and competitors to take US technology,” OpenAI said.


Kumar Hemant


Deputy Editor at Candid.Technology. Hemant writes at the intersection of tech and culture and has a keen interest in science, social issues and international relations. You can contact him here: kumarhemant@pm.me
