
Microsoft’s AI research division leaked 38TB of data over three years


Microsoft’s AI researchers have inadvertently exposed about 38TB of sensitive data, including personal backups, passwords, secret keys, internal communications, and over 30,000 internal Microsoft Teams messages from 359 employees, due to a misconfigured Shared Access Signature (SAS) token. The leaked URL had been exposing the data since July 2020.

The Wiz Research Team was the first to flag the incident after discovering an exposed GitHub repository belonging to Microsoft’s AI research division. The ‘robust-models-transfer’ repository provided open-source code and AI models for image recognition. However, a critical oversight made far more than the open-source models accessible through the provided Azure Storage URL.

The SAS token associated with the URL granted excessive permissions, including full control access to the entire storage account.
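For context, a SAS token can be scoped far more narrowly than the full-control account token described in the report. The snippet below is a minimal sketch, assuming the Python azure-storage-blob SDK and placeholder account and container names, of issuing a read-only, time-limited SAS for a single container rather than the whole storage account.

```python
from datetime import datetime, timedelta
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

# Placeholder values; not the account involved in the incident.
ACCOUNT_NAME = "examplestorageacct"
CONTAINER = "models"

def issue_readonly_sas(account_key: str) -> str:
    """Return a SAS URL limited to read/list on one container for seven days."""
    sas = generate_container_sas(
        account_name=ACCOUNT_NAME,
        container_name=CONTAINER,
        account_key=account_key,
        permission=ContainerSasPermissions(read=True, list=True),  # no write/delete
        expiry=datetime.utcnow() + timedelta(days=7),               # short-lived
    )
    return f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}?{sas}"
```

A token scoped like this limits what a leaked URL can expose; the one embedded in the repository instead covered the entire storage account with full permissions.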

As the repository’s purpose was to provide AI models for image recognition, users were instructed to download the model data files from the SAS link and utilise them in their scripts. The files were in the ckpt format, which is pickle-based and therefore susceptible to arbitrary code execution, meaning attackers could have injected malicious code into the AI models.
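To illustrate the risk, pickle-based checkpoint formats execute whatever callables the file specifies when they are deserialised. The following is a generic sketch of how a booby-trapped object runs a command at load time; it uses the standard-library pickle module for brevity and is not the actual payload format involved in this incident.

```python
import os
import pickle

class MaliciousPayload:
    """On unpickling, __reduce__ tells pickle to call os.system with a command."""
    def __reduce__(self):
        return (os.system, ("echo arbitrary code ran while loading the model",))

# An attacker bundles the payload inside what looks like ordinary model data.
checkpoint = pickle.dumps({"weights": [0.1, 0.2, 0.3], "extra": MaliciousPayload()})

# A victim deserialising the file (e.g. via a pickle-backed .ckpt loader) runs the command.
pickle.loads(checkpoint)
```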

A sample of leaked sensitive files. | Source: Wiz Research Team

After discovering the leak, Wiz reported its findings to Microsoft on June 22, and Microsoft invalidated the SAS token on June 24. The token was replaced on GitHub on July 7, and after completing internal investigations by August 16, Microsoft made a full public disclosure on September 18.

Although the leaked data amounts to 38TB, Microsoft maintains that no customer data was exposed and no internal services were put at risk. The company adds that no customer action is required in response to the leak.

“After identifying the exposure, Wiz reported the issue to the Microsoft Security Response Center (MSRC) on June 22, 2023. Once notified, MSRC worked with the relevant research and engineering teams to revoke the SAS token and prevent all external access to the storage account, mitigating the issue on June 24, 2023. Additional investigation then took place to understand any potential impact on our customers and/or business continuity. Our investigation concluded that there was no risk to customers as a result of this exposure,” responded the Microsoft Security Response Center (MSRC).



Kumar Hemant

Deputy Editor at Candid.Technology. Hemant writes at the intersection of tech and culture and has a keen interest in science, social issues and international relations. You can contact him here: [email protected]
