
OpenAI denies stealing Indian media data for ChatGPT training


Photo: Camilo Concha/Shutterstock.com

OpenAI has filed a legal response seeking to prevent major Indian media groups, including those linked to business magnates Gautam Adani and Mukesh Ambani, from joining an ongoing copyright lawsuit. The Microsoft-backed AI company says it does not use these media outlets’ content to train ChatGPT.

The case stems from a lawsuit filed last year by Indian news agency ANI, which accused OpenAI of using its published content without authorisation to train its artificial intelligence chatbot.

The Delhi High Court then appointed two experts in the ANI vs OpenAI case. OpenAI, for its part, asked the court to dismiss the copyright allegations.

Since then, several Indian media organisations, including Adani’s NDTV, the Indian Express, the Hindustan Times, and the Digital News Publishers Association (DNPA), which represents Network18 and other outlets, have sought to join the case, alleging that OpenAI scrapes their websites and reproduces their work without permission.

However, OpenAI’s February 11 filing, which was reviewed by Reuters, states that the company has not used content from these organisations to train its AI models. The company also argues that it is under no obligation to enter into licensing agreements with these media houses for content that is publicly accessible.

This case is part of a wider trend in which news publishers have sued AI firms over data scraping. | Photo: Tada Images / Shutterstock.com

In its filing, OpenAI reiterated its stance that it builds AI models using publicly available data in compliance with fair use principles and established legal precedents. The company also emphasised that its existing global partnerships with news publishers do not necessarily involve licensing agreements for AI training.

The argument counters allegations from Indian media groups that OpenAI has not offered licensing deals in India similar to those it has signed in other countries. Instead, OpenAI asserts that using publicly available content is legally permissible under Indian copyright law.

The case is part of a larger wave of legal challenges worldwide, in which authors, news organisations, and musicians have accused AI firms of using copyrighted material without permission or compensation. While OpenAI has entered into agreements with media outlets such as Hearst, Condé Nast, The Atlantic, Vox Media, and News Corp, the lack of similar agreements in India has become a point of contention.

The copyright dispute is set to be heard in New Delhi next week. Meanwhile, OpenAI CEO Sam Altman recently visited India, meeting with the country’s IT minister to discuss plans for a low-cost AI ecosystem.


Kumar Hemant

Deputy Editor at Candid.Technology. Hemant writes at the intersection of tech and culture and has a keen interest in science, social issues and international relations. You can contact him here: kumarhemant@pm.me
