
Open-source IndiaAI Datasets Platform will launch in January 2025


The IndiaAI Datasets Platform, a cornerstone of the government’s Rs. 10,000 crore IndiaAI Mission, is set to go live by January 2025. Announced on Wednesday, this initiative aims to provide an AI development ecosystem similar to that of HuggingFace, a U.S.-based open-source collaborative platform for sharing AI models and datasets.

First reported by The Economic Times, the IndiaAI Datasets Platform will house data from central and state governments, as well as private entities, enabling developers and researchers to create, train, and deploy AI models.

Nand Kumarum, CEO of the National e-Governance Division (NeGD), said the initiative aims to build an Indian counterpart to HuggingFace. HuggingFace is often called the GitHub of machine learning because it allows developers to create and test ML models.

“The idea primarily is like HuggingFace — you have models, you have datasets, and you have people coming up and using those datasets and building models. We are trying to do something similar,” Kumarum said.

Although the platform is still in its early stages, the goal is to establish a repository of datasets and AI models that could be leveraged by developers across sectors. The skeletal framework of the platform is expected to be operational by the end of January 2025.

IndiaAI Datasets Platform is open-source, like HuggingFace, and aims to provide researchers with datasets and repositories.

However, Kumarum acknowledged that building a large-scale repository on par with global platforms like HuggingFace would take time.

The IndiaAI Datasets Platform will integrate data from multiple sources, including central and state governments and the private sector. The initiative is part of a broader effort to accelerate AI adoption in India and make the country a hub for AI development.

While private sector partnerships are still being finalised, Kumarum emphasised that work is progressing in that direction.

This platform is one of the seven pillars under the IndiaAI Mission launched by the Ministry of Electronics and Information Technology (MeitY) in March 2024.

Separately, MediaNama reports that the Ministry of Science and Technology has unveiled BharatGen, a government-funded large language model (LLM) catering to Indian languages. At least two private entities, Tech Mahindra and Ola, are also developing LLMs.

Kumarum highlighted the potential applications of generative AI within government operations. He pointed out that AI could streamline processes such as drafting requests for proposals (RFPs) or schemes by learning from past documents. This would improve productivity and provide a more efficient way to compare policy frameworks across states.


Kumar Hemant


Deputy Editor at Candid.Technology. Hemant writes at the intersection of tech and culture and has a keen interest in science, social issues and international relations. You can contact him here: kumarhemant@pm.me
