NextFin News - A quiet but highly controversial experiment is unfolding across urban Indian households, where domestic gig workers are being transformed into human data-gatherers to train the next generation of global humanoid robots. At the center of this push is Human Archive, a Y Combinator-backed startup founded by UC Berkeley and Stanford dropouts, including Raj Patel. Patel and his co-founders are betting that India’s vast, low-cost service economy can solve the single greatest bottleneck in artificial intelligence: the lack of real-world physical data.
The race to build "Physical AI," the brainpower behind humanoids developed by frontier labs like Figure AI, 1X, and Physical Intelligence, has hit a wall. While large language models have successfully ingested the digital world, robots require high-fidelity, multimodal datasets of physical human actions to learn how to navigate, clean, cook, and manipulate objects. To bridge this gap, Human Archive is building a massive database of human motion, depth mapping, and egocentric video by partnering with on-demand home services platforms.
The commercial potential of this strategy has already ignited venture capital interest. Pronto, a Bengaluru-based home services startup, recently closed a $45 million Series B funding round at a $200 million valuation, co-led by General Catalyst and Glade Brook Capital, with participation from Lachy Groom. While Pronto’s chief executive, Anjali Sardana, has publicly focused on scaling the platform's core services, an internal memo from Glade Brook Capital, first reported by Entrackr, revealed a far more ambitious secondary play. According to the memo, Pronto is actively "developing a data business leveraging its workforce to capture real-world household data for robotics labs" and is already piloting real-world training data with leading physical AI labs.
This data-harvesting model is rapidly spreading across the sector. Snabbit, a direct competitor to Pronto, confirmed it was approached by Human Archive earlier this year to explore a similar data-collection partnership. Although Snabbit eventually walked away from the proposal after conducting a physical assessment of Human Archive’s technology in a controlled environment, the outreach underscores how aggressively robotics startups are targeting India's service platforms.
For the startups involved, the economics are compelling. Selling high-fidelity physical data to well-funded global robotics labs offers a high-margin revenue stream that can subsidize the low-margin, intensely competitive business of home cleaning and repair. Human Archive has also opened waitlists for remote video review work, offering gig workers a temporary digital side hustle.
Yet, this model has triggered immediate pushback from both labor advocates and privacy regulators. Critics point out a stark economic irony: low-wage Indian domestic workers are effectively being paid to train the very machines that could eventually render their jobs obsolete.
More pressing, however, are the immediate risks of data leaks and privacy violations. Sending camera-equipped workers into private homes to record daily chores inevitably captures intimate household layouts, personal belongings, and potentially sensitive private conversations. According to The Economic Times, India’s Ministry of Electronics and Information Technology has taken note of these developments, particularly around Pronto’s in-home recording pilots, as safety experts raise alarms over the lack of clear safeguards.
Pronto has claimed that all collected footage is deleted within 48 hours and remains inaccessible to anyone other than the customer, but the claim has not been independently verified and has done little to ease regulatory anxieties. The tension between the insatiable data demands of global AI developers and the sovereign privacy rights of citizens is setting up a major regulatory battleground in New Delhi. The outcome of this clash will determine whether India’s gig economy becomes the foundational infrastructure for global robotics or whether regulatory barriers shut down the physical AI pipeline before it can fully scale.
