According to a report by 404 Media, Nvidia admitted to using a large amount of training data from various sources for its AI projects. Former employees revealed that they were asked to scrape video content from platforms like Netflix and YouTube to gather training data for Nvidia’s AI products including Omniverse 3D, self-driving cars, and digital human development.
Despite concerns about the legality of the project, named “Cosmos,” Nvidia management assured employees that the use of the scraped content had been cleared by top company officials.
The Cosmos project aimed to create a foundation model similar to Gemini 1.5, GPT-4, or Llama 3.1 that could power a range of applications important to Nvidia. It used open-source tools and machine learning techniques to download vast amounts of video content daily.
Nvidia defended its actions, stating that they complied with copyright laws and fair use principles. However, this incident is not the first time Nvidia has been involved in controversial scraping practices in the AI field.
At CES 2024, Nvidia faced scrutiny over its generative AI engine for games, but maintained that its tools were legally sound.