Artificial Intelligence platforms, much like puppies, require training before they can operate effectively. This involves feeding specific data to algorithms so that the system can provide accurate responses. In April, we shared that Apple was considering spending $50 million to license content from media companies like NBC News, Condé Nast, and IAC for AI training.
Recently, it has come to light that Apple and other companies utilized content from YouTube videos to train AI models without obtaining permission from the creators. A third party compiled subtitles from over 170,000 videos, including content from tech reviewer Marques Brownlee (MKBHD) and late-night hosts Stephen Colbert and Jimmy Kimmel.
According to WIRED, Silicon Valley companies including Anthropic, Nvidia, Apple, and Salesforce trained AI models on subtitles from 173,536 YouTube videos. The subtitles were downloaded by EleutherAI, whose stated aim was to provide training materials for small developers and academics.
The EleutherAI dataset, called YouTube Subtitles, contains no imagery. It consists only of the plain text of video subtitles, along with translations into languages such as Japanese, German, and Arabic. Even so, concerns arose because creators were not asked for permission to use their content for AI training.
Lawsuits have been filed against some members of the AI community over unauthorized use of content. Companies such as OpenAI and Meta have defended their actions by citing the fair use doctrine, which permits unlicensed use of copyrighted material in certain circumstances.