OpenAI Allegedly Trains GPT-4 Using Unauthorized YouTube Video Transcripts

by thecekodok - April 08, 2024

Training a new large language model (LLM) requires a lot of data. But many companies developing their latest AI are starting to struggle to obtain quality data. According to a Wall Street Journal report, OpenAI used data from 1 million YouTube videos without permission to train GPT-4.

OpenAI is said to have used Whisper, an AI that produces video transcripts. The data Whisper collected was then used to train GPT-4. This is a violation of YouTube's terms and conditions and uses the intellectual property of the site's creators without permission.

In a statement given to The Verge, OpenAI said it uses data from various public sources and obtains data that is not publicly available through partnerships.

According to the WSJ, Google does the same thing but uses only certain YouTube videos, in line with the terms and conditions agreed to by their owners.

Last week, YouTube's CEO said that if OpenAI used YouTube videos to train Sora, it would violate the site's terms and conditions. OpenAI has never admitted to using YouTube videos to train Sora. However, the use of copyrighted material to train AI was a heated issue last year. Several lawsuits have been filed by prominent authors and media companies against OpenAI for allegedly training AI on their work without permission.
