Wikipedia Offers Some Data to Train Artificial Intelligence

Wikimedia Enterprise is now offering some of Wikipedia's datasets to companies that want to use them to train artificial intelligence (AI) models. It is working with Kaggle, a Google subsidiary, to offer selected datasets in English and French.

The data has been optimized for model training by leaving out the links and text formatting code found in articles on Wikipedia. The move to offer the dataset comes after the site's traffic was hit hard by bots scraping articles to train models without permission. Last month, Wikipedia said that traffic for multimedia content increased by 50% last year due to bot activity.

Kaggle will pay Wikimedia Enterprise for the use of the data. At the same time, all data used must be attributed under the Creative Commons Attribution-ShareAlike 4.0 license and the GNU Free Documentation License (GFDL).
