Meta AI Model Trained on Pirated Works



Artificial intelligence (AI) models developed by Meta were trained on data from pirated works. Evidence presented in a lawsuit filed by a group of authors against Meta indicates that the company knew the data came from pirated sources.


Mark Zuckerberg himself approved the use of a dataset from Library Genesis (LibGen), which contains millions of pirated works, to train Meta's Llama AI model. The lawsuit was filed by authors including novelists Richard Kadrey and Christopher Golden and comedian Sarah Silverman. Meta previously said that training AI models on publicly accessible data should not be seen as a violation of copyright law.


Library Genesis is a digital library project that contains 2.4 million non-fiction books, 2.2 million fiction books, 80 million scientific journal articles, 2 million comics and 0.4 million magazines. The LibGen site has been blocked and shut down several times for offering free access to pirated works. Last year the pirate library was ordered to pay $30 million in damages to publishers, although no one knows who operates it.
