Harvard to Offer Dataset of Nearly a Million Books to Train AI



Harvard University today announced that it will release a high-quality dataset of nearly a million books, which anyone can use to train their own artificial intelligence models. Harvard has also received funding from Microsoft and OpenAI.


The dataset will include public-domain books spanning a variety of genres and languages.


The offering is expected to serve as a foundation and make it easier for startups to build and train their own models. However, as always, getting the most out of a model requires additional training datasets, which developers can add over time.
