• merc@sh.itjust.works
    13 hours ago

    My guess is that this was necessary because the AI companies have already downloaded the offline versions of Wikipedia. But they think they can one-up their competition by having “fresher data,” so they either hammer the download servers, pulling the 25 GB full offline dump multiple times a day just in case it changed, or they crawl and scrape Wikipedia directly so they get the data before it makes it into the next offline dump, or something.

    It wouldn’t be hard for Wikipedia to provide them with a feed of the changes going into the Wikipedia database, so they get the data as fresh as it can possibly be. Plus, doing this most likely reduces the antisocial behaviour the AI companies would otherwise engage in to get their fresh data. Win-win, even if it sucks to give these AI companies a win.
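
    For what it's worth, Wikimedia already exposes something like this: a public recent-changes event stream. Below is a minimal sketch of how a consumer might follow such a feed, assuming an SSE endpoint shaped like Wikimedia's EventStreams `recentchanges` stream; the URL and JSON field names (`wiki`, `title`) reflect that stream but should be treated as illustrative rather than a definitive client.

```python
# Minimal sketch: following a Wikipedia change feed delivered as Server-Sent Events.
# Assumes an endpoint like Wikimedia's EventStreams "recentchanges" stream;
# the URL and payload fields below are assumptions for illustration.
import json
import requests

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchanges"

def follow_changes(url: str = STREAM_URL):
    """Yield change events as dicts, one per edit, as they are published."""
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            # SSE frames arrive as "data: {...json...}" lines; skip keep-alives.
            if line and line.startswith("data:"):
                try:
                    yield json.loads(line[len("data:"):])
                except json.JSONDecodeError:
                    continue  # partial or malformed frame; ignore and move on

if __name__ == "__main__":
    for event in follow_changes():
        # Each event describes one change; print which wiki and page it touched.
        print(event.get("wiki"), event.get("title"))
```

    A consumer polling a feed like this gets every edit within seconds of it happening, which makes re-downloading the full dump (or scraping pages) to stay "fresh" pointless.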