Anyhow, not only is AI scraping (not scrubbing, that’s something completely different) Wikipedia, the Wikipedia licenses allow the AI companies to use the materials. Wikipedia text is licensed under CC BY-SA and the GFDL. So, while Wikipedia could try to block the scrapers, it can’t block the companies from using the content as long as they comply with those (very open) licenses. And, really, this is part of how I want Wikipedia to be used. Not necessarily to train up chatbots, but I want it to be a freely available, freely usable source of knowledge for the world. I like that it isn’t knowledge that’s hidden behind some firewall. And, if chatbots are going to be trained on the contents of the Internet, at least we know that some of the training data will be good, factual knowledge, not memes, lies, propaganda, etc.
So, while I’m not happy with anything where data is being sold to the AI companies, in this case I’ll try to get over my knee-jerk reaction and see it as a good thing. Wikipedia gets paid for something that was already freely available, and maybe the jazzed-up autocomplete will more frequently autocomplete from a good source.
Um… it does what?
That is a reasonable and sound response. I concur.