Someone Made a Dataset of One Million Bluesky Posts for 'Machine Learning Research'

Nexy@lemmy.sdf.org · edited 1 month ago

KurtVonnegut@mander.xyz · 1 month ago

The same can and will happen with the Fediverse, right?

GeneralEmergency@lemmy.world · 1 month ago

Probably already happened

Viking_Hippie@lemmy.world · 1 month ago

deleted by creator

KurtVonnegut@mander.xyz · 1 month ago

I see. Probably mastodon.social gets scraped, then 🫣

ladicius@lemmy.world · 30 days ago

Is that a problem for a proper scraper? Give the machine a list of domains and some hints about the relevant protocols, and then the computer runs until the end of the list.
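
For illustration, the loop described above could be sketched roughly like this. It is only a sketch under assumptions: it treats every domain as a Mastodon-compatible server exposing the public timeline endpoint `/api/v1/timelines/public` (other Fediverse software such as Lemmy uses different paths, and some instances require authentication for it). The names `timeline_url`, `crawl`, and the injected `fetch` callable are hypothetical and not taken from any real scraper.

```python
from typing import Callable, Iterable, Iterator

# Assumption: each domain runs Mastodon-compatible software that serves
# its public timeline at this path. Other server types need other "hints".
PUBLIC_TIMELINE_PATH = "/api/v1/timelines/public?local=true&limit=20"


def timeline_url(domain: str) -> str:
    """Build the public-timeline URL for one instance domain."""
    return f"https://{domain}{PUBLIC_TIMELINE_PATH}"


def crawl(
    domains: Iterable[str],
    fetch: Callable[[str], list],
) -> Iterator[tuple[str, dict]]:
    """Walk the domain list to its end, yielding (domain, post) pairs.

    `fetch` is a hypothetical callable that takes a URL and returns the
    decoded JSON list of posts; it is passed in so the loop itself stays
    independent of any particular HTTP library.
    """
    for domain in domains:
        try:
            posts = fetch(timeline_url(domain))
        except Exception:
            # Unreachable or refusing instance: skip it and move on.
            continue
        for post in posts:
            yield domain, post
```

A real `fetch` would wrap an HTTP client and would still have to deal with rate limits, pagination, and instances that block anonymous access, none of which this sketch handles.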