The AI company Perplexity is complaining their bots can't bypass Cloudflare's firewall (www.searchenginejournal.com)
Posted by Davriellelouna@lemmy.world to Technology@lemmy.world · English · 2 days ago (edited) · 231 comments
Electricd@lemmybefree.net · 18 hours ago (edited)
They do have a point though. It would be great to let per-prompt searches go through, but not mass scraping.
I believe a lot of websites don't want either, though.

threeganzi@sh.itjust.works · 15 hours ago
Doesn't it need to be scraped to be indexed, assuming it's semi-typical RAG stuff?

Electricd@lemmybefree.net · 14 hours ago
I assume their script does some search engine stuff, like querying Google or Bing, and then scrapes the links it visits. Some Selenium stuff.
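
For anyone wondering what that guessed "search, then scrape the result links" flow might look like, here is a minimal sketch. It is not Perplexity's actual code, which nobody in this thread has seen. The `search` and `fetch` callables are hypothetical stand-ins for a search-engine query and a page download, `answer_context` and `html_to_text` are made-up names, and Python's standard-library `html.parser` is swapped in for Selenium to keep the example self-contained.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html):
    """Reduce an HTML document to its visible text, joined by spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def answer_context(query, search, fetch, max_pages=3):
    """Per-prompt retrieval: one query, a handful of pages, no bulk crawl.

    search(query) -> iterable of URLs   (hypothetical, supplied by caller)
    fetch(url)    -> HTML string        (hypothetical, supplied by caller)
    """
    pages = {}
    for url in list(search(query))[:max_pages]:
        pages[url] = html_to_text(fetch(url))
    return pages
```

The `max_pages` cap is the point of the sketch: a per-prompt lookup touches only a few pages per user question, whereas mass scraping would walk a whole site. From the website's side, though, both show up as automated requests.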