• tempest@lemmy.ca
    English
    arrow-up
    17
    ·
    16 hours ago

    The reality is that, depending on the crawling architecture, someone is watching.

    As aggressive as the LLM crawlers are, they still have limits, so a competently written one will have a budget for each host/site as well as a heuristic for the quality of results. It may dig for a bit and periodically return, but if your site is not one that is known to generate high-quality data, it may only get crawled when there isn't something better in the queue.
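
    To make that concrete, here is a rough sketch of what a per-host budget plus a quality heuristic could look like. This is not how any particular crawler works; `Frontier`, `HostState`, the default budget of 3, and the smoothing factor `alpha` are all names and numbers I made up for illustration.

```python
import heapq
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class HostState:
    budget: int            # fetches left for this host in the current cycle (hypothetical unit)
    quality: float = 0.5   # running estimate in [0, 1] of how useful this host's pages are


@dataclass
class Frontier:
    """Toy crawl queue: best-scoring hosts first, each host capped by a budget."""
    default_budget: int = 3
    hosts: dict = field(default_factory=dict)
    _heap: list = field(default_factory=list)
    _counter: int = 0      # tie-breaker so equal-quality URLs pop in insertion order

    def _host(self, url):
        name = urlparse(url).netloc
        return self.hosts.setdefault(name, HostState(self.default_budget))

    def add(self, url):
        state = self._host(url)
        # heapq is a min-heap, so negate quality to pop the best host first.
        # Priority is fixed at enqueue time; later score changes do not reorder.
        heapq.heappush(self._heap, (-state.quality, self._counter, url))
        self._counter += 1

    def next_url(self):
        while self._heap:
            _, _, url = heapq.heappop(self._heap)
            state = self._host(url)
            if state.budget > 0:
                state.budget -= 1
                return url
            # Budget spent: this sketch drops the URL; a real crawler
            # would more likely defer it to a later cycle.
        return None

    def record_result(self, url, score, alpha=0.3):
        # Exponential moving average of page scores for the host.
        state = self._host(url)
        state.quality = (1 - alpha) * state.quality + alpha * score


frontier = Frontier(default_budget=2)
frontier.record_result("https://good.example/a", 1.0)  # host rated useful
frontier.record_result("https://meh.example/a", 0.0)   # host rated poor
for u in ["https://meh.example/1", "https://good.example/1",
          "https://good.example/2", "https://good.example/3"]:
    frontier.add(u)

order = []
while (u := frontier.next_url()) is not None:
    order.append(u)
```

    In this toy version the low-scoring host only gets fetched after the better host has used up its budget, which is the "only when there isn't something better in the queue" behaviour described above.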