I wonder how effective they are. When I first heard about SSH tarpits (like endlessh) I thought it was an awesome idea. But as I started to look at some analyzed log data, it turns out they range from slightly effective to not effective at all. If simple logic can be written so that a dumb SSH bot programmed to find vulnerable SSH servers can easily avoid a tarpit, I would think it is pretty trivial for an AI crawler to do the same thing. I am interested to see some analyzed data on something like this after several months on the open internet.
The reality is that, depending on the crawling architecture, someone may actually be watching.
As aggressive as the LLM crawlers are, they still have limits, so a competently written one will have a budget for each host/site as well as a heuristic for the quality of results. It may dig for a bit and periodically return, but if your site is not one known to generate high-quality data, it may only get crawled when there isn't something better in the queue.
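The budget-plus-quality scheduling described above can be sketched roughly like this. This is a minimal illustration, not any real crawler's implementation; the host names, budget numbers, and quality scores are all made up, and a production system would update the quality estimate from the fetched content rather than hard-code it.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Host:
    # Lower priority = crawled sooner; we use negated quality so
    # high-quality hosts float to the top of the min-heap.
    priority: float
    name: str = field(compare=False)
    budget: int = field(compare=False)     # fetches allowed this cycle
    quality: float = field(compare=False)  # rolling quality estimate, 0..1

def schedule(hosts, total_fetches):
    """Greedy scheduler: always fetch from the best-quality host that
    still has budget. A low-quality host (e.g. a suspected tarpit) only
    gets crawled when nothing better remains in the queue."""
    heap = list(hosts)
    heapq.heapify(heap)
    fetched = []
    while heap and total_fetches > 0:
        host = heapq.heappop(heap)
        if host.budget <= 0:
            continue  # budget exhausted; drop host for this cycle
        host.budget -= 1
        total_fetches -= 1
        fetched.append(host.name)
        heapq.heappush(heap, host)
    return fetched

hosts = [
    Host(priority=-0.9, name="good.example", budget=2, quality=0.9),
    Host(priority=-0.1, name="tarpit.example", budget=5, quality=0.1),
]
# The high-quality host is drained first; the low-quality one is
# only visited with the fetches that are left over.
print(schedule(hosts, 4))
```

The point of the toy model is the ordering: even with budget remaining, a site with a poor quality score sits at the back of the queue, which matches the observation that such a site only gets crawled when there isn't something better available.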