this post was submitted on 26 Mar 2025
sudo 2 points 6 days ago

while others could be executing real-time searches when users ask AI assistants for information.

WTF? Is this even considered AI anymore? Sounds more like a just-in-time search engine.

The frequency of these crawls is particularly telling. Schubert observed that AI crawlers "don't just crawl a page once and then move on. Oh, no, they come back every 6 hours because lol why not." This pattern suggests ongoing data collection rather than one-time training exercises, potentially indicating that companies are using these crawls to keep their models' knowledge current.

What's telling is that these scrapers aren't just downloading the git repos and parsing those. They aren't targeted in any way. They're probably doing something primitive, like following every link they see and getting caught in loops. If the labyrinth solution works, then that confirms it.
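
To illustrate what I mean by "following every link and getting caught in loops", here is a rough Python sketch. Everything in it is made up for the example: the `SITE` link map, the paths, and the `crawl` function are hypothetical and say nothing about how any real crawler is actually written. It only shows that a crawler that never remembers visited pages keeps re-fetching the same two commit pages until its budget runs out, while one that tracks visited URLs stops after five fetches.

```python
from collections import deque

# Hypothetical toy site: each page maps to the links found on it.
# "/commit/a" and "/commit/b" link to each other, forming a loop.
SITE = {
    "/": ["/repo"],
    "/repo": ["/commit/a", "/blame/a"],
    "/commit/a": ["/commit/b"],
    "/commit/b": ["/commit/a"],
    "/blame/a": [],
}


def crawl(start, dedupe, max_fetches=50):
    """Breadth-first crawl; returns the pages fetched, in order."""
    queue = deque([start])
    seen = {start}
    fetched = []
    # The fetch budget is the only thing that stops the naive variant.
    while queue and len(fetched) < max_fetches:
        page = queue.popleft()
        fetched.append(page)
        for link in SITE.get(page, []):
            if dedupe:
                if link in seen:
                    continue  # already queued or fetched, skip it
                seen.add(link)
            queue.append(link)
    return fetched


naive = crawl("/", dedupe=False)    # follows every link it sees
careful = crawl("/", dedupe=True)   # remembers where it has been

print(len(naive), "fetches for", len(set(naive)), "distinct pages")
print(len(careful), "fetches for", len(set(careful)), "distinct pages")
```

A visited set only helps against loops over a finite set of URLs, though. A labyrinth that generates fresh URLs on every page would, presumably, still trap a crawler like this.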