this post was submitted on 05 Mar 2024

In an age of LLMs, is it time to reconsider human-edited web directories?

Back in the early-to-mid '90s, one of the main ways of finding anything on the web was to browse through a web directory.

These directories generally had a list of categories on their front page: News, Sport, Entertainment, Arts, Technology, Fashion, and so on.

Each of those categories had subcategories, and sub-subcategories that you clicked through until you got to a list of websites. These lists were maintained by actual humans.

Typically, these directories also had a limited web search that would crawl through the pages of websites listed in the directory.

Lycos, Excite, and of course Yahoo all offered web directories of this sort.

(EDIT: I initially also mentioned AltaVista. It did offer a web directory by the late '90s, but this was something it tacked on much later.)

By the late '90s, the standard narrative goes, the web got too big to index websites manually.

Google promised the world its algorithms would weed out the spam automatically.

And for a time, it worked.

But then SEO and SEM became a multi-billion-dollar industry. The spambots proliferated. Google itself began promoting its own content and advertisers above search results.

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

My question is, if a lot of the web is turning to crap, do we even want to search the entire web anymore?

Do we really want to search every single website on the web?

Or just those that aren't filled with LLM-generated SEO spam?

Or just those that don't feature 200 tracking scripts, and passive-aggressive privacy warnings, and paywalls, and popovers, and newsletters, and increasingly obnoxious banner ads, and dark patterns to prevent you cancelling your "free trial" subscription?

At some point, does it become more desirable to go back to search engines that only crawl pages on human-curated lists of trustworthy, quality websites?

And is it time to begin considering what a modern version of those early web directories might look like?

@degoogle #tech #google #web #internet #LLM #LLMs #enshittification #technology #search #SearchEngines #SEO #SEM

(page 2) 33 comments
[email protected] 2 points 8 months ago

@ajsadauskas @degoogle hopefully they don't look like Dmoz, because I still have unpleasant flashbacks of that dark time 😋

[email protected] 2 points 8 months ago

@ajsadauskas @degoogle A bit of Yahoo's history here; it started as a web directory: https://www.wired.com/1996/05/indexweb/

[email protected] 2 points 8 months ago

@ajsadauskas @degoogle So, classic mid-90s Yahoo. Or LookSmart, which was initially curated by Reader's Digest.

[email protected] 2 points 8 months ago

"And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially."

True, but these things can also be used by us to curate and maintain a high-quality link collection. However, I'm not sure 'pages' will be read by humans in 5 years, so I have a feeling we won't need such a collection anymore. Well, not for humans, but probably for our individual LLMs.

[email protected] 2 points 8 months ago

Just to add to your list of steps and consequences: I also think academic studies in information retrieval, indexing and crawling have become less popular. Prospective students are hearing the message that those studies and fields of work will become obsolete once AI does all that.

[email protected] 1 points 8 months ago

@ajsadauskas @degoogle I've always wanted to try or contribute to one of these!

[email protected] 1 points 8 months ago

@ajsadauskas I think GitHub's awesome lists are kind of like this. They're human-maintained catalogues of worthwhile websites on a specific topic.

[email protected] 1 points 8 months ago

@ajsadauskas Lemmy instances without comments?

[email protected] 1 points 8 months ago

@ajsadauskas @degoogle

It would be sad to go back to walled gardens like AOL, particularly since they were corporate-owned. But a sort of Kite Mark, certifying a site is free of LLMs, would be useful. Then users could choose for themselves.

[email protected] 1 points 8 months ago

@ajsadauskas sounds like you want https://curlie.org/ - which seems to be up to date and interesting.

[email protected] 1 points 8 months ago

@ajsadauskas @degoogle Ah, the good ol' days. I was a curator on Yahoo's directory for a few years, before it ended.

[email protected] 1 points 8 months ago

@ajsadauskas @degoogle It sounds a bit like Kagi's Small Web initiative and search. Have you seen it? https://blog.kagi.com/small-web
