this post was submitted on 15 Mar 2024

[–] [email protected] 3 points 8 months ago

Obviously nobody fully knows where so much training data comes from. They used web scraping tools like there's no tomorrow, and with that amount of information you can't tell where all the training material came from. That doesn't mean the tool is unreliable, but it does mean we don't truly know why it's that good, unless you can somehow access all the layers of the digital brains operating these machines. That isn't doable with closed-source models, so we can only speculate. This is what's called a black box, and we use it because we trust the output enough to do so. Knowing the process behind each query in detail would thus be taxing. Anyway... I'm starting to see more and more AI-generated content. YouTube is slowly but surely losing significance and importance as I no longer search for information there, AI being one of the reasons for this.