this post was submitted on 01 Apr 2025
253 points (97.7% liked)

[–] [email protected] 5 points 2 days ago (12 children)

Prove it. Please, show me the full training data to guarantee you're right.

But also, all the kids used for "kids face data" didn't sign up to be porn

[–] [email protected] 6 points 1 day ago (11 children)

I don't need to. It's just the way gen AI works. It takes images of things it knows and then generates NEW content based on what it thinks you want with your prompts.

If I'm looking for an infant flying an airplane, gen AI knows what a pilot looks like and what a child looks like and it creates something new.

Also, kids' face data doesn't mean they take the actual face of the actual child and paste it on a body. It might take an eyebrow and a freckle from one kid, a hairstyle from another, and eyes from someone else.

Lastly, the kids' parents consented when they uploaded images of their kids to social media.

[–] [email protected] 3 points 1 day ago (10 children)

If you think that AI is only trained on legal images, I can't convince you otherwise.

[–] [email protected] -3 points 1 day ago (1 children)

What AI are you talking about? Are you suggesting the commercial models from OpenAI are trained using CP? Or just that there are some models out there that were trained using CP? Because yeah, anyone can create a model at home and train it with whatever. But suggesting that OpenAI has a DB of tagged CP is a different story.

[–] [email protected] 5 points 1 day ago (1 children)

OpenAI just scours the Internet. 100% chance it's come across something illegal and horrible. They don't pre-approve its training data.

[–] [email protected] -1 points 1 day ago (1 children)

But you have to describe it. It doesn't just suck in images at random. I imagine someone will remove CP when the images are reviewed. Or do you think they just download all images and add them to the training set without even looking at them?

[–] [email protected] 1 points 18 hours ago (1 children)

I think that's exactly what they do. Curation at the quantities that they're working at would require an army.

[–] [email protected] 1 points 9 hours ago

So you think that to train AI you just show it random images without describing what they represent, and AI just magically learns? If I then ask AI to create an image of a computer, how does it know what a computer is? Does it just learn this on its own from all the random images?
