this post was submitted on 10 May 2025
Technology
If they are training the AI with copyrighted data that they aren't paying for, then yes, they are doing the same thing as traditional media piracy. While I think piracy laws have been grossly blown out of proportion by entities such as the RIAA and MPAA, these AI companies shouldn't get a pass for doing what Joe Schmoe would get fined thousands of dollars for on a smaller scale.
The act of copying the data without paying for it (assuming it's something you need to pay for to get a copy of) is piracy, yes. But the training of an AI is not piracy because no copying takes place.
A lot of people have a very vague, nebulous concept of what copyright is all about. It isn't a generalized "you should be able to get money whenever anyone does anything with something you thought of" law. It's all about making and distributing copies of the data.
Where the training data comes from seems like the main issue, rather than the training itself. Copying has to take place somewhere for that data to exist. I'm no fan of the current IP regime, but it seems like an obvious problem if you get caught making money with terabytes of content you don't have a license for.
The slippery slope here is that you, as an artist, hear music on the radio, in movies and TV, and in commercials. All of that listening is training your brain. If an AI company just plugged in an FM radio and learned from that music, I'm sure a lawsuit over it could start us toward a world where no one can listen to anyone's music without being tainted.
That feels categorically different unless AI has legal standing as a person. We're talking about training LLMs; nothing more is going on here than people using computers.