this post was submitted on 12 Jul 2024
564 points (98.3% liked)
Technology
Closing the door behind the ones that already did it means only the current groups that have the data will make money off it.
Is it copyrighted content?
This regulation (and a similar one being proposed in California) would not be applied retroactively.
Never mentioned any sort of retroactive measures.
Since no retroactive measures are mentioned, the companies that already scraped the web won't be stopped from continuing to use the AI models already trained on that data, but anyone else would be stopped by the law.
It is like making it illegal to rob banks after someone already robbed all the banks and letting them keep all the money.
The law could have made it illegal to use models trained on copyrighted materials without permission, instead of targeting the process of collecting them.
Downvote all you want. If your entire business or personal model includes stealing content from other people, then you need to rethink that.
"stealing" implies the owner does not have it anymore... It is large studio speak.
And I get what you are trying to say, I just think the copyright system is so broken that this shows it is in need of reform. Because if the qualm is with people doing immoral shit as a business model, there are long lists of corporations that will ask you to hold their beer.
And the fact that the training of the models already occurred on these materials means that the owners of the current models are probably training on generated datasets, meaning that by the time this actually hits court, the datasets with original copyrighted materials will be obsolete.
Regarding obsolete models, that's only partially true. There's loads of content that is effectively "finished" and won't be changing, and it will grow obsolete at a fairly slow pace, meaning it'll be useful in the models for years once trained.
Obviously new technology and similar ideas/content that didn't exist when the model was created won't be there, but the amount that changes and/or is new each year is relatively small compared to all the historical content.
Well that's a well articulated reply.
I don't understand why you would take this position. The small artists will never be able to avoid being included in training sets, and if they are, what are they going to do against a VC-backed corp like OpenAI? All the while the big copyright "owners" will be excluded, meaning this only cements the position of the mega corps.
Comment breaks community standards.
OpenAI exec: oh shit, damn. Damn. I gotta call my mom.
Stealing: depriving you of what you own
Copying: taking a picture of what you made.
Stealing is not copying. You still have whatever you started with.