this post was submitted on 22 Dec 2024
1469 points (97.6% liked)
Technology
As per TorrentFreak
Should be easy to defend against, outright trivial: OpenAI, just tell us what those Books1 and Books2 databases are. Tell us where you got them from, and which licensing contracts you signed with publishers to get access to such a gigantic library. No need to divulge details, just give us information that makes it believable that you licensed them.
...crickets. They pirated the lot of it, otherwise they would already have gotten that case thrown out. It's US startup culture, plain and simple: "move fast and break laws", get lots of money, and use that money to pay the best lawyers to abuse the shit out of the US court system.
For OpenAI, I really wouldn't be surprised if that happened to be the case, considering they still call themselves "OpenAI" despite having the most censored and closed-source AI models on the market.
But my comment was aimed more at AI models in general. If you assume they did use material that was not publicly posted or gathered, and did so directly themselves, they would indeed have no defense. Unfortunately, if a third party provided them the data, and did so under false pretenses, it would likely let them off the hook legally, even if they had every ethical obligation to make sure it was publicly available. The third party that provided it would be the one infringing.
If that assumption turns out to be true (maybe through some kind of discovery in the trial), they should burn for that. Until then, even if it's a justified assumption, it's still an assumption, and most likely not true for most models, certainly not those trained recently.