this post was submitted on 31 Aug 2023
Technology
It takes so much money to retrain models, though... like the entire cost all over again... and what if they find something else?
Crazy how murky the legalities are here... there's just no case law to base anything on, really.
For people who don't know how machine learning works, at a very high level:
basically, every input the AI is trained on or "sees" changes a set of weights (floating-point numbers). Once the weights have changed, you can't pull that one input back out and restore the weights to what they were; you can only keep changing them with new input.
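To make that concrete, here's a toy sketch (my own illustration, not how any real model is trained): a single weight updated by plain gradient descent on three made-up examples. The names `sgd_step`, `train`, and the data points are all hypothetical. It compares "train on everything, then try to undo one example" against "train from scratch without that example".

```python
def sgd_step(w, x, y, lr=0.1):
    # Gradient of 0.5 * (w*x - y)**2 with respect to w is (w*x - y) * x.
    grad = (w * x - y) * x
    return w - lr * grad

def train(examples, w=0.0, lr=0.1):
    # Apply one update per example, in order.
    for x, y in examples:
        w = sgd_step(w, x, y, lr)
    return w

# Three made-up (input, target) pairs; b is the one we later want "forgotten".
a, b, c = (1.0, 2.0), (2.0, 1.0), (3.0, 6.0)

w_all = train([a, b, c])        # model that has seen everything
w_without_b = train([a, c])     # what we actually want: b never seen

# Naive "undo": add back b's gradient, evaluated at the final weight.
# This is not the gradient that was applied when b was seen, and c's
# update already depended on the weight that b produced.
bx, by = b
naive_undo = w_all + 0.1 * (w_all * bx - by) * bx

print(w_all, w_without_b, naive_undo)
```

The naive undo doesn't land on the retrained-from-scratch weight, because every later update was computed from weights that the removed example had already shifted. With billions of weights and trillions of tokens, the only straightforward way to get `w_without_b` is to retrain.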
So we just let them break the law without penalty because it's hard and costly to redo the work that already broke the law? Nah, they can put time and money towards safeguards to prevent themselves from breaking the law if they want to try to make money off of this stuff.
No one has established that they've broken the law in any way, though. Authors are upset but it's unclear if they can prove they were damaged in some way or that the companies in question are even liable for anything.
Remember, the burden of proof is on the plaintiff, not these companies, if a suit is brought.
I'm European. I have a right to be forgotten.
The "safeguard" would be "no PII in training data, ever". Which is fine by me, but that's what it really means. Retraining a large model from scratch every time a GDPR request comes in is completely infeasible.
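For a rough idea of what "no PII in training data" looks like in practice, here's a minimal sketch that drops any record containing something shaped like an email address before it ever reaches training. The function name, the regex, and the sample records are assumptions for illustration; real PII filtering has to cover far more than emails (names, addresses, phone numbers, IDs) and a simple regex will miss plenty.

```python
import re

# Deliberately simple email-shaped pattern; an assumption for this sketch,
# not a complete or standards-compliant matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def drop_records_with_email(records):
    # Keep only records with no email-shaped substring.
    return [text for text in records if not EMAIL_RE.search(text)]

samples = [
    "The weights are floating-point numbers.",
    "Contact me at jane.doe@example.com for details.",
]
kept = drop_records_with_email(samples)
print(kept)
```

The point is that the filtering happens before training, since after training there's no cheap way to take a record back out.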