this post was submitted on 12 Jun 2024

Maven, a new social network backed by OpenAI's Sam Altman, found itself in a controversy today when it imported a huge amount of posts and profiles from the Fediverse, and then ran AI analysis to alter the content.

[–] verstra 13 points 5 months ago (5 children)

Oh shit, the persona guy was right! We should all be adding a license to our comments, so that nobody could legally train models on them that are then used for commercial purposes.

[–] [email protected] 18 points 5 months ago (2 children)

The easiest way is a sitewide NoAI meta tag, since it’s the current standard. Researchers are much more likely to respect a common standard and extremely unlikely to respect a single user’s personal solution adding a link to their comments.
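For anyone wondering what that tag looks like in practice, here is a minimal sketch. The NoAI convention is usually written as a robots meta tag with the values `noai` and `noimageai`. The helper name `add_noai_meta` is made up for illustration, and this is not how Lemmy or any other instance software actually does it; it only shows the tag being placed in a page's head. Crawlers honor it voluntarily.

```python
import re

# The commonly cited form of the NoAI opt-out tag. Honoring it is voluntary
# for crawlers; nothing here enforces it.
NOAI_META = '<meta name="robots" content="noai, noimageai">'


def add_noai_meta(html: str) -> str:
    """Return html with the NoAI meta tag inserted right after <head>.

    add_noai_meta is a hypothetical helper, for illustration only.
    """
    if NOAI_META.lower() in html.lower():
        return html  # already tagged, leave the page alone
    # Match the opening <head> tag, with or without attributes.
    match = re.search(r"<head\b[^>]*>", html, flags=re.IGNORECASE)
    if match is None:
        return html  # no <head> to attach the tag to
    return html[: match.end()] + "\n  " + NOAI_META + html[match.end():]


page = "<html><head><title>My instance</title></head><body></body></html>"
print(add_noai_meta(page))
```

On a real instance the tag would more likely go into the frontend's HTML template once, rather than being patched into each page like this.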

[–] [email protected] 6 points 5 months ago

This is the only way I see it being acceptable. How do we add this to instances?

[–] [email protected] 4 points 5 months ago (1 children)

I feel like the downside is that, while researchers will mostly respect this, companies that want to make money from the data will secretly keep using it anyway. I'm more OK with the data being used for non-profit research than for making money, but this would likely have the opposite effect.

[–] [email protected] 1 points 5 months ago (1 children)

If that’s truly the case, nothing on earth can protect your data.

That being said, large corporations are far more liable to consumer protection lawsuits, especially in areas like the EU.

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago)

They also have enough lawyer power to find loopholes. Stuff like: if your main compute cluster is in xyz state or on xyz islands, then you can get away with a fine that is a fraction of what you can make with this data.

[–] [email protected] 7 points 5 months ago (1 children)
[–] onlinepersona 7 points 5 months ago* (last edited 5 months ago)

Thanks for linking me 🙏 The makers of Maven probably set off a bomb now and people might ask for anti-AI features on the clients and servers.

Anti Commercial-AI license

[–] [email protected] 4 points 5 months ago (1 children)

Yeah, they were. I hope more people start doing it. Even if it doesn't legally hold water, it's still a good way to show that Fediverse users won't stand for that.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[–] onlinepersona 1 points 5 months ago (1 children)

Why do you think it won't hold water legally? There's a case going on right now against GitHub Copilot for scraping GPL-licensed code, even spitting it back out verbatim, and not making "open" AI actually open.

Creative Commons is not a joke licence. It actually is used by artists, authors, and other creative types.

Imagine Maven or another company doing the same shit they just did and it coming to light that there was a bunch of noncommercially licensed content in there. The authors could band together for a class action lawsuit and sue their asses. Given the reaction of users here and on Mastodon, I wouldn't even be surprised if it did happen.

Anti Commercial-AI license

[–] [email protected] 1 points 5 months ago (1 children)

I mostly mention that to fend off the people who base their argument mainly on effectiveness, because that's not why I'm doing it.

I do think it could work legally if the courts want to remain consistent, but that isn't guaranteed.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[–] [email protected] 1 points 5 months ago (1 children)

Don't we also need a critical mass of people adding licenses to posts, so that a class action suit can be launched? It would be unviable, and a very rapid path to self-defeat, if people started trying to sue big corpo individually.

Also I'm missing a way to automatically add this to my posts. Something like a browser extension.

This post is licensed under CC BY-NC-SA 4.0.

[–] [email protected] 1 points 5 months ago (1 children)

Yeah, the more people the better, so it's easier to have a class action lawsuit.

Also for me I'm using a text expander so that after I type a shortcut it automatically adds the rest of the text for me.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[–] [email protected] 1 points 5 months ago (1 children)

Also for me I’m using a text expander so that after I type a shortcut it automatically adds the rest of the text for me.

I request of you, show me your ways!

[–] [email protected] 1 points 5 months ago

Well, in the Firefox/Chrome extension stores you can search for "text expander" and choose an extension that works for you.

Or if you are using a phone you can do the same on the app store and I think there should be a few options.

Once you download one of them, it should give instructions on how to use it, but in general it asks you to set a longer phrase that you want inserted automatically and a shorter phrase that gets automatically replaced with the longer one.

For example:

long phrase: The quick brown fox jumped over the moon.

short phrase: /qfox

and every time you typed /qfox it would replace it with "The quick brown fox jumped over the moon."
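If it helps to see the idea spelled out, here is a rough Python sketch of what a text expander does under the hood. The names `SHORTCUTS` and `expand` are made up for illustration; real extensions hook into the text field as you type, which this does not attempt.

```python
# Hypothetical shortcut table; the entry mirrors the example above.
SHORTCUTS = {
    "/qfox": "The quick brown fox jumped over the moon.",
}


def expand(text: str, shortcuts: dict[str, str] = SHORTCUTS) -> str:
    """Replace every shortcut found in text with its long phrase."""
    # Longest shortcuts first, so "/qfox2" is not clobbered by "/qfox".
    for short in sorted(shortcuts, key=len, reverse=True):
        text = text.replace(short, shortcuts[short])
    return text


print(expand("/qfox"))  # prints: The quick brown fox jumped over the moon.
```

The same table could hold the license line, so typing something short like `/lic` (an arbitrary choice) drops the full signature into the comment.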

Anti Commercial-AI license (CC BY-NC-SA 4.0)

[–] onlinepersona 1 points 5 months ago (1 children)

It's especially for these kinds of dumb cases where they simply copy content wholesale and boast about it. With more people licensing their content as non-commercial, the "hot water" these companies get into could be not just trivial but actually legal.

Would be great if web and mobile clients supported signatures or a "licence" field from which signatures were generated. Even better would be if people smarter than me added a feature to poison AI training data. This could also be done by a signature or some other method.
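A "licence" field like that could be quite simple on the client side. The sketch below is purely hypothetical: no existing client has this function or setting as far as this thread shows, and `with_licence_signature` is an invented name. It just appends the stored licence line to a comment before posting, and skips it if the comment already ends with it.

```python
def with_licence_signature(body: str, licence: str) -> str:
    """Append licence as a signature line unless body already ends with it.

    Both the function and the per-user licence field are hypothetical;
    no existing client API is assumed.
    """
    body = body.rstrip()
    if not licence or body.endswith(licence):
        return body
    return f"{body}\n\n{licence}"


comment = with_licence_signature(
    "Also I'm missing a way to automatically add this to my posts.",
    "This post is licensed under CC BY-NC-SA 4.0.",
)
print(comment)
```

Doing this in the client rather than by hand would also keep the wording identical across posts.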

Anti Commercial-AI license

[–] [email protected] 1 points 5 months ago

I don't know; AFAIK, Reddit successfully argued that they own Wallstreetbets' trademarks in court. That might void all of these licenses depending on the ToS of the instance being used.

[–] [email protected] 0 points 5 months ago

Lol that shit don't do shit