Technology
This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.
Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.
Rules:
1: All Lemmy rules apply
2: Do not post low effort posts
3: NEVER post naziped*gore stuff
4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.
5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting a wide range of people)
6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist
7: crypto related posts, unless essential, are disallowed
StackOverflow: *makes money by monetizing massive amounts of user-contributed content without consulting or compensating the users in any way*
Users: *try to delete it all to prevent it*
StackOverflow: *your contributions belong to the community, you can't do that*
Pretty fucked-up laws. A lot of lawsuits going on right now against AI companies for similar issues. In this case, StackOverflow is entitled to be compensated for its partnership, and because the answers are all CC BY-SA 3.0, no one can complain. Now, that SA? Whatever.
That SA part needs to be tested in court against the AI models themselves
A lot of this shittiness would probably go away if there were a risk that ingesting certain content would mean you need to release the actual model to the public.
Yeah, their assumption though is you don't? Neither attribution nor sharealike, not even full-on all-rights-reserved copyright is being respected. Anything public goes and if questions are asked it's "fair use". If the user retains CC BY-SA over their content, why is giving a bunch of money to StackOverflow entitling OpenAI to use it all under whatever terms they settled on? Boggles me.
Now, say, Reddit's Terms of Service state clearly that by submitting content you grant them "a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness (...) in all media formats and channels now known or later developed anywhere in the world." Speaks volumes on why alternatives (like Lemmy) to these platforms matter.
That's interesting. I was looking up "Lemmy Terms of Service" for comparison after getting that quote from the Reddit ToS and could not find anything for Lemmy.ml. Now after you mentioned it, looking on my Mastodon instance, nothing either, just a privacy policy. That is indeed kinda weird. Some instances do have their own ToS though. At least something stating a sublicense for distribution should be there for protection of people running instances in locations where it's relevant.
I mean, how do you do that for a closed-source model with secretive training data? As far as I know, OpenAI has admitted to using large amounts of copyrighted content, countless books, newspaper material, all on the basis of fair use claims. Guess it would take a government entity actively going after them at this point.
Thank you for sharing. Your perspective broadens mine, but I feel a lot more negative about the whole "must benefit business" side of things. It is fruitless to hold any entity whatsoever accountable when a whole worldwide economy is in a free-for-all nuke-waving doom-embracing realpolitik vibe.
Frankly, not sure what would be worse: economic collapse and the consequences to the people, or economic prosperity and... the consequences to the people. Long term, and coming from a country that is not exactly thriving in the grand scheme of things, I guess I'd take the former.
Yep. Can't wait to overfit an LLM on a lot of copyrighted work and release it into the public domain. Let's see if OpenAI gets pushback from copyright owners down the road.