this post was submitted on 17 Aug 2023
That's something that can currently be done by a human and is generally considered fair use. All a language model really does is drive the cost of doing that from tens or hundreds of dollars down to pennies.
A fair use defense does not have to include noncompetition. That's just one factor in a fair use defense, and the other factors may be enough on their own.
I think it'll come down to how "the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes" and "the amount and substantiality of the portion used in relation to the copyrighted work as a whole" are interpreted by the courts. Do we judge a language model by the model itself or by its output? Can a model itself be non-infringing and still be able to produce infringing content?
The model is intended for commercial use, uses the entire work, and creates derivative works based on it that are in direct competition with it.
You are kind of hitting on one of the issues I see. The model and the works created by the model may be considered two separate things. The model may not be infringing in and of itself. It's not actually substantially similar to any of the individual training data. I don't think anyone can point to part of it and say this is a copy of a given work. But the model may be able to create works that are infringing.