this post was submitted on 16 Mar 2025
492 points (97.7% liked)

[–] Zink 17 points 19 hours ago (2 children)

What I’m hearing between the lines here is the origin of a legal “argument.”

If a person’s mind is allowed to read copyrighted works, remember them, be inspired by them, and describe them to others, then surely a different type of “person’s” different type of “mind” must be allowed to do the same thing!

After all, corporations are people, right? Especially any worth trillions of dollars! They are more worthy as people than meatbags worth mere billions!

[–] [email protected] 4 points 14 hours ago* (last edited 14 hours ago) (1 children)

I don't think it's actually such a bad argument, because to reject it you basically have to say that style should fall under copyright protections, at least conditionally, which is absurd and has obvious dystopian implications. This isn't what copyright was meant for. People want AI banned or inhibited for separate reasons and hope the copyright argument is a path to that, but even if it succeeded it wouldn't actually change much, except to make the other large corporations that own most copyrights into stakeholders of AI systems. That's not really a better circumstance.

[–] [email protected] 3 points 12 hours ago* (last edited 12 hours ago) (1 children)

Actually I would just make the guard rails such that if the input can’t be copyrighted then the AI output can’t be copyrighted either. Making anything it touches public domain would rein in the corporations’ enthusiasm for using it to replace humans.

[–] [email protected] 3 points 11 hours ago* (last edited 11 hours ago)

I think they would still try to go for it but yeah that option sounds good to me tbh

[–] [email protected] 6 points 19 hours ago* (last edited 18 hours ago) (2 children)

This has been the legal basis claimed for AI training sets since developers began collecting datasets. The US Copyright Office heard these arguments in 2023: https://www.copyright.gov/ai/listening-sessions.html

MR. LEVEY: Hi there. I'm Curt Levey, President of the Committee for Justice. We're a nonprofit that focuses on a variety of legal and policy issues, including intellectual property, AI, tech policy. There certainly are a number of very interesting questions about AI and copyright. I'd like to focus on one of them, which is the intersection of AI and copyright infringement, which some of the other panelists have already alluded to.

That issue is at the forefront given recent high-profile lawsuits claiming that generative AI, such as DALL-E 2 or Stable Diffusion, are infringing by training their AI models on a set of copyrighted images, such as those owned by Getty Images, one of the plaintiffs in these suits. And I must admit there's some tension in what I think about the issue at the heart of these lawsuits. I and the Committee for Justice favor strong protection for creatives because that's the best way to encourage creativity and innovation.

But, at the same time, I was an AI scientist long ago in the 1990s before I was an attorney, and I have a lot of experience in how AI, that is, the neural networks at the heart of AI, learn from very large numbers of examples, and at a deep level, it's analogous to how human creators learn from a lifetime of examples. And we don't call that infringement when a human does it, so it's hard for me to conclude that it's infringement when done by AI.

Now some might say, why should we analogize to humans? And I would say, for one, we should be intellectually consistent about how we analyze copyright. And number two, I think it's better to borrow from precedents we know that assumed human authorship than to invent the wheel over again for AI. And, look, neither human nor machine learning depends on retaining specific examples that they learn from.

So the lawsuits that I'm alluding to argue that infringement springs from temporary copies made during learning. And I think my number one takeaway would be, like it or not, a distinction between man and machine based on temporary storage will ultimately fail maybe not now but in the near future. Not only are there relatively weak legal arguments in terms of temporary copies, the precedent on that, more importantly, temporary storage of training examples is the easiest way to train an AI model, but it's not fundamentally required and it's not fundamentally different from what humans do, and I'll get into that more later if time permits.

The "temporary storage" idea is pretty central for visual models like Midjourney or DALL-E, whose training sets are full of copyrighted works lol. There is a legal basis for temporary storage too:

The "Ephemeral Copy" Exception (17 U.S.C. § 112 & § 117)

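U.S. copyright law allows certain temporary copies made as part of a technological process, in limited circumstances.
Section 117 lets the owner of a copy of a computer program make a further copy when that is an essential step in running the program.
Section 112 permits transmitting organizations, such as broadcasters, to make ephemeral recordings to facilitate a licensed transmission.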
[–] [email protected] 2 points 12 hours ago (1 children)

Based on this, can I use ChatGPT to recreate the Coca-Cola recipe?

[–] [email protected] 2 points 11 hours ago* (last edited 11 hours ago)

Copyright law doesn't cover recipes as bare lists of ingredients - the Coca-Cola formula is protected as a "trade secret" instead. But the approximate recipe for Coca-Cola is well known and can be googled.

[–] [email protected] 3 points 15 hours ago

BTW, if anyone was interested - many visual models use the same training set, collected by a German non-profit: https://laion.ai/

It's "technically not copyright infringement" because the set is just a list of links to images, each paired with a text description of the image. Because they're just pointing to the image rather than hosting a copy, the argument is that the dataset itself doesn't infringe any copyright.
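
For anyone curious what "just a link plus a description" looks like in practice, here's a rough sketch in Python. The class name, field names, and example.com URLs are made up for illustration and are not LAION's actual schema. The point is only that a record like this holds no image data, so whoever trains on it still has to download the images themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkCaptionPair:
    """Hypothetical record: a pointer to an image plus its description."""
    url: str      # where the image lives; the image bytes are not stored here
    caption: str  # text description paired with that image


def urls_to_fetch(records):
    """Return the URLs a trainer would still need to download on its own."""
    return [r.url for r in records]


# Made-up sample entries (example.com is a placeholder domain).
sample = [
    LinkCaptionPair("https://example.com/cat.jpg", "a cat sitting on a sofa"),
    LinkCaptionPair("https://example.com/dog.jpg", "a dog running on a beach"),
]

print(urls_to_fetch(sample))
```

Whether downloading those images for training counts as a protected temporary copy is exactly the question the comments above are arguing about; the sketch only shows what the dataset itself contains.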