this post was submitted on 01 Sep 2024

329 points (91.6% liked)

Technology

Maybe It Should Be Illegal To Instantly Delete A Website's Archives - Aftermath (aftermath.site)

submitted 2 months ago by [email protected] to c/[email protected]

[–] [email protected] 41 points 2 months ago (4 children)

That being said, if a third party, like the Internet Archive, wants to archive it they should have every right.

[–] [email protected] 16 points 2 months ago (1 children)

Maybe for sites from corporations or similar sources. But people should always have the right to be forgotten. And in fact, in some countries they do have this right.

[–] [email protected] 23 points 2 months ago

Right to be forgotten is about personally identifiable information. Other work is covered under copyright, which means that if someone has legally obtained a copy of it, then as long as they're not distributing it, it's their right to do whatever the fuck they want with it. They can even hold it until the copyright expires, at which point they can publish it as much as they want.

[–] [email protected] 5 points 2 months ago (1 children)

A "Library of Congress" for published web content, maybe. Some sort of standard that allows / requires websites that publish content on public-facing sites to also share a permanent copy with an archive, without the archive having to scrape it.

Sort of like how book publishers send a copy to the LoC.

[–] [email protected] 0 points 2 months ago

I don't think requiring it is a great idea, but definitely making it a standard that you can follow if you want would be very cool.

[–] [email protected] -3 points 2 months ago* (last edited 2 months ago) (2 children)

This is just like AI scraping

Edit: if you allow a third party to "archive" your content, the ship has sailed. I'm not advocating for or against anything, but once your stuff is scraped (by anyone) it's gone.

[–] [email protected] 3 points 2 months ago

Yes except AI companies are making mad cheddar.

[–] [email protected] 2 points 2 months ago (1 children)

Not really. If the archive decides to publish your work, that's copyright infringement. If an AI company decides to scrape your content and develop an AI with your content, I would argue that that's a derivative work, which is also protected by copyright.

[–] [email protected] 0 points 2 months ago* (last edited 2 months ago) (1 children)

I'm not discussing what they do with it, I'm discussing the raw act of ingesting your page.

Cats and bags

To venture into opinion, I think there shouldn't be "every right" to archive your page, for any purposes such as archive or ai or whatever.

Edit: but I acknowledge how the open internet works and the futility of trying to control that.

[–] [email protected] 2 points 2 months ago (1 children)

It seems like a very dangerous, very slippery slope. The first people to abuse this would be the big corporations who want to hide and cover up as much as they possibly can. I think the copyright law framework, which I outlined in my response above, is a useful lens to view this through.

[–] [email protected] 1 points 2 months ago (1 children)

Totally get what you're saying, but I'm highlighting that the mechanical step of a third party having "every right" to scrape or persist your content is in complete contrast to the other points in this thread about rights to be forgotten and so on.

[–] [email protected] 2 points 2 months ago (1 children)

Right to be forgotten is specifically for personally identifiable information. And I'm pretty sure it's sound on copyright grounds as long as you don't distribute. And honestly, I don't really see a problem with it.

[–] [email protected] 0 points 2 months ago* (last edited 2 months ago) (1 children)

And if you've made a personal website, say, with a blog of your valuable ideas/art (valuable to you, or anyone, arbitrarily), the ability to erase your site represents forgetting. The whole site may contain your PII throughout.

Any scraping or archiving techniques degrade that right.

[–] [email protected] 1 points 2 months ago

You have a right to be forgotten. Your ideas and the work you create do not.

[–] [email protected] -3 points 2 months ago* (last edited 2 months ago) (2 children)

I'm not sure if I can agree with that. A third party cannot simply override the rights of the owner. If I want my website gone, I want it gone from everywhere. No exception.

That kinda also goes in the whole "Right to be forgotten" direction. I have absolute sovereignty over my data. This includes websites created by me.

[–] [email protected] 9 points 2 months ago (3 children)

Yes they can, otherwise Disney could decide that you're no longer allowed to have that DVD you bought 10 years ago and that you must destroy it.

Right to be forgotten is bullshit, not from an ideological standpoint, but purely from a practical standpoint: the old rule of "once it's on the internet, it's on the internet forever" stands true. That's not even getting started on the fact that right to be forgotten is about your personal information, not any material you may publish that is outside of that.

[–] [email protected] 8 points 2 months ago (1 children)

Disney can decide to terminate that license but the disc is another story. The license is for the media on the disc but the physical disc itself is owned by the person who bought it. This is literally why a company can remove a show or movie or song from your digital library. The license holder can always revoke the license. It was harder to enforce with physical media (and cost prohibitive in a lot of cases), but still possible.

[–] [email protected] 4 points 2 months ago (1 children)

No, they can't. Google "first sale doctrine".

They can remove shit from your digital library because on page 76 of the terms and conditions that you didn't read, they redefined the word "purchase" to mean "temporarily rent".

[–] [email protected] 1 points 2 months ago* (last edited 2 months ago)

It's the same licensing agreement. I phrased what I said to specifically adhere to what they say in their own terms of use in accordance with FCC regulation.

https://disneytermsofuse.com/english/

If you were to, say in 1990, get caught broadcasting your copy of a Disney movie without the legal ability to do so, they could absolutely use the court system to revoke your right to the licensed copy of that media and have it confiscated.

[–] [email protected] 4 points 2 months ago* (last edited 2 months ago) (1 children)

No. When you purchase the dvd you become the owner of that specific disc... you never gained ownership of my website just because you visited and copied my content.

[–] [email protected] -1 points 2 months ago (2 children)

Yes, and when I archived your website, I became the owner of that specific copy of your website.

[–] [email protected] 4 points 2 months ago (2 children)

No, I never granted you any ownership of my content. Period. You didn't pay me, you didn't engage in any contract with me.

Simply archiving my stuff and running away then publishing it as your own is theft.

[–] [email protected] 5 points 2 months ago (2 children)

You've put it out there for free, though, and the data literally ends up on my machine because you made it do that, so what's the problem with me saving the data on my machine for later, and potentially sharing it elsewhere for free again?

then publishing it as your own is theft

This scenario (misattribution of content) has nothing to do with the previous discussion. The other commenter is making an analogy to CDs, owning a CD and lending it to others doesn't mean you're claiming its content is your own creation.
Theft implies deprivation of ownership. Calling this theft is like calling piracy theft. It may be illegal by this or that metric, but it's not normal theft.

[–] [email protected] 3 points 2 months ago

Well the whole premise of their argument is flawed because they're basing it on the fact of redistribution. If I'm not redistributing it, then the whole argument of that falls away entirely. Under fair use, I believe you're also allowed to make copies of things for research purposes, so I'd argue that's what an archive is.

[–] [email protected] -1 points 2 months ago (1 children)

You’ve put it out there for free

Irrelevant. It's still my content that I have sole rights to. If I want to share it to individuals I can do that if I please. You don't have any rights to do anything else with it.

and the data literally ends up on my machine because you made it do that

Incorrect. Your browser made it do that. How that data is accessed and displayed is not controlled by me. Case in point: you can have extensions on your browser that change how my websites are rendered.

That doesn't give you a right to replicate my content elsewhere.

and potentially sharing it elsewhere for free again?

Because it's not yours? And publishing it again elsewhere is effectively you claiming it is yours. Especially if published without attribution.

You guys can't have this both ways. If an artist makes a painting... and posts a picture of it. They have no rights to the painting anymore? They deserve no ownership/pay for what they've done? If a news story is published... They have no rights to sell that story to another publisher just because you can copy and paste the text? This is absurd logic. My website has/had a cost. I bore it. I have sole rights to that content.

This scenario (misattribution of content) has nothing to do with the previous discussion. The other commenter is making an analogy to CDs, owning a CD and lending it to others doesn’t mean you’re claiming its content is your own creation.

No, this has to do with rights of the content. Owning the CD grants you a license to the content on that CD. That's about as good as ownership gets there. They own the CD/license. As long as that CD exists/works. You don't gain that same right by simply visiting a website.

Theft implies deprivation of ownership. Calling this theft is like calling piracy theft. It may be illegal by this or that metric, but it’s not normal theft.

No it doesn't. Taking content and using it in an unauthorized way while gaining money or some other consideration is also theft. Wayback Machine and other archives are paid for somehow. If some content being on a site swayed someone to make a donation to that archive site, then that value should have gone to the original creator. That is theft. This is the core of most of the current lawsuits. Although they often equate this to "potential and future earnings", which is bullshit because oftentimes that content would never have been viewed at whatever cost they ascribed.

[–] [email protected] 4 points 2 months ago (1 children)

You don’t have any rights to do anything else with it.

That's patently false. At a minimum, I can quote parts of your content, just as you can quote smaller portions of any published text anywhere, you don't have to ask the publisher or author for permission. It's also ridiculous and impossible to control, the content is on my private machine already, how can any law be relevant or exerted upon what I do there? I doubt you're writing this comment on the basis of your knowledge of copyright law.

Incorrect. Your browser made it do that. How that data is accessed and displayed is not controlled by me.

You're arguing semantics that really don't make any difference. The display is irrelevant, because the data by itself is stored on my computer before it is displayed. That data is what you've put up online to be accessed.

Owning the CD grants you a license to the content on that CD. That’s about as good as ownership gets there. They own the CD/license. As long as that CD exists/works. You don’t gain that same right by simply visiting a website.

I fail to see the difference between getting a CD with some data (buying it or being given for free, as e.g. a gift) and being sent some data online for free. More importantly - says who? Does copyright law say this about websites?

If an artist makes a painting… and posts a picture of it. They have no rights to the painting anymore? They deserve no ownership/pay for what they’ve done?

This simply doesn't follow from what I've written. They certainly retain the rights to the painting. Besides, "deserving pay" depends on completely different factors than the ones we're discussing, usually artists sell the actual object, the painting. A digital reproduction is, as far as most people care (I think), merely an informative reproduction, and not the real thing. Stuff that's posted online for free is... free. It wasn't intended to be made money with directly.

Your final paragraph is really confusing me, you seem to be saying that Wayback Machine is also committing theft, which I'm pretty sure is not true (I've followed the lawsuits against IA for a while and don't remember anyone invoking that term). And at this point I don't know what "theft" is even supposed to mean to you or to anyone else, and what was the point of the discussion anyway. Maybe I should reread the whole discussion carefully all over again, but I'm on my phone and it's all giving me a headache.

[–] [email protected] -1 points 2 months ago (1 children)

the content is on my private machine already, how can any law be relevant or exerted upon what I do there?

So child porn is okay then? You would already have it on your system and got it for free on your private machine!

I doubt you’re writing this comment on the basis of your knowledge of copyright law.

I doubt you are either. Yet we're both here.

you seem to be saying that Wayback Machine is also committing theft

It does... on paper... A lot. https://time.com/6266147/internet-archive-copyright-infringement-books-lawsuit/ To the point it's losing lawsuits over exactly that.

[–] [email protected] 2 points 2 months ago* (last edited 2 months ago) (1 children)

So child porn is okay then? You would already have it on your system

You'd have to look for it, knowing full well that it is illegal to produce in the first place and distribute to others, access it online, and then deliberately retain it. It's not really the same as something that's legal to produce and distribute (it is certainly legal for me to view your site). You wouldn't "already" have it.

I doubt you are either.

Well I've read some copyright laws, had to solve some issues regarding usage of copyrighted works, etc. Nothing that makes me an expert, but I'm not talking wholly out of my ass either.

It does… on paper… A lot. https://time.com/6266147/internet-archive-copyright-infringement-books-lawsuit/ To the point it’s losing lawsuits over exactly that.

That's not Wayback Machine per se, that's Internet Archive's book scanning and "digital lending" system, which was most definitely doing legally questionable (and stupid) things even to an amateur eye. However, Wayback Machine making read-only copies of websites has for now never been disputed successfully.

[–] [email protected] 2 points 2 months ago

You wouldn’t “already” have it.

You've missed the point. Simply having something on your hard drive is already something the law does care about. It simply depends on the something.

Well I’ve read some copyright laws

So have I. Because I had access to an exception under it in my prior job. Seems like we're still on the same page here. Not sure why you'd feel the need to call out someone else's knowledge on a topic that you have no idea about.

However, Wayback Machine making read-only copies of websites has for now never been disputed successfully.

Except it has. That's why administrators can exclude domains from it. DMCA notices also can yield complete removals.

[–] [email protected] 3 points 2 months ago* (last edited 2 months ago) (1 children)

Copyright only protects distribution and derivative works. I can keep a copy of it on my local machine for as long as I want. Theoretically I can keep it until the copyright expires and then I can do whatever the fuck I want with it.

[–] [email protected] -2 points 2 months ago (1 children)

I can keep it until the copyright expires and then I can do whatever the fuck I want with it.

General copyright is 70 years. So no. You couldn't do whatever you wanted with it, as the computer you're using would be long dead... and possibly you'd even be long dead. Replicating the content to another device without the owner's consent could and likely would be a violation of that same copyright.

[–] [email protected] 2 points 2 months ago

Replicating a personal backup to another device is covered by fair use. Only distribution and derivative works are covered by copyright.

And yes, the length of copyright is way too long. I reckon it should be the same as patents, 20 years. Or let it be as long as the warranty and let the big companies duke it out with each other.

[–] [email protected] 4 points 2 months ago (2 children)

I'd better never see you bitching about AI scraping your content. I'll remind you of this very comment.

[–] [email protected] 3 points 2 months ago

For what it's worth, I agree with the other commenter and, as much as I dislike AI as it currently is, I have never and probably never will bitch about the scraping. If I put things out there online, I am aware that they may be used in ways that I never intended. That's how it has always been, after all.

[–] [email protected] 2 points 2 months ago

I would argue that AI is a derivative work and that is protected by copyright. Archiving a copy of something and keeping it for personal use is not derivative work and not distribution and that's not protected by copyright.

[–] [email protected] 0 points 2 months ago (1 children)

You're comparing entirely different things here. I'm talking about a website I own, not a product I sell. And no, this "on the internet forever" is complete and utter nonsense that was never true to begin with. The amount of stuff lost to time easily dwarfs what's still around.

[–] [email protected] 1 points 2 months ago (1 children)

You chose to distribute said website to everyone on the internet. I chose to exercise my rights of fair use to make a local convenience copy of said website. I can then theoretically hold said local convenience copy for as long as I want, until your copyright expires, at which point I can publish it.

It's a bold assumption that that data is not just sitting on someone's hard drive somewhere.

[–] [email protected] 1 points 2 months ago* (last edited 2 months ago)

You are moving the goalposts. Again. The talk was about the Internet Archive providing a copy of my website to the public. Not you storing it somewhere on your drive for personal use. Although that's also a rather tricky legal matter.

But nice of you to agree with the rest. Yes, you could at one point publish a copy. 70 years after my death, and not a second before that. And only if it's not specifically protected because it contains personal information. I think the protection is not limited in that case.

[–] [email protected] 3 points 2 months ago* (last edited 2 months ago) (1 children)

Information doesn't have "owners." It only has -- at most -- "copyright holders," who are being allowed to temporarily borrow control of it from the Public Domain.

[–] [email protected] 3 points 2 months ago

Imagine the absolute historical clusterfuck if terrible politicians and bad actors could just delete entire portions of their history.