this post was submitted on 20 Oct 2024

507 points (99.8% liked)

Technology

Internet Archive breached again through stolen access tokens (www.bleepingcomputer.com)

submitted 1 month ago by [email protected] to c/[email protected]

63 comments

[–] [email protected] 107 points 1 month ago (1 children)

I guess this is an attempt to discredit them.

After working at many, many companies, security is usually very bad. This is typical. Not changing access tokens is also very common.

[–] [email protected] 26 points 1 month ago (3 children)

Discrediting someone usually has a goal of pushing customers to another source though. There is no other source of this information, so what would be the point?

[–] [email protected] 108 points 1 month ago* (last edited 1 month ago) (5 children)

Destroy a source of historical documents so that the past can be contested. Sow doubt, confusion, deniability. Hide evidence of past crimes, or inconvenient documents. Plant documents, etc.

[–] [email protected] 20 points 1 month ago (1 children)

Now we are talking.

[–] [email protected] 2 points 1 month ago

I really hate that reddit slang but username checks out

[–] [email protected] 9 points 1 month ago

Russians banned it, Russian hackers are trying to destroy it. At least it's consistent.

[–] [email protected] 8 points 1 month ago

He who controls the past controls the future, he who controls the present controls the past.

[–] [email protected] 2 points 1 month ago (1 children)

Sow doubt. As in spreading it like seeds to take root and grow. 100% in agreement with you, just being a grammar Nazi. Carry on.

[–] [email protected] 18 points 1 month ago

Generating turmoil just prior to the USA election maybe?

[–] [email protected] 81 points 1 month ago (2 children)

Okay, enough is enough. The Internet Archive is both essential infrastructure and an irreplaceable historical record; it cannot be allowed to fall. Rather than just hoping the Archive can defend itself, I say it's time to hunt down and counterattack the scum perpetrating this!

[–] [email protected] 37 points 1 month ago* (last edited 1 month ago) (3 children)

Lol you're gonna pull that thread and at the end of the sweater is gonna be the CIA or Russia.

Edit: in = is

[–] [email protected] 14 points 1 month ago

Did I stutter?

[–] [email protected] 15 points 1 month ago

Where are Anonymous and the 4chan autists? They should attack these assholes. Attacking the Internet Archive is like kicking a kitten. Everyone will hate you for it.

[–] [email protected] 80 points 1 month ago (3 children)

Why are people fucking with the Internet Archive? Who benefits?

[–] [email protected] 51 points 1 month ago* (last edited 1 month ago) (6 children)

People use Archive links to avoid giving sites traffic.

This is a problem for advertisers and media corps.

Not saying they're the ones doing this, but they'd definitely benefit.

[–] [email protected] 6 points 1 month ago

Wouldn't put it past them...

[–] [email protected] 4 points 1 month ago

I've enjoyed using Wayback Machine on journalistic articles where they try to retcon information, but the original copy had already been captured. The Ministry of Truth hates archive.org.

[–] [email protected] 3 points 1 month ago

Someone else looked into the group claiming responsibility for this. It's a pro-Palestinian Russian group.

[–] [email protected] 38 points 1 month ago (4 children)

Well, right-wingers want to ban books, and services like IA make that harder since they provide easy access to download or digitally borrow those books. It's harder to deny people access to books they can find online. Of course, there are other ways people can still obtain those books, and IA isn't the only one, but it's the easiest and the most convenient.

[–] [email protected] 14 points 1 month ago

[–] [email protected] 54 points 1 month ago (1 children)

We need IA full mirrors. This is too critical to leave to this one company.

[–] [email protected] 40 points 1 month ago (2 children)

Knowing the folks at IA, I'm sure they would love a backup. They would love a community. I'm sure they don't want to be the only ones doing this. But dang, they've got like 99 petabytes of data. I don't know about you, but my NAS doesn't have that lying around...

[–] el_abuelo 11 points 1 month ago* (last edited 1 month ago) (2 children)

I wonder if someone can come up with some kind of distributed storage that isn't insanely slow. Kinda like a CDN but on personal devices. I'm thinking like SETI@HOME did with distributed compute.

Edit: this is kinda like torrents but where the contents are changing frequently.

[–] [email protected] 10 points 1 month ago (2 children)

You should look up IPFS! It's trying to be kinda like that.

It'll always be slower than a CDN, though, partly because CDNs pay big money to be that fast, but also anything p2p is always going to have some overhead while the swarm tries to find something. It's just a more complicated problem that necessarily has more layers.

But that doesn't mean it's not possible for it to be "fast enough"

[–] [email protected] 5 points 1 month ago

And there's a promising new IPFS-like system called Iroh, which should have a lot less overhead and in general just be faster than IPFS. It's not quite ready to just switch to right now, but an enterprising individual could probably make something useful with it without too much work (i.e. months, not years).

I'm using it for a distributed application project right now, but the intent is a bit different than the IA use-case.

[–] [email protected] 5 points 1 month ago

Something like torrents. Split the whole thing into small 5 GB torrents.

[–] [email protected] 3 points 1 month ago

That is an insane amount of storage. How much does it grow every year and is it stable growth or accelerating?

[–] [email protected] 47 points 1 month ago (3 children)

This again??

This time once archive.org is back online again... is it possible to get torrents of some of their popular data storage? For example I wouldn't imagine their catalog of books with expired copyright to be very big. Would love a community way to keep the data alive if something even worse happens in the future (and their track record isn't looking good now)

[–] [email protected] 16 points 1 month ago

Like this idea

[–] [email protected] 13 points 1 month ago (2 children)

Yep, that seems like the ideal decentralized solution. If all the info can be distributed via torrent, anyone with spare disk space can help back up the data and anyone with spare bandwidth can help serve it.

[–] [email protected] 8 points 1 month ago (2 children)

Most of us can't afford the sort of disk capacity they use, but it would be really cool if there were a project to give volunteers pieces of the archive so that information was spread out. Then volunteers could specify if they want to contribute a few gigabytes to multiple terabytes of drive space towards the project and the software could send out packets any time the content changes. Hmm this description sounds familiar but I can't think of what else might be doing something similar -- anyone know of anything like that that could be applied to the archive?
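For what it's worth, here is a minimal sketch of that idea, not any real project's code. It assumes volunteers pledge a capacity in GB and that each piece of the archive should live on a fixed number of volunteers. `Volunteer`, `assign_pieces`, and the `replicas` parameter are all made-up names for illustration, and the greedy "most free space first" rule is just one possible policy.

```python
from dataclasses import dataclass, field


@dataclass
class Volunteer:
    """A hypothetical volunteer who pledges some disk space."""
    name: str
    capacity_gb: int
    pieces: list = field(default_factory=list)  # (piece_id, size_gb) tuples

    @property
    def free_gb(self):
        return self.capacity_gb - sum(size for _, size in self.pieces)


def assign_pieces(pieces, volunteers, replicas=2):
    """Greedily place each piece on the `replicas` volunteers with the most free space.

    `pieces` is a list of (piece_id, size_gb) tuples.
    Returns the ids of pieces that could not be placed on enough volunteers.
    """
    unplaced = []
    for piece_id, size_gb in pieces:
        # Only volunteers with room for the whole piece are candidates.
        candidates = sorted(
            (v for v in volunteers if v.free_gb >= size_gb),
            key=lambda v: v.free_gb,
            reverse=True,
        )
        if len(candidates) < replicas:
            unplaced.append(piece_id)
            continue
        for v in candidates[:replicas]:
            v.pieces.append((piece_id, size_gb))
    return unplaced
```

Anything that comes back in the unplaced list is a piece the project would need more volunteers (or fewer replicas) for. Pushing out updates when content changes is a separate problem this sketch doesn't touch.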

[–] [email protected] 5 points 1 month ago

Yeah, the projects I've heard about that have done something like this broke it into multiples.

For example, 1000GB could be broken into forty 25GB torrents and within that, you can tell the client to only download some of the files.

At scale, a webpage can show the seed/leech numbers and averages for each torrent over a time period to give an idea of what is well mirrored and what people can shore up. You could also change which torrent is shown as the top download when people go to the contributor page and say they want to help host it, ensuring a better distribution.
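To make the arithmetic and the "what needs shoring up" part concrete, here's a rough sketch. It assumes you already have seeder counts sampled over time from somewhere (a tracker scrape, say); the function names and the shape of `seed_history` are invented for this example and don't come from any real tool.

```python
def split_into_torrents(total_gb, torrent_gb):
    """Return (index, size_gb) pairs covering total_gb in chunks of at most torrent_gb."""
    chunks = []
    remaining = total_gb
    index = 0
    while remaining > 0:
        size = min(torrent_gb, remaining)  # the last chunk may be smaller
        chunks.append((index, size))
        remaining -= size
        index += 1
    return chunks


def least_mirrored(seed_history, top_n=3):
    """Pick the torrents with the lowest average seeder count.

    `seed_history` maps a torrent index to a list of seeder counts
    sampled over some time period (hypothetical input format).
    """
    averages = {
        idx: sum(samples) / len(samples)
        for idx, samples in seed_history.items()
        if samples  # skip torrents with no samples yet
    }
    # Sort by average seeders, breaking ties by index for a stable order.
    return sorted(averages, key=lambda idx: (averages[idx], idx))[:top_n]


# 1000 GB in 25 GB torrents gives forty torrents, as in the example above.
chunks = split_into_torrents(1000, 25)
```

The contributor page could then put whatever `least_mirrored` returns at the top of the download list.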

[–] [email protected] 1 points 1 month ago (2 children)

There's an issue with torrents: only the most popular ones get replicated, and the process is manual/social.

Something like Freenet is needed, which automatically "spreads" data over machines contributing storage, but Freenet is an unreliable storage, basically like a cache where older and unwanted stuff gets erased.

So it should be something like Freenet, but possibly with some "clusters" or "communities" with a central (cryptography-enabled) authority of each being able to determine the state of some collection of data as a whole, and pick priorities. My layman's understanding is that this would be similar to something between Freenet and Ceph, LOL. More like a cluster filesystem spread over many nodes, not like cache.

[–] [email protected] 3 points 1 month ago

Anna’s Archive does this. I think it's a really good way to make it difficult to take them down.

Hopefully this hack starts some conversations on how they can ensure longevity for their project. Seems they’re being attacked on multiple fronts now.

[–] [email protected] 44 points 1 month ago (1 children)

The majority of Reddit discourse on this is wild. The crowd there is going HARD to try and paint IA in the most negative light possible.

I know we don't like Reddit here, but for example: https://www.reddit.com/r/DataHoarder/comments/1g7w0rh/internet_archive_issues_continue_this_time_with/

It's almost as if the "hackers" and/or copyright holders are running that conversation.

[–] [email protected] 26 points 1 month ago

Since it's Reddit, I would guess copyright sockpuppets are steering the narrative to help damage them further.

[–] [email protected] 29 points 1 month ago

Not this crap again

[–] [email protected] 20 points 1 month ago

Wtf

[–] [email protected] 10 points 1 month ago* (last edited 1 month ago) (2 children)

Apparently, BlackMeta is behind the DDoS attack on the Internet Archive. Apparently they are pro-Palestine hacktivists, and their X account also has some Russian text in it.

(Edit) Also, the Internet Archive has been banned in China since 2012 and in Russia since 2015.

[–] [email protected] 57 points 1 month ago (4 children)

Yes, they are a "pro-Palestine" Russian-based hacker group... Nothing funny going on here, no sir

[–] [email protected] 4 points 1 month ago* (last edited 1 month ago)

Definitely not their genocidal neighbors terrorizing as usual. /s

[–] [email protected] 6 points 1 month ago

Quick question for those more in the know: Have these events disrupted IA's ability to archive pages? I ask because I was recently talking with a security guy about a novel piece of malware that used a hacked webpage for command injection. One possible motive that came to mind, if archiving was disrupted, would be to cover tracks for similar malware: inject code, perform the malicious activity, revert, and then there's more time before the control code is discovered.

[–] [email protected] 2 points 1 month ago

Hope they had a backup
