datahoarder

Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread

founded 5 years ago

Downloading all archive.org metadata (lemmy.dbzer0.com)

submitted 1 month ago* (last edited 1 month ago) by [email protected] to c/[email protected]

I'd love to know if anyone's aware of a bulk metadata export feature or repository. I would like to have a copy of the metadata and .torrent files of all items.

I guess one way is to use the CLI, but that relies on already knowing which item you want, and I don't know whether there's a way to get a list of all items.

I believe downloading via BitTorrent and seeding back is a win-win: it bolsters the Archive's resilience while easing server strain. I'll be seeding the items I download.

Edit: If you want to enumerate all item names in the entire archive.org repository, take a look at https://archive.org/developers/changes.html. This will do that for you!
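Once you have a list of item identifiers, the per-item metadata and torrent locations can be derived from each identifier. A rough sketch, assuming the commonly seen URL patterns `https://archive.org/metadata/<identifier>` and `https://archive.org/download/<identifier>/<identifier>_archive.torrent`; `item_urls` is a made-up helper name, and not every item necessarily has a torrent, so treat the result as a candidate URL rather than a guarantee:

```python
from urllib.parse import quote


def item_urls(identifier: str) -> dict:
    """Build candidate metadata and torrent URLs for one item identifier.

    The URL patterns are assumptions based on how archive.org item pages
    usually look; this only builds strings and makes no network requests.
    """
    # Percent-encode the identifier so unusual characters stay URL-safe.
    safe = quote(identifier, safe="")
    return {
        "metadata": f"https://archive.org/metadata/{safe}",
        "torrent": f"https://archive.org/download/{safe}/{safe}_archive.torrent",
    }


if __name__ == "__main__":
    # "example_item" is a placeholder identifier, not a real item.
    for kind, url in item_urls("example_item").items():
        print(kind, url)
```

Feeding these URLs to a downloader with polite rate limiting is left out on purpose; that part depends on how many items you plan to fetch.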

[–] [email protected] 2 points 1 month ago (1 children)

I like to pipe my JSON to python -m json.tool for quick formatting in the terminal.
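If you are already inside a Python script, roughly the same formatting is available from the `json` module directly. A minimal sketch; `pretty` is just an illustrative name, and the 4-space indent is chosen to resemble what `json.tool` prints by default:

```python
import json


def pretty(raw: str) -> str:
    """Parse a JSON string and re-serialize it with 4-space indentation."""
    return json.dumps(json.loads(raw), indent=4)


if __name__ == "__main__":
    print(pretty('{"a": [1, 2]}'))
```

Invalid input raises `json.JSONDecodeError`, which is handy as a quick validity check too.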

[–] CHKMRK 1 points 1 month ago

Take a look at jq; it's a really nice tool for handling JSON in the terminal. There's also gron for searching JSON.
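To show why flattening makes JSON easy to search, here is a rough Python imitation of the idea behind gron: turn nested JSON into one `path = value` line per leaf so plain text search works. This is only a sketch of the concept, not gron's actual output format; `flatten` and the `json` root name are assumptions made for illustration, and keys are not quoted or escaped.

```python
import json


def flatten(value, path="json"):
    """Yield one 'path = value' line per leaf of a parsed JSON value.

    Simplification: keys are appended as-is, so keys containing dots or
    spaces produce ambiguous paths. Empty containers count as leaves.
    """
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars (and empty dicts/lists) are printed as JSON literals.
        yield f"{path} = {json.dumps(value)}"


if __name__ == "__main__":
    doc = json.loads('{"files": [{"name": "a.txt", "size": 12}], "tags": []}')
    for line in flatten(doc):
        print(line)
```

Each output line carries its full path, so filtering the lines for a substring such as `name` finds matching fields wherever they are nested.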