this post was submitted on 18 Nov 2023
1 points (100.0% liked)

Data Hoarder

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time (tm) ). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

What's the archive type with the best compression (smallest size after archiving) that also supports partial extraction (extracting specific files)?

top 6 comments
[–] [email protected] 1 points 1 year ago (1 children)

Depends on your data, but there are two major contenders for that title: 7z (with solid mode off) and zpaq. You will probably get slightly better compression on zpaq, but it's not widely known.

[–] [email protected] 1 points 1 year ago

I tried with zpaq, but it told me that archive type did not support partial extraction.

[–] [email protected] 1 points 1 year ago (1 children)

That is kind of inconsequential as you can always compress the files individually if you wish and then make a tar with all of them together.
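A minimal sketch of that idea, assuming xz as the per-file compressor and Python's standard library for the tar handling. The helper names `pack_individually` and `extract_one` are made up for illustration, and the whole archive is kept in memory to keep the example short:

```python
import io
import lzma
import tarfile


def pack_individually(files: dict[str, bytes]) -> bytes:
    """Compress each file on its own with xz, then store the results in a plain tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:  # uncompressed tar container
        for name, data in files.items():
            compressed = lzma.compress(data)
            info = tarfile.TarInfo(name=name + ".xz")
            info.size = len(compressed)
            tar.addfile(info, io.BytesIO(compressed))
    return buf.getvalue()


def extract_one(archive: bytes, name: str) -> bytes:
    """Pull a single member out of the tar and decompress only that member."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        member = tar.extractfile(name + ".xz")  # raises KeyError if missing
        if member is None:
            raise KeyError(name)
        return lzma.decompress(member.read())
```

Because every member is its own xz stream, getting one file back never requires decompressing the others.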

The question is what files you have; depending on that, various algorithms will do better or worse. And of course, not using solid archives adds a penalty with most algorithms if the files are similar to each other.
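To get a feel for that penalty, here is a rough sketch that compares one xz stream over all files ("solid") against one xz stream per file. The sample data is invented and deliberately repetitive, and `solid_vs_individual` is a hypothetical helper, so treat the numbers as an illustration only:

```python
import lzma


def solid_vs_individual(files: list[bytes]) -> tuple[int, int]:
    """Return (solid_size, individual_size) in bytes for xz compression."""
    solid = len(lzma.compress(b"".join(files)))  # one stream over all files
    individual = sum(len(lzma.compress(f)) for f in files)  # one stream per file
    return solid, individual


# Made-up sample: 20 small "files" that share most of their content.
sample = [b"header: common boilerplate text\n" * 50 + str(i).encode() for i in range(20)]
solid_size, individual_size = solid_vs_individual(sample)
print(f"solid: {solid_size} bytes, individual: {individual_size} bytes")
```

On data like this the solid stream comes out smaller, because the compressor can reuse what it has already seen across file boundaries. Running the same comparison on a sample of your own files shows how much the penalty matters for you.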

[–] [email protected] 1 points 1 year ago

images and videos

mostly jpg png mp4 webm

[–] [email protected] 1 points 1 year ago

It depends on the dataset. I would suggest 7z and simply unchecking "solid archive". There is info here on running a test to find the best compression: Link
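This is not the test from the link, just a sketch of what such a test could look like. 7z and zpaq are not in Python's standard library, so the codecs below (zlib, bz2, lzma) are stand-ins, and `compare_codecs` and `best_codec` are hypothetical helper names:

```python
import bz2
import lzma
import zlib


def compare_codecs(data: bytes) -> dict[str, int]:
    """Compress the same bytes with each codec and return the resulting sizes."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data, preset=9)),
    }


def best_codec(data: bytes) -> str:
    """Name of the codec that produced the smallest output for this input."""
    sizes = compare_codecs(data)
    return min(sizes, key=sizes.get)
```

Feed it the bytes of a few representative files from your own collection (for example `compare_codecs(open(path, "rb").read())`) and compare the sizes against the original file size.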

You may also want to look into filesystem compression, as it will be much easier to implement and may suit your needs.

[–] [email protected] 1 points 1 year ago

It's been a while since I've looked, but you might consider pixz:

https://github.com/vasi/pixz