this post was submitted on 03 Oct 2023

if you could pick a standard format for a purpose what would it be and why?

e.g. flac for lossless audio because...

(yes you can add new categories)

summary:

  1. photos .jxl
  2. open domain image data .exr
  3. videos .av1
  4. lossless audio .flac
  5. lossy audio .opus
  6. subtitles srt/ass
  7. fonts .otf
  8. container mkv (doesn't contain .jxl)
  9. plain text utf-8 (many also say markup but disagree on the implementation)
  10. documents .odt
  11. archive files (this one is causing a bloodbath so i picked randomly) .tar.zst
  12. configuration files toml
  13. typesetting typst
  14. interchange format .ora
  15. models .gltf / .glb
  16. daw session files .dawproject
  17. otdr measurement results .xml
[–] [email protected] 14 points 1 year ago* (last edited 1 year ago) (1 children)

.tar is pretty bad as it lacks an index, making it impossible to quickly seek around in the file. The compression on top adds another layer of complication. It might still work great as a tape archiver, but for sending files around the Internet it is quite horrible. It's really just getting dragged around for cargo cult reasons, not because it's good at the job it is doing.
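
To illustrate the missing index, here is a minimal Python sketch. It assumes a plain ustar archive containing only regular files (no pax or GNU extension headers), and `build_tar` and `scan_index` are hypothetical helper names, not part of any library. The point is that the only way to learn where a member sits is to read each 512-byte header in turn and skip over the data that follows it.

```python
import io
import tarfile

BLOCK = 512  # tar stores headers and data in 512-byte blocks


def build_tar(files):
    """Return the bytes of an uncompressed ustar archive for a name -> bytes mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def scan_index(raw):
    """Walk the headers front to back and record (data_offset, size) per member.

    Simplification: only the name and size fields of a ustar header are read.
    """
    index = {}
    pos = 0
    while pos + BLOCK <= len(raw):
        header = raw[pos:pos + BLOCK]
        if header == b"\0" * BLOCK:  # an all-zero block marks the end of the archive
            break
        name = header[0:100].rstrip(b"\0").decode()
        size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)  # size is stored as octal text
        index[name] = (pos + BLOCK, size)
        # Jump to the next header: one header block plus the data rounded up to whole blocks.
        pos += BLOCK + ((size + BLOCK - 1) // BLOCK) * BLOCK
    return index


raw = build_tar({"a.txt": b"hello", "b.txt": b"world!!"})
index = scan_index(raw)
offset, size = index["b.txt"]
print(raw[offset:offset + size])
```

On an uncompressed file the skips are at least cheap seeks. Once the tar is wrapped in a stream compressor, reaching the last header means decompressing everything in front of it.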

In general I find the archive situation a little annoying, as archives are largely unnecessary: that's what we have directories for. But directories don't exist as far as HTML is concerned, and only single files can be downloaded easily. So everything has to get packed and unpacked again, for absolutely no reason. It's a job computers should handle transparently in the background, not an explicit user action.

Many file managers try to add support for .zip and let you browse into an archive as if it were a folder, but that abstraction is always quite leaky and never as smooth as it should be.

[–] [email protected] 6 points 1 year ago* (last edited 1 year ago)

.tar is pretty bad as it lacks an index, making it impossible to quickly seek around in the file.

.tar.pixz/.tpxz has an index, uses LZMA, and permits parallel compression/decompression (increasingly important on modern processors).

https://github.com/vasi/pixz

It's packaged in Debian, and I assume other Linux distros.

Only downside is that GNU tar doesn't have a single-letter shortcut to use pixz as a compressor, the way it does "z" for gzip, "j" for bzip2, or "J" for xz (LZMA); gotta use the more-verbose "-Ipixz".
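
For anyone wondering why an index plus independently compressed blocks gives you both random access and parallelism, here is a rough Python sketch of the general idea. This is not pixz's actual on-disk layout or code; `compress_blocks` and `read_block` are hypothetical names, and the index here is just a Python list kept next to the data.

```python
import lzma


def compress_blocks(data, block_size):
    """Compress each block as its own xz stream and remember where it landed."""
    blob = bytearray()
    index = []  # one (compressed_offset, compressed_length) pair per block
    for start in range(0, len(data), block_size):
        packed = lzma.compress(data[start:start + block_size])
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_block(blob, index, n):
    """Decompress only block n, without touching the blocks before it."""
    offset, length = index[n]
    return lzma.decompress(blob[offset:offset + length])


data = bytes(range(256)) * 64          # 16384 bytes of sample input
blob, index = compress_blocks(data, 4096)
third = read_block(blob, index, 2)     # covers bytes 8192..12287 of the input
```

Since no block depends on another, each one could be handed to a different core, and the concatenated streams still decompress as one ordinary xz file.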

Also, while I don't recommend it, IIRC gzip has a limited range over which the effects of compression can propagate (DEFLATE back-references reach at most 32 KiB back), so even if you aren't intentionally trying to provide random access, there is software that leverages this to hack in random access as well. I don't recall whether someone has rigged it up with tar and indexing, but I suppose if someone were specifically determined to use gzip, one could go that route.