this post was submitted on 18 Dec 2024

zcat shouldn't error out if you try to zcat an uncompressed file, it should just output the damned file ! (lemmy.ml)

submitted 3 months ago by [email protected] to c/[email protected]

There I said it !

[–] [email protected] 25 points 3 months ago (1 children)

I agree. zgrep also works for uncompressed files, so we could use e.g. zgrep ^ instead of zcat.
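A minimal sketch of that substitution, assuming the zgrep that ships with GNU gzip is installed. The directory and file names are made up for the demo.

```shell
# Demo files (hypothetical names): one plain log and one gzip-compressed log.
dir=$(mktemp -d)
printf 'plain line\n' > "$dir/app.log"
printf 'compressed line\n' | gzip -c > "$dir/app.log.1.gz"

# '^' matches every line, so zgrep behaves like a cat that also decompresses.
# -h drops the "filename:" prefix that is added when several files are given.
zgrep_out=$(zgrep -h '^' "$dir/app.log" "$dir/app.log.1.gz")
printf '%s\n' "$zgrep_out"
```

Without -h, each output line is prefixed with the name of the file it came from, which may or may not be what you want when searching logs.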

[–] [email protected] 12 points 3 months ago

Thanks, didn't know that existed

That's basically everything I was looking for !

[–] [email protected] 14 points 3 months ago (2 children)

Yeah, it's a pain. Leads to bad one-liners:

for i in $(ls); do zcat $i || cat $i; done

[–] [email protected] 10 points 3 months ago* (last edited 3 months ago) (2 children)

Btw, don't parse ls. Use find |while read -r instead.

find -maxdepth 1 -name "term" -print |while read -r file
   do zcat "$file" 2>/dev/null || cat "$file"
done

[–] [email protected] 4 points 3 months ago (2 children)

Won't this cause cat to iterate through all files in the cwd once zcat encounters an issue, instead of just the specific file?

[–] [email protected] 1 points 3 months ago

Yeah, I was tired and had $file there first, then saw that you wanted to cat everything in the directory. Still tired, but I think this works now.

[–] [email protected] 1 points 3 months ago* (last edited 3 months ago)

You are correct. This probably produces something closer to what you'd want the original command to do, but with better safety:

find -- . -type f -regex '^\./[^/]*$' -exec sh -c -- 'for file in "${@}"; do zcat "${file}" || cat "${file}" || exit; done' sh '{}' '+'

That assumes you want to interact with files with names like .hidden.txt.gz though. If you don't, and only intend to have a directory with regular files (as opposed to directories or symbolic links or other types of file), using this is much simpler and even safer, and avoids using files in a surprising order:

for i in *; do zcat -- "$i" || cat -- "$i" || exit; done

Of course, the real solution is to avoid using the Shell Command Language at all, and to carefully adapt any program to your particular problem as needed: https://sipb.mit.edu/doc/safe-shell/

[–] [email protected] 1 points 3 months ago* (last edited 3 months ago) (1 children)

You can just do for f in * (or other shell glob), unless you need find's fancy search/filtering features.

The shell glob isn't just simpler, but also more robust, because it works also when the filename contains a newline; find .. | while read -r will crap out on that. Also apparently you want while IFS= read -r because otherwise read might trim whitespace.

If you want to avoid that problem with the newline and still use find, you can use find -exec or find -print0 .. | xargs -0, or find -print0 .. | while IFS= read -r -d ''. I think -print0 is not standard POSIX though.
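A small sketch of the find -exec variant, showing that a file name containing a newline survives intact. The file names are invented for the demo, -maxdepth is a GNU/BSD extension rather than POSIX, and gzip -cd stands in for zcat.

```shell
# Demo directory with an awkward (hypothetical) file name containing a newline.
dir=$(mktemp -d)
nl='
'
printf 'plain\n' > "$dir/a${nl}b.log"
printf 'zipped\n' | gzip -c > "$dir/c.log.gz"

# find passes each name to sh as its own argument, so names are never
# split on spaces or newlines. The trailing sort only makes the demo
# output order predictable.
safe_out=$(find "$dir" -maxdepth 1 -type f -exec sh -c '
  for f do
    gzip -cd -- "$f" 2>/dev/null || cat -- "$f"
  done' sh {} + | sort)
printf '%s\n' "$safe_out"
```

The same loop fed by find | while read -r would see the first file as two separate names, "a" and "b.log".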

[–] [email protected] 1 points 3 months ago (1 children)

because it works also when the filename contains a newline

Doesn't that depend on the shell?

[–] [email protected] 1 points 3 months ago

I don't think so and have never heard that, but I could be wrong.

[–] [email protected] 2 points 3 months ago* (last edited 3 months ago)

Thanks !

But still we shouldn't have to resort to this !

~~Also, can't get the output through pipe~~

for i in $(ls); do zcat $i || cat $i; done | grep mysearchterm

~~this appears to work~~

~~find . -type f -print0 | xargs -0 -I{} sh -c 'zcat "{}" 2>/dev/null || cat "{}"' | grep "mysearchterm"~~

~~Still, that was a speed bump that I guess everyone dealing with mass compressed log files has to figure out on the fly because zcat can't read uncompressed files ! argg !!!~~

for i in $(ls); do zcat $i 2>/dev/null || cat $i; done | grep mysearchterm

[–] [email protected] 10 points 3 months ago (1 children)

How do you propose zcat tell the difference between an uncompressed file and a corrupted compressed file? Or are you saying if it doesn't recognize it as compressed, just dump the source file regardless? Because that could be annoying.

[–] [email protected] 2 points 3 months ago (1 children)

Even a corrupt compressed file has a very different structure relative to plain text. "file" already has the code to detect exactly which.

Still, failing on corrupted compression instead of failing on plaintext would be an improvement.
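One way to sketch that behaviour in shell is to look only at the two-byte gzip magic number (1f 8b), which is far cruder than what file does. The function name smartcat is hypothetical, and a plain file that happens to start with those two bytes would be misclassified.

```shell
# smartcat (hypothetical name): decompress anything that starts with the
# gzip magic bytes 1f 8b, and copy everything else through unchanged.
smartcat() {
  for f do
    # Read the first two bytes as hex, e.g. "1f8b" for a gzip file.
    magic=$(od -An -tx1 -N2 -- "$f" | tr -d ' \n')
    if [ "$magic" = 1f8b ]; then
      gzip -cd -- "$f" || return   # looks like gzip: corruption is an error
    else
      cat -- "$f" || return        # not gzip: behave like cat
    fi
  done
}
```

Called as smartcat /var/log/apache2/*, this fails on a damaged .gz file but not on plain text, which is the trade-off described above.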

[–] [email protected] 4 points 3 months ago (1 children)

What even is plain text anymore? If you mean ASCII, ok, but that leaves out a lot. Should it include a minimal utf-8 detector? Utf-16? The latest goofy encoding? Should zcat duplicate the functionality of file? Generally, unix-like commands do one thing, and do it well, combining multiple functions is frowned upon.

[–] [email protected] 0 points 3 months ago

I wouldn't call all this hoop jumping to read common log files "doing it better".

This is exactly the kind of arcane tinkering that makes everything a tedious time wasting chore on linux.

At this point it's accepted that text files get zipped, so that should be handled transparently; there's no need to be precious about kilobits of logic storage as if we were still stuck on an 80386 with 4 megs of RAM.

[–] fool 4 points 3 months ago* (last edited 3 months ago) (1 children)

just use -f lol.

less $(which zcat) shows us a gzip wrapper. So we look through gzip options and see:

-f --force
Force compression or decompression. If the input data is not in a format recognized by gzip, and if the option --stdout is also given, copy the input data without change to the standard output: let zcat behave as cat.

party music

[–] [email protected] 2 points 3 months ago

That works great now I can zcat -f /var/log/apache2/*
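A quick way to check the -f behaviour on a mix of plain and compressed files. The file names are invented, and gzip -cdf is used directly since zcat is described elsewhere in this thread as a thin wrapper around gzip -cd.

```shell
# Demo files (hypothetical names) mimicking a rotated log directory.
dir=$(mktemp -d)
printf 'current entry\n' > "$dir/access.log"
printf 'rotated entry\n' | gzip -c > "$dir/access.log.1.gz"

# -c: write to stdout, -d: decompress,
# -f: copy input that is not in a recognized format through unchanged.
mixed_out=$(gzip -cdf -- "$dir/access.log" "$dir/access.log.1.gz")
printf '%s\n' "$mixed_out"
```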

[–] [email protected] 4 points 3 months ago (1 children)

I think that providing an exit status that is not 0 when zcat is used with an uncompressed file is useful. Though my opinion is less strong regarding whether it should write more text after an error occurred, it's probably more useful for a process to terminate quickly when an error occurred rather than risk a second error occurring and making troubleshooting harder.

I think that trying to change any existing documented features of widely used utilities will lead to us having less useful software in the future (our time is probably better spent making new programs and new documentation): https://www.jwz.org/doc/worse-is-better.html https://en.wikipedia.org/wiki/Worse_is_better

[–] [email protected] 0 points 3 months ago (2 children)

Not improving existing software leads to stagnation.

It's certainly a good part of why so much of linux is an awkward kludgy idiosyncratic mess to use.

Whatever the first implementation does ends up being a suicide pact by default.

Another option is to change cat to auto decompress compressed files, instead of printing gibberish.

[–] [email protected] 2 points 3 months ago

What operating system should I use with my laptop that isn't an awkward kludgy idiosyncratic mess? I would say that Windows has plenty of kludges, like having problems with certain file names. Many versions of macOS are UNIX® Certified Products (for example, macOS version 15.0 Sequoia on Intel-based Mac computers and on Apple silicon-based Mac computers), so it's surely not any less kludgy than Linux.

I suppose that it's not bad to change documentation to be more specific, and change a program such that it matches the new documentation and wouldn't cause any harm if it replaced all the existing versions of the program, but makes it possible to use the program to solve more problems. That would be to "add functionality in a backward compatible manner".

You are also free to create new programs that are not an exact replacement for existing programs, but can enable some people to stop using one or more other programs. That would not be what I describe as stagnation.

"The cat utility shall read files in sequence and shall write their contents to the standard output in the same sequence.", so I would be very annoyed if it did something different with a certain file but not others. I wouldn't say that the contents of a file and the contents after the file is expanded are the same. In fact, I expect that some people use cat to process compressed files, and changing how cat acts with compressed files would probably cause them a large amount of annoyance, and would needlessly make a lot of existing documentation incorrect.

[–] [email protected] 1 points 3 months ago (1 children)

Whatever the first implementation does ends up being a suicide pact by default.

I agree. Utilities such as rm, cat, cp, mv, dd, and many others don't necessarily have the interface I would prefer, but they are too widely used for it to be helpful to radically change them. It's somewhat unfortunate that these names are already reserved, but I don't think it's necessary to change them.

In the same way, I don't have a problem with packages having generic names but not actually being useful: I've read that the requests and urllib3 packages for Python aren't being maintained very well, but I don't mind that as long as I can accomplish things while following best practices.

Because of this, I'm not afraid to use names like "getRequest" or "result", especially if they were generated with an automatic refactoring, and I'm not upset when I see similarly generic names being used with source code I'm changing, since I know that the second name for something that's similar to an existing thing will have to actually be descriptive, but the first name is likely to not be.

I have another example of how I'd apply these thoughts: the process for developing v2+ modules for the Go programming language strikes me as inelegant, so I would probably prefer to just create an entirely new repository rather than try to attempt that.

[–] [email protected] 1 points 3 months ago (1 children)

Well in this particular case, zcat failing with error on uncompressed text isn't a behaviour worth preserving.

It should do the expected zcat behaviour, which is just print the text.

I have a hard time imagining a scenario where you call zcat and would prefer an error rather than a useable output

[–] [email protected] 1 points 3 months ago

I already expressed that quickly getting an exit status that isn't 0 after an issue is encountered is probably useful.

I can imagine that someone would find a program like this useful, and it depends on the presently common behavior of zcat, so I expect something similar is an important part of a system used by a corporation I interact with (and probably many more than I'd expect):

if
    zcat ./file.txt.gz >/dev/null
then
    process_file ./file.txt.gz
else
    printf '%s\n' "There was a decimal exit status of ${?}"
fi

A failure to understand whether something is useful is not a good reason to change it.
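For scripts that only need the yes/no answer, gzip also has a -t (--test) flag that checks integrity without writing any output. A rough sketch of the same kind of check using it, where the helper name is_valid_gzip and the file names are made up:

```shell
# is_valid_gzip (hypothetical helper name): exit status 0 only for intact gzip data.
is_valid_gzip() {
  gzip -t -- "$1" 2>/dev/null
}

dir=$(mktemp -d)
printf 'payload\n' | gzip -c > "$dir/file.txt.gz"
printf 'payload\n' > "$dir/file.txt"

# Report which of the two demo files passes the integrity test.
for f in "$dir/file.txt.gz" "$dir/file.txt"; do
  if is_valid_gzip "$f"; then
    printf '%s: valid gzip\n' "${f##*/}"
  else
    printf '%s: rejected\n' "${f##*/}"
  fi
done
```

A check like this keeps working regardless of whether the command used for reading is zcat or zcat -f.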

[–] [email protected] 2 points 3 months ago

Celeste. Are you here? In a future search maybe?

[–] [email protected] -4 points 3 months ago (2 children)

Well, the source code is available. Fix it if you need it that bad.

[–] fool 10 points 3 months ago

Man, I have a minor inconvenience.

installs Gentoo

[–] [email protected] 1 points 3 months ago (1 children)

Where is it? I can't seem to find it https://github.com/zCat?tab=repositories

[–] SteveTech 8 points 3 months ago* (last edited 3 months ago) (1 children)

It's part of GNU Gzip, and zcat is basically just a shell script that runs exec gzip -cd "$@", meaning you can actually just do cat /usr/bin/zcat to get the source.

[–] [email protected] 2 points 3 months ago

Or even zcat -f /usr/bin/zcat