Case-sensitive is easier to implement; a filename is just a string of bytes. Case-insensitive requires a lot of code to get right, since it has to interpret symbols that make sense to humans. So, something I've wondered about:
That's not hard for ASCII, but what about Unicode? Is the precomposed ç treated the same lexically and by the API as Latin small letter c + combining cedilla? Does the OS normalize all of one form to the other? Is ß the same as SS? What about alternate glyphs, like half-width or full-width forms? Is it i18n-sensitive, so that, say, E and É are treated the same in a French localization? Are Katakana and Hiragana characters equivalent?
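For anyone curious, Python's stdlib can demonstrate most of these equivalences. This is a sketch of the general techniques (normalization and case folding), not what any particular filesystem actually does:

```python
import unicodedata

# Precomposed ç (U+00E7) vs. c (U+0063) followed by combining cedilla (U+0327):
precomposed = "\u00e7"
decomposed = "c\u0327"
print(precomposed == decomposed)                                # False: different code points
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True after normalization

# Case folding (not plain lowercasing) is what maps ß to "ss":
print("straße".lower())                             # straße -- lower() leaves ß alone
print("STRASSE".casefold() == "straße".casefold())  # True

# Compatibility normalization (NFKC) folds full-width forms to ASCII:
print(unicodedata.normalize("NFKC", "Ｕｎｉｘ"))    # Unix

# NFKC maps half-width Katakana to full-width, but never Katakana to Hiragana:
print(unicodedata.normalize("NFKC", "ｶﾞ"))          # ガ
```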
I dunno, as a long-time Unix and Linux user, I haven't tried these things, but it seems odd to me to build a set of character equivalences into the filesystem code unless you're going to do all of them. (But then, they're idiosyncratic and may conflict between languages, like how ö is its own letter in the Swedish alphabet.)
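The language conflicts are easy to see in collation; that's ordering rather than equality, but it shows the same locale-dependence. A quick experiment, assuming the sv_SE.UTF-8 and de_DE.UTF-8 locales are generated on the system:

```python
import locale

words = ["orm", "ost", "zebra", "öl"]

# locale.setlocale raises locale.Error if the locale isn't installed.
for loc in ("sv_SE.UTF-8", "de_DE.UTF-8"):
    locale.setlocale(locale.LC_COLLATE, loc)
    print(loc, sorted(words, key=locale.strxfrm))

# Swedish sorts ö after z as its own letter; German collates it with o.
```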
Yeah the us defaultism really shows here.
More characters than Ascii? Surely you must be mistaken.
This thread is giving me flashbacks to the times before Unicode, when swapping files between Windows and Linux partitions had a good chance of fucking up every non-ASCII character in their names.
There were ways to set things up so the ISO character sets would match, but it was still a giant pain to deal with different ones.
Blessed be Unicode.
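For the curious, the failure mode was just the same bytes being read under two different code pages. A minimal Python reconstruction (this codepage pair is only illustrative):

```python
# A filename written as ISO-8859-1 bytes (common on Linux back then) and
# read back under CP437, a classic DOS/Windows codepage:
name = "Grüße"
print(name.encode("iso-8859-1").decode("cp437"))  # Grⁿ▀e
```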
A related issue I still see very often, even with files newly created this year, is extracting zip files on my Linux systems that contain non-ASCII filenames and were created on Windows, especially machines with apparently non-English locales like Japanese. I have to trial-and-error the locale I give to unzip, and sometimes hack together fixed names with iconv, until the mojibake seems to fix itself.
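If it helps anyone, here's the usual repair in Python; the archive name and the cp932 guess (Shift_JIS, as used on Japanese Windows) are assumptions about the archive at hand:

```python
import zipfile

# Python decodes zip entry names as CP437 unless the entry's UTF-8 flag
# (bit 11 of the general-purpose flags) is set, so re-encoding to CP437
# recovers the original name bytes, which can then be decoded with the
# creator's likely codepage.
with zipfile.ZipFile("archive.zip") as zf:  # hypothetical archive name
    for info in zf.infolist():
        if info.flag_bits & 0x800:          # name was stored as UTF-8 already
            fixed = info.filename
        else:
            fixed = info.filename.encode("cp437").decode("cp932", errors="replace")
        print(fixed)
```

On Python 3.11+ there's also `zipfile.ZipFile(path, metadata_encoding="cp932")`, which applies the alternate decoding up front.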