this post was submitted on 30 Jan 2024
Technology

The SHA-1 hash for 64test64xa is 6779c53432b8badf049bb9d8924a5785dd887243, which is 40 characters using only hexadecimal (10 digits and 6 letters). How long would it be if it used all 26 letters of the Latin alphabet? What if it also distinguished between UPPER and lower case?

[email protected] 31 points 9 months ago

So, hexadecimal uses 16 characters. Each character stores 4 bits of data (2⁴ = 16).

If you use the 10 digits and 26 letters of the Latin alphabet, the resulting encoding is called Base36.

It is a rather impractical format for storing data, though, because for simple conversion the number of symbols should be a power of 2. That way a program can use (quick) bit shifts instead of (slow, especially on big numbers) division to determine which character to use. That's why Base36 is mostly used to encode numbers, and not large sequences of data.
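To make the division point concrete, here is a minimal Python sketch of Base36 encoding by repeated division. The function name `to_base36` and the lowercase alphabet are my own choices for illustration, not a standard API.

```python
import string

# Assumed alphabet for illustration: 0-9 followed by a-z (36 symbols).
ALPHABET = string.digits + string.ascii_lowercase

def to_base36(data: bytes) -> str:
    """Treat the bytes as one big integer and divide by 36 repeatedly."""
    n = int.from_bytes(data, "big")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)  # big-number division on every step
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

# The SHA-1 digest from the question, as raw bytes.
digest = bytes.fromhex("6779c53432b8badf049bb9d8924a5785dd887243")
encoded = to_base36(digest)
print(len(encoded), encoded)
```

For this digest the result is 31 characters long, versus 40 in hex. Python's built-in `int(encoded, 36)` converts it back to the original number.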

Base32 is a slightly smaller variant that fits 5 bits of data into one character (2⁵ = 32).
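As a rough sketch of the bit-shift approach, this is how a Base32 encoder can peel off 5 bits at a time without any division. It assumes the RFC 4648 alphabet and leaves out the `=` padding; `base32_by_shifting` is a made-up name for illustration.

```python
import base64

# RFC 4648 Base32 alphabet: A-Z followed by 2-7.
B32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def base32_by_shifting(data: bytes) -> str:
    """Encode bytes 5 bits at a time using only shifts and masks (no padding)."""
    out = []
    buffer = 0  # holds bits that have not been emitted yet
    bits = 0    # how many bits in the buffer are still pending
    for byte in data:
        buffer = (buffer << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append(B32_ALPHABET[(buffer >> bits) & 0b11111])
        buffer &= (1 << bits) - 1  # keep only the leftover bits
    if bits:
        # Pad the final partial group with zero bits on the right.
        out.append(B32_ALPHABET[(buffer << (5 - bits)) & 0b11111])
    return "".join(out)

digest = bytes.fromhex("6779c53432b8badf049bb9d8924a5785dd887243")
shifted = base32_by_shifting(digest)
print(len(shifted), shifted)
```

A 20-byte SHA-1 digest is exactly 160 bits, so it splits into 32 groups of 5 bits with nothing left over, and the output should match the standard library's `base64.b32encode`.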

If you combine the digits with uppercase and lowercase letters (treating upper and lower case as distinct), you get 62 symbols. This is also an impractical number for computer purposes. But add two extra characters and you get 64, which is another nice power of two (2⁶ = 64), letting one character store 6 bits. And Base64 is a common encoding scheme for data.
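If you want to try it on the hash from the question, Python's standard `base64` module can do the conversion. A small sketch (the variable names are mine):

```python
import base64

# The SHA-1 digest from the question, as 20 raw bytes.
digest = bytes.fromhex("6779c53432b8badf049bb9d8924a5785dd887243")

# The standard Base64 alphabet is A-Z, a-z, 0-9 plus '+' and '/'.
b64 = base64.b64encode(digest).decode("ascii")
unpadded = b64.rstrip("=")  # '=' is only padding, it carries no data

print(len(b64), len(unpadded), b64)
```

Base64 works on groups of 3 bytes, and 20 bytes is not a multiple of 3, so the encoder appends one `=` as padding: 28 characters with it, 27 without.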


And when you know how many bits a character can fit, you can calculate how "efficient" the encoding will be and how many characters will be needed to store data. For the same data, Base32 needs 20% fewer characters than hexadecimal, and Base64 needs about 33.3% fewer (before padding and rounding up to whole characters).
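To put numbers on that for a 160-bit SHA-1 hash, here is a small sketch that finds the shortest length whose number of possible strings covers every 160-bit value. `chars_needed` is a hypothetical helper, and it uses integer arithmetic so alphabet sizes that are not powers of two (36, 62) work too.

```python
def chars_needed(bits: int, alphabet_size: int) -> int:
    """Smallest n such that alphabet_size**n >= 2**bits."""
    n = 0
    capacity = 1  # number of distinct strings of length n
    while capacity < 2 ** bits:
        capacity *= alphabet_size
        n += 1
    return n

SHA1_BITS = 160
lengths = {size: chars_needed(SHA1_BITS, size) for size in (16, 32, 36, 62, 64)}
for size, n in lengths.items():
    print(f"base {size}: {n} characters")
```

This gives 40 characters for hex, 32 for Base32, 31 for Base36, and 27 for both Base62 and Base64. The Base64 saving here is 32.5% rather than 33.3% because 160 is not a multiple of 6, so the last character is only partly used.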