Code Review

154 readers

1 users here now

Welcome to the programming.dev code review community! This is a place where you can review other people's code or request reviews on your own

Before posting please read the sections below (click on one to expand it)

What is a code review?

A code review is a process where people who did not write a piece of code look over it to find errors, suggest improvements, and make sure it follows coding standards

When doing a code review look for these aspects:

Correctness: Does the code behave as expected and meet the requirements?
Readability: Is the code easy to understand and maintain?
Efficiency: Are there more efficient ways to accomplish the same task?
Best Practices: Does the code follow established coding standards and best practices?
Security: Are there potential security vulnerabilities?
Scalability: Will the code perform well as the application scales?

Post Guidelines

Guidelines

Put the programming language you used for your code in square brackets ahead of your post title. e.g. [Python], [React], [C#], etc.
Provide a link to a spot where people can view your code (e.g. github, etc.) in your post body
In your post title put a small description of what the thing is that you want reviewed and then you can expand on that in more details if you want in your post body

Rules

Any code review requests you post after your first must be done after you have reviewed two other people's code (and then you need another two for each request after that as well).

Youre free to post your first code review request in this community though without needing to follow this

Credits

Icon base by Lorc under CC BY 3.0 with modifications to add a gradient

Wormhole

[email protected]

founded 1 year ago

MODERATORS

Vacant

[Rust] Implementation of an algorithm I invented - RANDEVU (lemm.ee)

submitted 6 months ago* (last edited 6 months ago) by [email protected] to c/code_review

10 comments fedilink hide all child comments

I came up with an algorithm and implemented it in Rust.
I want to ensure the code is of sufficient quality before releasing it on Crates.io and GitHub. You can also find more info about the algorithm and its purpose there.

I'm eager to hear your feedback and thoughts and am open to ANY suggestions that may help me improve the code further.
The code should be performant, clean, easy to use, idiomatic, etc.
I'm also looking for potentially better names for certain variables and stuff.

Note that some things have changed a bit since the last release (for example - the hash calculation has been changed from blake3(blake3(OBJECT) || blake3(DATE)) to blake3::keyed_hash(DATE, OBJECT) to improve performance by eliminating 2 out of 3 hash calculations), but the README is still valid (and hopefully at least somewhat decent source of info about the algorithm - I did my best trying to explain stuff, but I'm still looking to improve it further in the future).

The functions can panic if they are given a date where the year is outside of the 0000 to 9999 range.
I'm not sure if I should make the function return the result/option instead of panicking since I'd like to avoid having to unwrap() the output and would prefer if the function returned just a simple u32/chrono::NaiveTime (but I'm open to changing my mind on this one).

I'm also curious if there is a cleaner way to create a key from a date. Previously, I've been using format("%Y-%m-%d") to convert it into a string slice and copy it into the key array, but I've found that approach to be very slow (over 50% of the CPU time was being spent just for that alone), so I opted out for an approach you can see below.

The newest function rdvt() is not yet documented anywhere, but I'll do my best to explain it here: In addition to an object and a date (same as rdv()), it also takes a rank (which is a positive integer - u32).
The function calculates the keyed blake3 hash of an object, using the date and a rank as key, and uses the resulting (pseudorandom) bits to generate and return a random uniform time (chrono::NaiveTime) between 0h and 24h (down to nanosecond precision).
Time is calculated by taking the first bit of the hash and in case it's a binary one - 12h is added to the time, then we add 6h if the 2nd bit of the hash is a one, 3h for the 3rd bit, 1.5h for 4th and so on until the increment reaches the small enough value where it doesn't contribute anything to the time (when it becomes less than 1ns, essentially). This means if all of the bits in the hash were zeros - time would be zero, and if they were all ones - time would be 23:59:59:999:999:999h (the very last and highest possible value). The code short-circuits and stops earlier than going through all 256 bits since we usually only need around 46 bits before the increment becomes smaller than 1ns (the code stops only in case the sum of tiny sub 1ns increments can't contribute enough to change the last digit in the total time (even if all of the rest of the bits in the hash were to be ones))):

Feel free to ask ANY questions regarding the code, the algorithm, its function, use cases, or anything else you'd like explained or clarified.

Here's the code (I haven't written any tests yet, so they are not included for review):

//! The official Rust implementation of the [RANDEVU](https://github.com/TypicalHog/randevu) algorithm
//!
//! # Example
//! ```rust
//! use chrono::Utc;
//! use randevu::{rdv, rdvt};
//!
//! fn main() {
//!     let object = "THE_SIMPSONS";
//!     let date = Utc::now();
//!     let rdv = rdv(object, &date);
//!     let rdvt = rdvt(0, object, &date);
//!
//!     println!("Object {} has RDV{} today with RDVT0 at {:?}", object, rdv, rdvt);
//! }
//! ```

use blake3;
use chrono::{DateTime, Datelike, NaiveTime, TimeDelta, Utc};
use itoa;

/// Returns the 32-byte KEY `[u8; 32]` created from a given DATE `&DateTime<Utc>` and an optional RANK `Option<u32>`
fn create_key(date: &DateTime<Utc>, rank: Option<u32>) -> [u8; 32] {
    let mut key = [0u8; 32];

    let mut year = Datelike::year(date);
    let mut month = Datelike::month(date);
    let mut day = Datelike::day(date);

    let mut year_len = 4;
    let mut prefix_len = 0;

    // Add a prefix (-/+) if the year is not between 0 and 9999 (-YYYY-MM-DD / +YYYY-MM-DD)
    if year < 0 {
        key[0] = b'-';
        prefix_len = 1;

        year = year.abs(); // Make year positive
    } else if year > 9999 {
        key[0] = b'+';
        prefix_len = 1;
    }

    // Adjust year_len for very large years (both positive and negative)
    if year > 9999 {
        year_len += 1;
        if year > 99999 {
            year_len += 1;
        }
    }

    let full_year_len = prefix_len + year_len;

    // If a rank is provided, write it into the key after the date, separated by an '_'
    if rank != None {
        let mut buffer = itoa::Buffer::new();
        let rank_str = buffer.format(rank.unwrap());
        key[7 + full_year_len..7 + full_year_len + rank_str.len()]
            .copy_from_slice(&rank_str.as_bytes()[..rank_str.len()]);

        key[6 + full_year_len] = b'_';
    }

    // Write the day into the key
    key[5 + full_year_len] = b'0' + (day % 10) as u8;
    day /= 10;
    key[4 + full_year_len] = b'0' + day as u8;

    key[3 + full_year_len] = b'-';

    // Write the month into the key
    key[2 + full_year_len] = b'0' + (month % 10) as u8;
    month /= 10;
    key[1 + full_year_len] = b'0' + month as u8;

    key[full_year_len] = b'-';

    // Write the year into the key
    for i in (prefix_len..full_year_len).rev() {
        key[i] = b'0' + (year % 10) as u8;
        year /= 10;
    }

    key
}

/// Returns the RDV value `u32` for an OBJECT `&str` on a specific DATE `&DateTime<Utc>`
///
/// **RDV = number of leading zero bits in blake3::keyed_hash(key: DATE, data: OBJECT)**
pub fn rdv(object: &str, date: &DateTime<Utc>) -> u32 {
    let hash = blake3::keyed_hash(&create_key(date, None), object.as_bytes());

    // Count the number of leading zero bits in the hash
    let mut rdv = 0;
    for &byte in hash.as_bytes() {
        rdv += byte.leading_zeros();

        if byte != 0 {
            break;
        }
    }

    rdv
}

/// Returns the RDVT time `DateTime<Utc>` of a given RANK `u32` for an OBJECT `&str` on a specific DATE `&DateTime<Utc>`
pub fn rdvt(rank: u32, object: &str, date: &DateTime<Utc>) -> DateTime<Utc> {
    let hash = blake3::keyed_hash(&create_key(date, Some(rank)), object.as_bytes());

    // Calculate the time using bits from the hash
    let mut total: f64 = 0.0;
    let mut increment = 12.0 * 60.0 * 60.0 * 1_000_000_000.0; // 12h in nanoseconds
    for (i, byte) in hash.as_bytes().iter().enumerate() {
        for j in (0..8).rev() {
            let bit = (byte >> j) & 1;
            if bit == 1 {
                total += increment;
            }
            increment /= 2.0;
        }
        // Stop once increments become too small to affect the total
        if i > 4 && (2.0 * increment) < (1.0 - total.fract()) {
            break;
        }
    }

    // Construct the RDVT time from total
    let rdvt = date.with_time(NaiveTime::MIN).unwrap() + TimeDelta::nanoseconds(total as i64);

    rdvt
}

you are viewing a single comment's thread
view the rest of the comments

[–] sukhmel 3 points 6 months ago* (last edited 6 months ago) (9 children)

I haven't finished reading the post yet, but I must say: consider publishing it on GitHub (or any other public repository) as it is, no need to make the code super ideal first. Collaboration is exactly what those platforms are good for.

Edit: I mistook you text as not desiring to publish at all before refining. I see that you already have a version published, maybe a branch with updated code could do?

Edit2: NaiveDate omits timezone, I would expect that to cause problems if the code is used to get same rendezvous times independently on different devices

Edit3: instead of bit arithmetic I would've used a precomputed table or a macro that generates the same code as a table. So that it will be something like

match byte {
    0 => {}
    1 => /* add increment*/
    …
    8 => /* add increment*(2 - 1/2⁷) */
}
/* divide increment by 2⁸ */

Maybe it makes sense to do bitshift instead of dividing, maybe it makes sense to instead go from lowest to highest and multiply, but then stop condition should be calculated in advance

Edit4: also, I don't understand the stopping condition of 2*increment > 1 - total.fraq()

[–] ExperimentalGuy 1 points 6 months ago (4 children)

I don't know if it's because I'm stupid but I read through this post and the GitHub and I still don't really understand what it does. Could you eli5?

[–] sukhmel 1 points 6 months ago (2 children)

It lets several people choose the same time for contacting each other when everyone knows some specific event name.

At least I think so, maybe there's more to it, I wasn't very attentive

[–] ExperimentalGuy 1 points 6 months ago (1 children)

Oh so it's like knowing the password for a repeating event?

[–] sukhmel 1 points 6 months ago

Maybe ¯\_(ツ)_/¯

load more comments (1 replies)

load more comments (5 replies)