this post was submitted on 20 Feb 2024
Technology
We made a tag that can't be reliably and deterministically scanned, so we also included a machine learning model that takes a good guess at it.
I just don't see how you could possibly rely on a black box model for anything important. You have no way to mathematically prove whether there are collisions in the model output, and newer versions of the model can't be made backwards compatible. So if you have a database of thousands of these scanned tags, and then they discover a critical vulnerability and ship a new model, you're SOL and everything you have is worthless.
Can you imagine if your house's doorknob had to think about the shape of your key before letting you in, and could just say "No. Not today."?
If there were collisions in the output, you'd see them while scanning those thousands of entries. And if they release a new model, you can use it going forward and keep scanning the old items with the old one.
This happens in inventory sometimes: new technology comes out and you have to update asset tags.
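A rough sketch of what that could look like in practice, assuming each scan boils down to some fingerprint string and you store the model version next to it. Every name here (`ScanRecord`, `TagRegistry`, the version labels) is made up for illustration and has nothing to do with the actual tag system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRecord:
    item: str            # hypothetical asset name
    model_version: str   # which model produced the fingerprint
    fingerprint: str     # hypothetical model output for the tag


class TagRegistry:
    """Keeps fingerprints per model version and flags collisions on enroll."""

    def __init__(self):
        self._by_key = {}

    def enroll(self, record):
        key = (record.model_version, record.fingerprint)
        existing = self._by_key.get(key)
        # Two different items with the same fingerprint under the same
        # model is exactly the collision you would notice while scanning.
        if existing is not None and existing.item != record.item:
            raise ValueError(
                f"collision: {record.item!r} vs {existing.item!r} "
                f"under model {record.model_version}"
            )
        self._by_key[key] = record

    def lookup(self, model_version, fingerprint):
        # Old items are looked up with the model version they were
        # enrolled under; new items use the newer version.
        return self._by_key.get((model_version, fingerprint))
```

This only catches collisions among items you've actually enrolled. It says nothing about whether someone could deliberately forge one.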
Tell me you've never developed commercial security software without telling me. "If it works a few thousand times without collisions, it should be reliable enough." That's not even good enough for tamper-proof seals on medication and yogurt jars, let alone applications that require the sender and recipient to use a dedicated terahertz scanner to validate.
.... Damn AI fanboys smh
Nobody said anything about security applications, lol. It's a proof of concept and you're getting all worked up over complete hypotheticals. Where did you even get the idea that it would have collisions within thousands?
In security applications you need to account for actors with a marked interest in causing a collision, but in an inventory scenario you simply generate IDs randomly until you get one that's not a duplicate. There's no problem using a hash algorithm with collisions if the probability is small enough. There are tons of scientific labs using MD5, btw.
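To put rough numbers on "small enough", here's a minimal sketch using the standard birthday-bound approximation. It assumes IDs are uniformly random over a 128-bit space (the size of an MD5 digest), which a learned model's output may well not be, and it ignores adversaries entirely. The function names are just for illustration:

```python
import math
import secrets


def collision_probability(n_items, bits):
    """Birthday-bound approximation of the chance that at least two of
    n_items uniformly random `bits`-bit IDs are identical."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1) / (2N)); expm1 keeps precision when the result is tiny.
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


def new_unique_id(existing, n_bytes=16):
    """Draw random hex IDs until one is not already in `existing`."""
    while True:
        candidate = secrets.token_hex(n_bytes)
        if candidate not in existing:
            existing.add(candidate)
            return candidate


if __name__ == "__main__":
    # A million inventory items in a 128-bit ID space.
    print(collision_probability(10**6, 128))
    ids = set()
    print(new_unique_id(ids))
```

The retry loop is the "generate until it's not a duplicate" part. With a space that large it will essentially never loop more than once.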
It's used to identify similarities in glue patterns. In what way wouldn't this be backwards compatible? New versions would just be better at it.