RonSijm

joined 2 years ago
[–] RonSijm 4 points 4 months ago (1 children)

It probably depends on the level of the criminals and organized crime groups. I saw this YouTube video a couple of weeks ago that covers the history of how organized crime groups were using encrypted communication: https://www.youtube.com/watch?v=gigIOc_0PKo (And how they were honey-potted by the FBI into using an FBI-hosted service, lol)

Organized crime groups that make hundreds of millions should be capable enough to hire skilled developers and sysops to host self-managed services. At some point, if they make enough money, investing in self-managed communication becomes preferable to using Telegram or Signal.

[–] RonSijm 19 points 4 months ago (3 children)

No one's questioning why he's sorting it twice?

[–] RonSijm 6 points 5 months ago (3 children)

Also some feedback that's a bit more technical, since I was trying to see how it works. More of a suggestion, I suppose.

It looks like you're looping through the documents and asking it for known tags, right? ({str(db.current_library.tags)}.)

I don't know if I would do this through a chat completion and a chat response. There are dedicated features for keyword-like searching, like embeddings. They're a lot faster, and also probably way cheaper, since you're paying barely anything for embeddings compared to chat tokens.

So the common way to do something like this in AI would be to use vectors and embeddings: https://platform.openai.com/docs/guides/embeddings

So you'd ask for an embedding (a vector) for each of your tags first. Then you ask for an embedding of each document.

Then you can do a nearest neighbor search against the tag vectors, and see how closely each tag matches the document.
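
A rough sketch of that last step, assuming you already have the vectors. The three-dimensional vectors, the tag names, and the `rank_tags` helper are made up for illustration (real embeddings have hundreds or thousands of dimensions and come from the embeddings API), and cosine similarity is just one common way to measure "nearest":

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_tags(document_vector, tag_vectors):
    """Return (tag, score) pairs, best match first (brute-force search)."""
    scores = [
        (tag, cosine_similarity(document_vector, vector))
        for tag, vector in tag_vectors.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical, hand-written vectors standing in for real embeddings.
tag_vectors = {
    "finance": [0.9, 0.1, 0.0],
    "cooking": [0.0, 0.2, 0.9],
    "travel": [0.1, 0.9, 0.2],
}
document_vector = [0.8, 0.3, 0.1]

ranked = rank_tags(document_vector, tag_vectors)
for tag, score in ranked:
    print(f"{tag}: {score:.3f}")
```

The tag embeddings only need to be computed once and can be cached, so per document you'd pay for one embedding call plus some cheap local math. You could then keep every tag above some similarity threshold, which you'd have to tune yourself.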

[–] RonSijm 2 points 5 months ago (1 children)

I'm a bit of a noob in hardware design, so maybe this is a stupid question, but why is an FPGA scary?

It would seem scarier to me if they had actually turned the FPGA design into a fabbed ASIC, right? That could maybe indicate they have some kinda plan to mass-produce them, no?

[–] RonSijm 2 points 5 months ago* (last edited 5 months ago)

Although I agree with the sentiment, the article mentions that it "only" concerns about 1 million people (probably South Korean users).

So it's still a $15 fine per violation. Could have been much higher, sure, but I don't know if that's a good return on investment for Facebook.

Maybe this case sets an example for other countries or regulatory bodies to start issuing fines to Facebook as well.

[–] RonSijm 4 points 5 months ago

I haven't used json(b) in a Spring app, so I can't say much about that.

json vs jsonb depends on the use case. Inserting json is faster than inserting jsonb. For reads that search on specific JSON properties, jsonb is faster, because jsonb is parsed into a more optimized tree.

From my experience, I don't really like doing selects based on JSON properties. If I know I'll be selecting on a certain property, I usually add an additional column next to the JSON column and insert that property there as well (at least in C#/.NET, with EF). The frameworks don't have that much support for selecting within JSON (you can do it, but proper columns are a lot more natively supported).
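
To show what I mean by the extra column, here's a minimal sketch. It uses Python with an in-memory SQLite database rather than Postgres and EF, purely so it runs anywhere, and the `orders` table, the `customer_id` property, and the `insert_order` helper are hypothetical names:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# The full document lives in `payload`; the property we filter on is
# duplicated into its own indexed column.
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL,"
    " payload TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

def insert_order(connection, payload):
    """Store the JSON document and copy out the property used for lookups."""
    connection.execute(
        "INSERT INTO orders (customer_id, payload) VALUES (?, ?)",
        (payload["customer_id"], json.dumps(payload)),
    )

insert_order(conn, {"customer_id": 7, "items": ["keyboard"]})
insert_order(conn, {"customer_id": 9, "items": ["mouse", "monitor"]})
insert_order(conn, {"customer_id": 7, "items": ["cable"]})

# The select uses the plain column, not a path inside the JSON.
rows = conn.execute(
    "SELECT payload FROM orders WHERE customer_id = ? ORDER BY id", (7,)
).fetchall()
orders_for_customer = [json.loads(row[0]) for row in rows]
print(orders_for_customer)
```

The trade-off is that the column and the JSON can drift apart if something updates one without the other, so the copy should happen in one place, like the insert helper here.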

[–] RonSijm 5 points 5 months ago

Nice. Does that mean I can take my 1980s computer case back off the shelf, and finally get to use a Turbo button again?

[–] RonSijm 2 points 5 months ago

I'm not entirely sure what you hope to achieve: have a GPG-encrypted subject, and have Thunderbird automatically understand that it's encrypted, so it can be automatically decrypted?

Since you're saying you're building software to support this, what are you building? A Thunderbird plugin that can do this? Or just standalone software that you want to make compatible with Thunderbird's default way of handling encryption?

[–] RonSijm 12 points 5 months ago (4 children)

There's a Python WASM runtime, if you really want to run Python in a browser for some reason...

https://github.com/wasmerio/wasmer-python

[–] RonSijm 56 points 5 months ago* (last edited 5 months ago)

Recruitment is now basically Dead Internet theory...

[–] RonSijm 5 points 6 months ago (1 children)

It gives an example:

For example, with the phrase “My favorite tropical fruits are __.” The LLM might start completing the sentence with the tokens “mango,” “lychee,” “papaya,” or “durian,” and each token is given a probability score. When there’s a range of different tokens to choose from, SynthID can adjust the probability score of each predicted token, in cases where it won’t compromise the quality, accuracy and creativity of the output.

So I suppose with a larger text, if all lists of things are "LLM sorted", it's an indicator.

That's probably not the only thing. If it can detect a bunch of these indicators, there's a higher likelihood it's LLM text.
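
A toy sketch of the general idea in that quote, as I understand it. This is not SynthID's actual algorithm, just a simplified keyed bias: a secret key decides which candidate tokens get a small boost, and a detector that knows the key checks how often the boosted tokens show up. The function names, the key, and the probabilities are all made up:

```python
import hashlib

def keyed_bit(key, context, token):
    """Pseudo-random 0/1 derived from the secret key, the context and the token."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return digest[0] & 1

def adjust_probabilities(probs, key, context, boost=2.0):
    """Multiply the probability of key-favored tokens by `boost`, then renormalize."""
    weighted = {
        token: p * (boost if keyed_bit(key, context, token) else 1.0)
        for token, p in probs.items()
    }
    total = sum(weighted.values())
    return {token: weight / total for token, weight in weighted.items()}

def favored_fraction(pairs, key):
    """Share of (context, token) pairs whose token was key-favored."""
    hits = sum(keyed_bit(key, context, token) for context, token in pairs)
    return hits / len(pairs)

KEY = "hypothetical-secret-key"
context = "My favorite tropical fruits are"
probs = {"mango": 0.4, "lychee": 0.3, "papaya": 0.2, "durian": 0.1}

adjusted = adjust_probabilities(probs, KEY, context)
for token, p in adjusted.items():
    print(f"{token}: {probs[token]:.2f} -> {p:.3f}")
```

Unwatermarked text should land around 0.5 in `favored_fraction` on average, while text sampled from the adjusted distributions should sit noticeably higher over many tokens. That fits the "bunch of indicators" idea: one token says nothing, a long text says a lot.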
