vcmj

joined 1 year ago
[–] vcmj 8 points 1 year ago* (last edited 1 year ago)

True that, he did do well on that front for a while, though. He got too confident.

[–] vcmj 22 points 1 year ago (8 children)

I remember hearing a while back that Musk made an executive decision at Tesla to not use LIDAR. I thought: "That's a stupid decision. At least invest in making it better if you think it's not sufficient," and I've had a quite negative view of his engineering abilities ever since. Seeing as a Tesla can be fooled by a projector these days, I'm willing to die on that hill. I will admit that he is an exceptional businessman, most people would piss away a fortune if given one, but an engineer he is not, not by a loooooooong way.

[–] vcmj 2 points 1 year ago (1 children)

I've not played with it much, but does it always describe the image first like that? I've been trying to think about how the image input actually works. My personal suspicion is that it uses an off-the-shelf visual understanding network (think reverse Stable Diffusion) to generate a description, then just uses GPT normally to complete the response. This could explain the disconnect here, where it can't erase what the visual model wrote, but that could all fall apart if it doesn't always follow this pattern. Just thinking out loud here.
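To make the guess concrete, here's a toy sketch of that two-stage pipeline. Everything here is hypothetical and purely illustrative (the function names and the canned strings are made up, not anything from an actual model):

```python
# Hypothetical two-stage pipeline: a vision model produces a caption,
# then a plain text-only language model completes a prompt built around it.

def caption_image(image_bytes: bytes) -> str:
    # Stand-in for an off-the-shelf visual understanding network.
    return "a photo of a bell on a wooden table"

def complete_text(prompt: str) -> str:
    # Stand-in for the text-only GPT completion step.
    return prompt + " It appears to be a small brass bell."

def answer_about_image(image_bytes: bytes, question: str) -> str:
    caption = caption_image(image_bytes)
    # The caption is baked into the prompt before the language model ever
    # runs, which would explain why the model can't "erase" what the
    # vision stage wrote.
    prompt = f"Image description: {caption}\nQuestion: {question}\nAnswer:"
    return complete_text(prompt)
```

If it really worked like this, the caption would always be upstream of the language model, so the text side could only build on it, never retract it.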

[–] vcmj 4 points 1 year ago* (last edited 1 year ago) (1 children)

Thanks for the detailed reply, I see that I did indeed misunderstand what he was saying. I'm an R&D engineer, so I guess my knee-jerk response to character-level mischief is exactly what you said: it can't see them anyway. I already knew that, so I dismissed that possible interpretation in my mind straight out of the gate. Maybe I should assume zero knowledge of AI internals when reading commentary in the wild.

Edit: Actually, I just thought of a good analogy for this. Say I play a sound and then ask you what it is. You might reply "it sounds like a bell", but if I asked for the exact composition of frequencies that made the sound, you might not be able to say. Similarly, the AI sees a group of letters as a definite "thing" (a token), but it doesn't know what actually went into it, because its "ears" (the tokenizer) already reduced it to a simpler signal.
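Here's a toy version of what I mean by the "ears". The vocabulary is made up and the matching rule is a crude stand-in for real BPE-style tokenizers, but it shows how the letters get collapsed into opaque IDs before the model sees anything:

```python
# Toy tokenizer with a tiny, made-up vocabulary. It greedily matches the
# longest known chunk, roughly like real BPE-style tokenizers do.
VOCAB = {"hel": 1, "lo": 2, "wor": 3, "ld": 4, "h": 5, "e": 6, "l": 7, "o": 8}

def tokenize(text: str) -> list[int]:
    tokens, i = [], 0
    while i < len(text):
        # Try the longest possible chunk first, then shrink.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(VOCAB[text[i:j]])
                i = j
                break
        else:
            i += 1  # skip characters the vocabulary doesn't know

    return tokens

print(tokenize("hello"))  # → [1, 2]
```

The model only ever receives `[1, 2]`; asking it how many l's are in "hello" is like asking you to name the frequencies in the bell sound.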

[–] vcmj 1 points 1 year ago (4 children)

?? Literally the entire purpose of the transformer architecture is to manipulate text, so how is it bad at that? Am I misunderstanding this? Summarization, thematic transformation, language translation, etc. are all things AI is fantastic at...
