this post was submitted on 24 Jun 2023
Programming
My favorite test for ChatGPT is to ask it to write a function to divide two numbers in 6502 assembly. Not only is there no DIV instruction to rely on, but the 6502 is very register starved, so you get a lot of screwups. Here's one example of what ChatGPT generated:
You can see it immediately overwrites the divisor with the quotient, so this thing will always give a divide by zero error. But even if it didn't do that, CMP X,A is an invalid instruction. But even if that wasn't invalid, multiplying the dividend by two (and adding one) is nonsense.
Honestly I still don't get it. Every dialog with ChatGPT where I tried to do something meaningful has ended with ChatGPT hallucinations. It answers general questions, but it makes something up every time. I ask for a list of command-line renderers, and it returns a list that includes a few renderers with no CLI. I ask about a library that does something, and it returns 5 libraries, one of which definitely can't do it. And so on. ChatGPT is good at trivial tasks, but I don't need help with trivial tasks, I can do those myself... Sorry for the rant.
That’s what (most) people don’t understand. It’s a language model. It’s not an expert system and it’s not a magical know-it-all oracle. It’s supposed to give you an answer like a random human would. But people trust it much more than they would trust a random stranger, because “it is an AI”…
No, you aren't the only one. I've prompted ChatGPT before for SFML library commands, and every time it's given me commands that either don't work anymore or never existed.
That's because ChatGPT and other LLMs are not oracles. They don't take into account whether the text they generate is factually correct, because that's not the task they're trained for. They're only trained to generate the next statistically most likely word, then the next word, and then the next one...
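To make the "next most likely word" idea concrete, here is a toy sketch in Python. It is not how ChatGPT works internally (real LLMs use neural networks over tokens and usually sample rather than always taking the top choice); it is a bigram counter, and the corpus and the function names `train_bigrams` and `generate` are made up for illustration. Notice that nothing in it checks whether the output is true.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict[str, Counter]:
    """Count, for each word, which words follow it in the training text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts: dict[str, Counter], start: str, length: int) -> str:
    """Greedily append the most frequent follower of the last word."""
    words = [start]
    for _ in range(length - 1):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word never had a follower in the training text
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the parrot repeats the phrase and the parrot repeats the lesson"
model = train_bigrams(corpus)
print(generate(model, "the", 4))  # prints "the parrot repeats the"
```

The output is fluent-looking because it follows the statistics of the corpus, not because anything was verified.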
You can take a parrot to a math class, have it listen to lessons for a few months and then you can "have a conversation" about math with it. The parrot won't have a deep (or any) understanding of math, but it will gladly replicate phrases it has heard. Many of those phrases could be mathematical facts, but just because the parrot can recite the phrases, doesn't mean it understands their meaning, or that it could even count 3+3.
LLMs are the same. They're excellent at reciting known phrases, even combining popular phrases into novel ones, but even then the model lacks any understanding behind the words and sentences it produces.
If you give an LLM a task in which your objective is to receive factually correct information, you might as well be asking a parrot - the answer may well be factually correct, but it just as well might be a hallucination. In both cases the responsibility of fact checking falls 100% on your shoulders.
So even though LLMs aren't good for information retrieval, they're exceptionally good at text generation. The ideal use-cases for LLMs thus lie in the domain of text generation, not information retrieval or facts. If you recognize and understand this, you're all set to use ChatGPT effectively, because you know what kind of questions it's good for, and with what kind of questions it's absolutely useless.
I've only ever done X86 Assembly. But oh lord that does not look like it can really do much. Yet still somehow has like 20 lines.
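For comparison, here is a rough sketch of the shift-and-subtract (restoring) division that a hand-written 8-bit routine on a CPU without a divide instruction would typically implement. It is written in Python rather than 6502 assembly so it can be run and checked; the function name `divide_u8` is made up, and the mapping to 6502 instructions in the comments is approximate, not a line-by-line translation.

```python
def divide_u8(dividend: int, divisor: int) -> tuple[int, int]:
    """Unsigned 8-bit restoring division; returns (quotient, remainder)."""
    if not (0 <= dividend <= 0xFF and 0 <= divisor <= 0xFF):
        raise ValueError("operands must fit in 8 bits")
    if divisor == 0:
        raise ZeroDivisionError("division by zero")

    quotient = 0
    remainder = 0
    for _ in range(8):  # one pass per bit of the dividend
        # Shift the dividend's top bit into the remainder
        # (roughly ASL on the dividend, then ROL on the remainder).
        # The remainder can briefly need a 9th bit here; on real
        # hardware that bit would live in the carry flag.
        remainder = (remainder << 1) | (dividend >> 7)
        dividend = (dividend << 1) & 0xFF
        quotient = (quotient << 1) & 0xFF
        # Trial subtraction (roughly CMP, then SBC): keep the result
        # only if the divisor fits into the running remainder.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divide_u8(200, 7))  # prints (28, 4)
```

The divisor is never modified, and the loop always runs exactly eight times, which is why even a correct routine ends up at roughly 15 to 20 instructions on such a register-starved CPU.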