this post was submitted on 23 Jul 2023
Programming
I think calling it just a database of likely responses is too much of a simplification and downplays what it's capable of.
I also don't really see why the way it works is relevant to whether it's "smart" or not. It depends on how you define "smart", but I don't see any proof of the assumptions people seem to make about the limits of what an LLM could be capable of (with a larger model, a better dataset, better training, etc.).
I'm definitely not saying I can tell what LLMs could be capable of, but I think saying "people think ChatGPT is smart but it actually isn't because <simplification of what an LLM is>" is missing a vital step to make it a valid logical argument.
The argument relies on people's incorrect intuition. Before seeing ChatGPT, I reckon if you'd told people how an LLM worked, they wouldn't have expected it to be able to do the things it can do (for example, if you ask it to write a rhyming poem about a niche subject, it won't have a comparable poem in its dataset to draw on).
A better argument would be to pick something that LLMs currently can't do but should be able to do if they were "smart", and explain the inherent limitation of an LLM that prevents it from doing that. I haven't really seen this done, I guess because it's not easy. The closest I've seen is an explanation of why LLMs are bad at e.g. maths (like adding large numbers), but I've still not seen anything to convince me that this is an inherent limitation of LLMs.
Agreed, smartness is about what it can do, not how it works. As an analogy: if a chess bot could explore the entire game tree hundreds of moves ahead, it would be pretty damn smart (easily the best in the world, probably strong enough to solve chess), despite being nothing more than dumb minimax plus absurd amounts of computing power.
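To be clear about how "dumb" the mechanism is: here's a toy minimax sketch (hypothetical example on a tiny hand-made game tree, nothing close to a real chess engine). The whole algorithm is just "assume both players pick their best option and search everything":

```python
# Minimal minimax sketch (toy example, not a real chess engine).
# Leaves are numeric scores; internal nodes are lists of child subtrees.
def minimax(node, maximizing):
    if isinstance(node, (int, float)):
        return node  # leaf: the position's score
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# Two moves for us, two replies each; opponent minimizes our score.
tree = [[3, 5], [2, 9]]
print(minimax(tree, True))  # -> 3
```

There's no cleverness anywhere in that code; with unbounded compute it would still play perfectly, which is the point of the analogy.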
The fact that ChatGPT works by predicting the most likely next word isn't relevant to its smartness, except insofar as that mechanism limits its outputs. And predicting the most likely next word has proven far less limiting than I expected, so even though I can think of lots of reasons why it will never scale to true intelligence, how could I be confident those are real limits and not just me being mistaken yet again?
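For what "predicting the most likely next word" means mechanically, here's a deliberately tiny sketch: a bigram count table (made-up data) standing in for the learned distribution, with greedy decoding on top. Real LLMs learn the distribution with a neural network over long contexts, but the output step is the same idea:

```python
# Toy "most likely next word" sketch (hypothetical data, not a real LLM).
# A bigram count table stands in for the model's learned distribution.
counts = {
    "the": {"cat": 3, "dog": 1},
    "cat": {"sat": 2, "ran": 1},
}

def most_likely_next(word):
    # Greedy decoding: return the highest-count continuation, if any.
    followers = counts.get(word)
    return max(followers, key=followers.get) if followers else None

print(most_likely_next("the"))  # -> cat
```

The simplicity of that output step is exactly why intuition says it shouldn't get you far, and exactly why it's surprising how far it has gotten.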