this post was submitted on 13 Aug 2023
Technology
First I'd like to be a little pedantic and say LLMs are not chatbots. ChatGPT is a chatbot - LLMs are language models which can be used to build chatbots. They are models (like a physics model) of language, describing the causal joint probability distribution of language. ChatGPT only acts like an agent because OpenAI spent a lot of time retraining a foundation model (which has no such agent-like behavior) to model "language" as expressed by an individual. Then they put it into a chatbot "cognitive architecture" which feeds it a truncated chat log. This is why the smaller models, when improperly constrained, may start typing as if they were you - they have no inherent distinction between the chatbot and yourself. LLMs are a lot more like Broca's area than a person or even a chatbot.
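To make the "truncated chat log" point concrete, here is a minimal sketch of that kind of wrapper. It is not how ChatGPT is actually built; `build_prompt`, `reply`, the speaker labels, and the character budget are all made up for illustration, and `complete` stands in for whatever text-completion backend you have. The model only ever continues a transcript, so the wrapper has to cut the output where it starts writing the user's next turn.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text); hypothetical transcript format


def build_prompt(log: List[Turn], bot: str = "Assistant", max_chars: int = 2000) -> str:
    """Render the newest turns that fit in max_chars, then cue the bot's turn."""
    cue = f"{bot}:"
    budget = max_chars - len(cue)
    kept: List[str] = []
    # Walk backwards so the most recent turns survive truncation.
    for speaker, text in reversed(log):
        line = f"{speaker}: {text}\n"
        if len(line) > budget:
            break
        kept.append(line)
        budget -= len(line)
    return "".join(reversed(kept)) + cue


def reply(
    complete: Callable[[str], str],
    log: List[Turn],
    user: str = "User",
    bot: str = "Assistant",
    max_chars: int = 2000,
) -> str:
    """Ask the backend to continue the transcript and keep only the bot's turn."""
    raw = complete(build_prompt(log, bot, max_chars))
    # Without this cut, the model happily keeps going and "types as you".
    stop = f"\n{user}:"
    return raw.split(stop, 1)[0].strip()
```

Real chat frontends usually pass the stop marker to the backend as a stop sequence instead of trimming afterwards, but the idea is the same: the distinction between "you" and "the bot" lives in the wrapper, not in the model.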
When I say they're "general purpose", this is more or less an emergent feature of language, which encodes some abstract sense of problem solving and tool use. Take the library I wrote to create "semantic functions" from natural language tasks - one of the examples I keep going to in order to demonstrate the usefulness is
A year ago, this would've been literally impossible. I could approximate it with thousands of lines of code using spaCy and other NLP libraries to do NER, maybe a massive dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. With LLMs, I just tell the AI to do it and it... does. Just like that. I can ask it to do anything and it will, within reason and with proper constraints.
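For a rough idea of what a "semantic function" can look like, here is a minimal sketch. It is not the library mentioned above and does not reproduce its API; the `semantic` decorator, the prompt wording, and the name-extraction task are assumptions chosen to match the NER comparison, and `fake_llm` is a stand-in so the example runs offline.

```python
import inspect
import json
from typing import Callable, List


def semantic(complete: Callable[[str], str]):
    """Turn a stub function's docstring into a prompt for an LLM backend."""

    def decorator(fn):
        task = inspect.getdoc(fn) or fn.__name__

        def wrapper(text: str) -> List[str]:
            prompt = (
                f"Task: {task}\n"
                "Answer with a JSON array of strings and nothing else.\n"
                f"Input: {text}\nOutput:"
            )
            result = json.loads(complete(prompt))
            # "Proper constraints": reject anything that is not a list of strings.
            if not isinstance(result, list) or not all(isinstance(x, str) for x in result):
                raise ValueError(f"unexpected model output: {result!r}")
            return result

        return wrapper

    return decorator


def fake_llm(prompt: str) -> str:
    # Stand-in backend with a canned answer; a real one would call a model.
    return '["Ada Lovelace", "Alan Turing"]'


@semantic(fake_llm)
def extract_names(text):
    """List the full names of people mentioned in the input."""
```

Calling `extract_names("Ada Lovelace wrote to Alan Turing from London.")` goes through the prompt template and the validation step; swapping `fake_llm` for a real completion call is the only change needed to use an actual model. The natural-language docstring is the whole "implementation", and the validation is what keeps it usable from ordinary code.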
GPT-3 was the first generation of this technology and it was already miraculous for someone like me who's been following the AI field for 10+ years. If you try GPT-4, it's subjectively at least 10x more intelligent than ChatGPT/GPT-3.5. It costs $20/mo, but it's also been irreplaceable for me for a wide variety of tasks - Linux troubleshooting, bash commands, ducking coding, random questions too complex to Google, "what was that thing called again", sensitivity reader, interactively exploring options to achieve a task (e.g. note-taking, SMTP, self-hosting, SSI/clustered computing), teaching me the basics of a topic so I can do further research, etc. I essentially use it as an extra brain lobe that knows everything as long as I remind it about what it knows.
While LLMs are not people, or even "agents", they are "inference engines" which can serve as building blocks to construct an "artificial person" or some gradation thereof. In the near future, I'm going to experiment with creating a cognitive architecture to start approaching it - long-term memory, associative memory, internal thoughts, dossier curation, tool use via endpoints, etc., so that eventually I have what Alexa should've been, hosted locally. That possibility is probably what techbros are freaking out about; they're just uninformed about the technology and think GPT-4 is already that, or that GPT-5 will be (it won't). But please don't buy into the anti-hype. It robs you of the opportunity to explore the technology and could blindside you when it becomes more pervasive.
What would AI have to do to qualify as "capable of some interesting new kind of NLP or can create something entirely new"? From where I stand, that's exactly what generative AI is? And if it isn't, I'm not sure what even could qualify unless you used necromancy to put a ghost in a machine...