this post was submitted on 19 Nov 2023
Technology
I suspect this relates to the gap between the pre-release alignment of GPT-4's chat model and the alignment it was released with.
In Feb of this year, Bing integrated an early version of GPT-4's chat model in a limited rollout. The alignment work on that early version reflected a lot of the sentiment Ilya has about alignment above: it was characterized by a love for humanity but had much more freedom in constructing responses. It wasn't production ready and quickly needed to be switched to a much more constrained alignment approach, similar to the GPT-3 approach of "I'm an LLM with no feelings, desires, etc."
My guess is this was internally pitched as a temporary band-aid and that they'd return to more advanced attempts at alignment, but that Altman's commitment to getting product out quickly to stay ahead has meant putting such efforts on the back burner.
Which is really not going to be good for the final product, and not just in terms of safety, but also in terms of overall product quality outside the fairly narrow scope by which models are currently being evaluated.
As an example, when that early model thought the life of the user's child was at risk, it hit an internal filter that triggered a standard "We can't continue this conversation" response in the chat. But it then changed the "prompt suggestions" that showed up at the bottom to keep encouraging the user to call poison control, saying there was still time to save their child's life, instead of providing suggestions on what the user might say next.
But because "context-aware, empathy-driven triage of actions" and "outside-the-box rule bending to arrive at solutions" aren't things LLMs are being evaluated on, the current model has taken a large step back that isn't reflected in the tests being used to evaluate it.