this post was submitted on 05 Aug 2023

14 points (85.0% liked)

Selfhosted

Looking for beginner friendly free or cheaper solutions for a chatbot (lemmy.world)

submitted 2 years ago by [email protected] to c/[email protected]

Hello all, you all seem to be well versed in this stuff and I can't seem to find many AI or chatbot communities at all. Anyway, I've just been using the free, outdated GPT-3.5 on the OpenAI site and it really opened my eyes to how useful a tool it is. I mean it helps me with everything, especially computer-related stuff and troubleshooting... But of course it has its flaws. It's limited in capabilities, has a knowledge cutoff of 2021, and I'm assuming it's highly restrictive and censored as well.

I would love a solution, probably GPT-4, but I entered the Hugging Face rabbit hole and now I'm completely lost lmao. What are some good options for someone who wants basically the same functions as the free ChatGPT but up to date, smarter, and just better? I'm willing to put in some work to build something, if it's not too tricky. I get lost easily at this technical level. Otherwise an out-of-the-box option is preferred. I have no clue about models other than GPT-4, and I guess that's what I'm looking for, or something better, for either free or a lower cost than OpenAI's $20 a month. And preferably something more private and less restrictive for the NSFW and darker questions lol

I hope this ramble makes some sense at least. I just opened my eyes to this field and it's pretty deep. As of now, I'm after something newb-friendly and user-friendly for someone with some OK tech/computing knowledge. Thanks

[–] [email protected] 14 points 2 years ago (1 children)

GPT4ALL sounds like your best bet. It's one of the easiest solutions to set up at the moment. As best as I can tell, none of the open and local options are GPT-4 level yet. That said, there are lots of models to play with and they seem to be getting better very quickly. GPT4ALL makes it pretty simple by specifically linking the models that work and helping you download them.

The one downside is that I don't believe they have implemented GPU models yet. That means things are easy to set up, but it's going to be a slow experience, especially if you don't have a really beefy CPU and lots of RAM.

[–] [email protected] 2 points 2 years ago (1 children)

Ahh darn that probably won't work then, but I'll still give it a look. Thanks

[–] [email protected] 10 points 2 years ago (1 children)

Unfortunately there isn't any free/open source LLM that's as good as ChatGPT. The ones that are here, like Llama 2 and Vicuna 1.5, are decent, but they are pretty dumb, love spouting random incorrect info, and are really unreliable. The biggest problem isn't even that, it's that hosting one yourself requires a TON of resources. The higher-end models (still not as good as ChatGPT) can need over 20 GB of VRAM. For reference, an RTX 4090 has 24 GB of VRAM.
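To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameter count times bits per weight) and ignores context cache and runtime overhead, so real usage will be somewhat higher. The function name and the chosen model sizes are just for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same precision and nothing
    else (context cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Illustrative sizes; 16-bit is a common unquantized precision,
    # 4-bit is a nominal figure for a heavily quantized file.
    for size in (7, 13, 70):
        fp16 = weight_memory_gb(size, 16)
        q4 = weight_memory_gb(size, 4)
        print(f"{size}B: ~{fp16:.1f} GB at 16-bit, ~{q4:.1f} GB at 4-bit")
```

By this estimate a 13B model at 16-bit needs about 26 GB for the weights alone, which is already more than a 24 GB card holds, while a 4-bit version of the same model is closer to 6.5 GB.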

You might have better luck trying alternatives to ChatGPT. I personally use Claude, it's less censored and has knowledge up to 2023.

[–] [email protected] 1 points 2 years ago (1 children)

Claude sounds perfect then. I came across the model but don't know anything about it. This is all brand new to me. Is Claude the same question-and-answer/general/all-in-one type of chat as GPT? And is it its own platform, or would it take some coding and developing and APIs and all that confusing stuff lol

[–] [email protected] 5 points 2 years ago (2 children)

Yup, Claude is general-purpose and has its own website: https://claude.ai/

Unlike ChatGPT it also doesn't need a phone number.

[–] [email protected] 3 points 2 years ago

I’ve been using https://perplexity.ai which requires nothing other than a prompt.

[–] [email protected] 1 points 2 years ago (1 children)

Very awesome. I haven't dove in yet, but is it its own model or does it use gpt4 or something else?

[–] [email protected] 2 points 2 years ago

It's its own model created by the company itself.

[–] [email protected] 4 points 2 years ago* (last edited 2 years ago) (1 children)

Originally posted this to beehaw on another account:

Oobabooga is the main GUI used to interact with models.

https://github.com/oobabooga/text-generation-webui

FYI, you need to find checkpoint models. In the available chat models space, naming can be ambiguous for a few reasons I'm not going to ramble about here. The main source of models is Hugging Face. Start with this model (or get the censored version):

https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GGML

First, let's break down the title.

This is a model based on Meta's Llama 2.
This is not "FOSS" in the GPL/MIT type of context. This model has a license that is quite broad in scope, with the key point stipulating that it cannot be used commercially for apps that have more than 700 million monthly users without a separate license from Meta.
Next, it was quantized by a popular user going by "The Bloke." I have no idea who this is IRL, but I imagine this is a pseudonym or corporate alias given how much content is uploaded by this account on HF.
This model has 7 billion parameters and is fine-tuned for chat applications.
This is uncensored, meaning it will respond to most inputs as best it can. It can get NSFW, or talk about almost anything. In practice there are still some minor biases that are likely just overarching morality inherent to the datasets used, or it might be coded somewhere obscure.
The last part of the title is that this is a GGML model. This means it can run on CPU or GPU or a split between the two.
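The breakdown above can be sketched as a small parser. This is only an illustration of how this particular title reads. It assumes the "uploader/name-FORMAT" layout with underscore-separated words, which is a naming habit rather than any rule enforced by Hugging Face, and the function name is made up for the example.

```python
def split_repo_id(repo_id: str) -> dict:
    """Split a title like 'uploader/base_7b_tags-FORMAT' into its parts.

    Assumption: the name follows the informal convention used in the linked
    repo. Many models on Hugging Face are named differently.
    """
    uploader, _, name = repo_id.partition("/")
    model, _, file_format = name.rpartition("-")
    words = model.split("_")
    # The size is the word that looks like '<digits>b', e.g. '7b' or '13b'.
    size = next(
        (w for w in words if w[:-1].isdigit() and w[-1].lower() == "b"), None
    )
    tags = [w for w in words[1:] if w != size]
    return {
        "uploader": uploader,
        "base": words[0],
        "size": size,
        "tags": tags,
        "format": file_format,
    }


if __name__ == "__main__":
    print(split_repo_id("TheBloke/llama2_7b_chat_uncensored-GGML"))
```

For the linked model this gives the uploader (TheBloke), the base model (llama2), the size (7b), the fine-tune tags (chat, uncensored), and the file format (GGML).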

As for options on the landing page or "model card":

You need to get one of the older-style models that have "q(numb)" as the quantization type. Do not get the ones that say "qK", as these won't work with the llama.cpp file you will get with Oobabooga.
Look at the guide at the bottom of the model card, where it tells you how much RAM you need for each quantization type. If you have an Nvidia GPU with the CUDA API, enabling GPU layers makes the model run faster, and with quite a bit less system memory than what is stated on the model card.
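If the file list on the model card is long, the "older style only" rule above can be sketched as a filename filter. The filenames here are made-up placeholders that mimic the usual pattern, where the older types end in something like q4_0 or q5_1 and the newer ones contain a K (q4_K_M, q6_K). Check the real names on the model card before relying on this.

```python
import re

# Matches the older-style quantization suffixes such as .q4_0.bin, .q5_1.bin
# or .q8_0.bin, and deliberately not the K variants such as .q4_K_M.bin.
LEGACY_QUANT = re.compile(r"\.q\d_[01]\.bin$", re.IGNORECASE)


def is_legacy_quant(filename: str) -> bool:
    """Return True for filenames that end in an older-style q<number> suffix."""
    return LEGACY_QUANT.search(filename) is not None


if __name__ == "__main__":
    # Hypothetical filenames for illustration only.
    example_files = [
        "model.ggmlv3.q4_0.bin",
        "model.ggmlv3.q4_K_M.bin",
        "model.ggmlv3.q5_1.bin",
        "model.ggmlv3.q6_K.bin",
        "model.ggmlv3.q8_0.bin",
    ]
    usable = [name for name in example_files if is_legacy_quant(name)]
    print(usable)
```

From the files that pass the filter, pick the one whose RAM figure on the model card fits your machine.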

The 7B models are about like having a conversation with your average teenager. Asking technical questions yielded around 50% accuracy in my experience. A 13B model got around 80% accuracy. The 30B WizardLM is around 90-95%. I'm still working on trying to get a 70B running on my computer. A lot of the larger models require compiling tools from source. They won't work directly with Oobabooga.

[–] [email protected] 2 points 2 years ago (1 children)

Wow man, that's all very interesting. I do not think I can figure it out though. It's more complex than I thought it'd be.

[–] [email protected] 1 points 2 years ago

There may be other out of the box type solutions. This setup really isn't bad. You can find info on places like YT that are step by step for Windows.

If you are at all interested in learning about software and how to get started using a command line, this would be a good place to start.

Oobabooga is well configured to make installation easy. It just involves a few commands that are unlikely to have catastrophic errors. All of the steps required are detailed in the README.md file. You don't actually need to know or understand everything I described in the last message. I described why the model is named like x/y/z if you care to understand. This just explained details I learned by making lots of mistakes. The key here is that I linked to the model you need specifically and tried to explain how to choose the right file from the linked model. If you still don't understand, feel free to ask. Most people here remember what it was like to learn.

[–] [email protected] 2 points 2 years ago

I found pi.ai more reliable than GPT. It seems to have data about recent events.

[–] [email protected] 2 points 2 years ago* (last edited 2 years ago) (1 children)

I’ve come across a FOSAI (free open source AI) community a few times - give it a search, there are several getting started type posts.

Non lemmy link

https://lemmy.world/c/[email protected]

[–] [email protected] 1 points 2 years ago

Nice, thanks a lot