I love the idea, I much prefer it to the mainstream. The problem is, the typical process of documenting FOSS and self-hosted projects (websites, wikis, mailing lists, etc.) moves too slowly and is too cumbersome for how quickly things are developing right now. So people are kind of having to invent the new tech and new ways to communicate about it, and they're not always making choices that either scale or are easy to find and reference.
Okay, since you seem to be so helpful here, I'll lay out where I'm at. I've been using LLMs like ChatGPT, Copilot, and Bard more professionally. I find them equal parts useful, confusing, annoying, and skeevy. I've got a lil VPS I run for services, and I could put a front end on there easily. I've also got an old 8-core Xeon machine with like 48GB RAM and a leftover AMD R9 270 sitting there with Unraid barely installed. I can change the OS of course, but what am I realistically looking at being able to run locally that won't go above like 60-75% usage, so I can still eventually get a couple of game servers, network storage, and Jellyfin working? I'll be honest, I don't care about image generation much, but if I do I can always look into upgrading.
Honestly, not much. Llama 8B, but very slowly, or maybe DeepSeek V2 Chat, with prompt processing on the 270 via Vulkan but mostly running on CPU. And I guess just limit it to 6 threads? I'd host it with kobold.cpp (Vulkan), or maybe the llama.cpp server if there will be multiple users.
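For what it's worth, here's roughly where the "6 threads" number comes from, as a quick back-of-envelope sketch. The 4.5 bits per weight figure is just my assumption for a typical 4-bit-ish quant, `model.gguf` is a placeholder path, and I'm assuming the llama.cpp server binary is called `llama-server` in your build (check `--help` for the exact flags on your version).

```python
import math


def thread_budget(cores: int, max_usage: float) -> int:
    """Largest thread count that keeps CPU usage at or below max_usage."""
    return max(1, math.floor(cores * max_usage))


def approx_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone in GB (ignores context cache and overhead)."""
    return params_billion * bits_per_weight / 8


def build_command(model_path: str, threads: int) -> list[str]:
    """Assemble a server command line; binary name and path are assumptions."""
    return ["llama-server", "-m", model_path, "--threads", str(threads)]


if __name__ == "__main__":
    # 8 cores capped at 75% usage leaves 2 cores free for other services.
    threads = thread_budget(cores=8, max_usage=0.75)
    # An 8B model at an assumed ~4.5 bits per weight.
    weights = approx_weights_gb(params_billion=8, bits_per_weight=4.5)
    print(f"threads: {threads}")
    print(f"approx weights: {weights:.1f} GB")
    print(" ".join(build_command("model.gguf", threads)))
```

So RAM isn't the problem on a 48GB box, the weights fit many times over. It's the CPU and that old GPU that make it slow.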
You can try them to see if they feel OK, but LLMs just don't like old hardware. An RTX 3060 (or a Mac, or a 12GB+ AMD GPU) is considered the bare minimum in the community, and a 3090 or 7900 XTX is standard.