
People are speaking with ChatGPT for hours, bringing 2013’s Her closer to reality (arstechnica.com)

submitted 1 year ago by [email protected] to c/[email protected]


[–] [email protected] 1 points 1 year ago (1 children)

Stupid newbie question here, but when you go to a HuggingFace LLM and you see a big list like this, what on earth do all these variants mean?

psymedrp-v1-20b.Q2_K.gguf 8.31 GB

psymedrp-v1-20b.Q3_K_M.gguf 9.7 GB

psymedrp-v1-20b.Q3_K_S.gguf 8.66 GB

etc...

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

That's called "quantization". I'd do some searching on that for a better description, but in summary: the bigger the model, the more resources it needs to run and the slower it will be. Weights are normally stored as 16-bit floats, and Q8 stores them in roughly 8 bits each, but it turns out you still get really good results if you drop even more of those bits. The more you drop, the worse it gets.

People have generally found that it's better to run a larger model (more parameters) at a lower quantization than a smaller model at the full 8 bits.

E.g. 13b Q4 > 7b Q8

Going below Q4 is generally found to degrade the quality too much, so it's better to run a 7b Q8 than a 13b Q3, but you can play with that yourself to find what you prefer. I stick to Q4/Q5.
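To put rough numbers on that trade-off, here's a back-of-the-envelope sketch. It assumes the number after Q is the exact bits per weight and that "13b" means exactly 13 billion parameters; real GGUF files come out somewhat larger because of metadata and mixed-precision tensors. The helper name `approx_size_gb` is made up for illustration.

```python
def approx_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight storage in GB (10^9 bytes).

    Assumption: every weight uses exactly `bits_per_weight` bits.
    Real quantized files add scales and metadata on top of this.
    """
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Compare the combinations mentioned above.
for name, params, bits in [("13b Q4", 13, 4), ("7b Q8", 7, 8), ("13b Q3", 13, 3)]:
    print(f"{name}: ~{approx_size_gb(params, bits):.1f} GB")
```

By this estimate a 13b Q4 (about 6.5 GB) and a 7b Q8 (about 7 GB) need a similar amount of memory, which is why the comparison between them comes up so often.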

So you can just look at those file sizes to get a sense of which one has the most data in it. The M (medium) and S (small) suffixes are some sort of variation on the same quantization level. I don't know exactly what they're doing there, other than that the bigger file is the better one.
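If you want to turn the file sizes from the list in the question into something comparable, you can work backwards to an average number of bits per weight. This is only a sketch: it assumes "20b" in the file name means exactly 20 billion parameters and that the listed GB are 10^9 bytes, neither of which I've checked for this model. The names `PARAMS` and `bits_per_weight` are just for illustration.

```python
# File sizes in GB, copied from the listing in the question.
files = {
    "Q2_K": 8.31,
    "Q3_K_S": 8.66,
    "Q3_K_M": 9.7,
}

# Assumption: "20b" in the file name means exactly 20 billion parameters.
PARAMS = 20e9


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Average bits stored per parameter, treating 1 GB as 10^9 bytes."""
    return size_gb * 1e9 * 8 / params


# Print the variants from smallest to largest file.
for tag, size in sorted(files.items(), key=lambda kv: kv[1]):
    print(f"{tag}: ~{bits_per_weight(size):.2f} bits/weight")
```

Under those assumptions all three land between roughly 3.3 and 3.9 bits per weight, so the number in the tag is more of a label than an exact bit count.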

[–] [email protected] 1 points 1 year ago

Thank you!!