this post was submitted on 30 May 2025
15 points (77.8% liked)

top 5 comments
[–] [email protected] 4 points 2 days ago

Awesome use of LLMs. I wonder why they didn't use FP8 quantization, though, especially since their target hardware was an L40S.

[–] onlinepersona 4 points 2 days ago (1 children)

What is a "kernel" in this context? It doesn't seem to be related to the OS kernel but some kind of graphics kernel? Whatever that is...

Anti Commercial-AI license

[–] bitfucker 5 points 2 days ago (1 children)

In the context of machine learning, it's usually a list of numbers arranged in a certain way and used in a mathematical operation. You can think of it as a transfer/transform "function" that takes data as input and spits out a representation of that data in some other form (one that we usually don't know until training is finished and we analyze the result).
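
To make the "list of numbers used as a transform" idea concrete, here is a minimal sketch that assumes the convolution sense of the word "kernel". The names `convolve_1d` and `edge_kernel` are made up for illustration and the kernel values are hand-picked, so treat it as one possible reading of the description above rather than what the linked post actually does.

```python
def convolve_1d(signal, kernel):
    """Slide `kernel` across `signal` and return the weighted sum at each position.

    No padding is applied, so the output is shorter than the input.
    The kernel is not flipped (strictly speaking this is cross-correlation,
    which is what ML libraries usually compute under the name "convolution").
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical, hand-picked kernel: a short list of fixed numbers.
# This one responds where neighbouring values change.
edge_kernel = [-1, 0, 1]

signal = [0, 0, 0, 5, 5, 5]
print(convolve_1d(signal, edge_kernel))  # [0, 5, 5, 0]
```

The input is a flat step from 0 to 5, and the output is nonzero only around the step, so the same data comes out represented as "where does it change" instead of "what is its value".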

[–] onlinepersona 1 points 1 day ago (1 children)

The weights for the neural network or the embeddings?


[–] bitfucker 3 points 1 day ago

No. Normally the kernel doesn't get updated in the network during training. Values like that are called hyperparameters: they do affect training, but they are not updated by the training algorithm.