Technology

Chinese algorithm claimed to boost Nvidia GPU performance by up to 800X for advanced science applications (www.tomshardware.com)

submitted 2 months ago by [email protected] to c/[email protected]

"Scarcity is the mother of invention"… looks like optimisation is back on the menu, boys. It's something the Americans have forgotten, whether for games, software, apps, etc.

[–] [email protected] 60 points 2 months ago (1 children)

Per the article, they just developed a faster algorithm for a specific type of material simulation.

[–] Isoprenoid 35 points 2 months ago (2 children)

Per the article, they ~~just~~ developed a faster algorithm for a specific type of material simulation.

You're underplaying it.

As per the article:

This has broad implications for industries that require detailed material analysis, including:

Aerospace and Defense: Improved modeling of material stress and failure in aircraft structures.

Engineering and Manufacturing: More efficient testing of materials for construction and industrial applications.

Military Research: Faster development of impact-resistant materials for defense systems.

[–] [email protected] 7 points 2 months ago* (last edited 2 months ago) (2 children)

Yes, exactly… For another example, the DeepSeek team developed their own replacement to CUDA with PTX (Parallel Thread Execution), a lower-level, assembly-like language that allows for more granular optimisation of GPU performance, offering a 10X efficiency improvement, just as GPU sanctions were levied on China. This innovative approach not only challenges the dominance of CUDA in the AI landscape but also opens up new possibilities for optimising GPU performance in various applications. This is what is missing not only from those relying on Nvidia but also from its competitors, whether AMD or Apple, which prefers to have its own proprietary solutions.

[–] [email protected] 17 points 2 months ago (1 children)

Nvidia developed PTX; DeepSeek leveraged it to do some load-balancing work they couldn't do in CUDA. They still also use CUDA.

[–] [email protected] 1 points 2 months ago

DeepSeek team developed their own replacement to CUDA called PTX (Parallel Thread Execution) a lower-level assembly-like language that allows for more granular optimisations of GPU performance offering 10X efficiency improvement recently as GPU sanctions were levied on China

No, they didn't. NVIDIA created PTX.

[–] [email protected] 1 points 2 months ago

I write HPC code and this really isn't that crazy. It's 800x versus a serial program run on a CPU; it is only 100x versus an OpenMP (parallel CPU) version.

While the gains are very impressive and show clear deep understanding of HPC programming, it is really nothing that out of the ordinary.

With proper memory and algorithm optimisations, going from a single thread to a dedicated GPU model, a 100-1000x speedup is pretty normal.
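
To put the two baselines side by side, here is a rough sketch of the arithmetic, assuming the 800x and 100x figures quoted above refer to the same GPU run on the same problem size (the function name is just illustrative). Under that assumption, the ratio of the two figures gives the speedup the OpenMP version already had over the serial one.

```python
# Speedup figures as quoted in this thread, taken at face value.
gpu_vs_serial = 800.0  # GPU version vs. single-threaded CPU version
gpu_vs_openmp = 100.0  # GPU version vs. OpenMP (parallel CPU) version


def implied_speedup(gpu_vs_a: float, gpu_vs_b: float) -> float:
    """Speedup of baseline B over baseline A, given the GPU's speedup over each.

    Assumption: both figures come from the same GPU run and problem size,
    so T_a / T_b = (T_a / T_gpu) / (T_b / T_gpu).
    """
    return gpu_vs_a / gpu_vs_b


openmp_vs_serial = implied_speedup(gpu_vs_serial, gpu_vs_openmp)
print(f"OpenMP vs serial: {openmp_vs_serial:.0f}x")
```

So most of the headline number depends on which baseline you pick: against the parallel CPU version, the GPU gain is the 100x figure, not 800x.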