Can anybody explain why CUDA and ROCm are necessary and why OpenCL isn't the solution? (self.programming)

submitted 1 month ago by onlinepersona to c/programming


I've read multiple times that CUDA dominates, mostly because NVIDIA dominates. ROCm is the AMD equivalent, but OpenCL also exists. From my understanding, these are technologies used to program graphics cards, but I always thought that shaders were used for that.

There is a huge gap in my knowledge and understanding about this, so I'd appreciate somebody laying this out for me. I could ask an LLM and be misguided, but I'd rather not 🤣

Anti Commercial-AI license


[–] [email protected] 3 points 1 month ago

Because CUDA and ROCm/HIP are far easier to program.

The Khronos competitor to CUDA/ROCm is SYCL, not OpenCL.

SYCL vs. these other options is a fun theoretical problem, but only Intel seems to be pushing SYCL at all. OpenCL got stuck at 1.2: the 2.0 release was effectively dead, and OpenCL 3.0 walks it back by making the 2.0 features optional, but it's too late. OpenCL is seen as dead-end tech these days.

The biggest issue is that OpenCL kernels are written in a different language (OpenCL C), while CUDA/HIP/SYCL are 'just' C++ extensions. This means that if you ever share data between CPU and GPU in OpenCL (or DirectX or Vulkan, for that matter), you have to carefully write and rewrite your structs so their layouts line up between the two languages.
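To make the struct-matching problem concrete, here is a small host-side sketch in plain C++. The `Particle` structs are made up for illustration and not taken from any real codebase. It relies on one rule from the OpenCL C spec: a `float3` has the size and alignment of a `float4` (16 bytes), so a naive host struct with three packed floats puts the next field at the wrong offset.

```cpp
#include <cstddef>

// Naive host-side struct: three floats packed back to back (12 bytes),
// so `mass` lands at offset 12.
struct ParticleHostNaive {
    float pos[3];
    float mass;
};

// Host-side mirror of a hypothetical OpenCL C struct:
//   typedef struct { float3 pos; float mass; } Particle;
// OpenCL C gives float3 the size and alignment of float4 (16 bytes),
// so the host copy has to add the padding by hand to match.
struct ParticleHostMatched {
    alignas(16) float pos[4];  // pos[3] is padding for the device-side float3
    float mass;                // offset 16, as on the device
};

// The naive layout disagrees with the device layout...
static_assert(offsetof(ParticleHostNaive, mass) == 12,
              "naive struct packs mass right after three floats");
// ...while the hand-padded one lines up.
static_assert(offsetof(ParticleHostMatched, mass) == 16,
              "mass must sit after a 16-byte float3 slot");
static_assert(sizeof(ParticleHostMatched) == 32,
              "struct size is rounded up to its 16-byte alignment");

// Repack a naive host particle into the device-compatible layout.
inline ParticleHostMatched to_device_layout(const ParticleHostNaive& p) {
    ParticleHostMatched out{};
    for (int i = 0; i < 3; ++i) out.pos[i] = p.pos[i];
    out.mass = p.mass;
    return out;
}
```

With CUDA/HIP/SYCL the same struct definition is compiled for both sides, so this kind of manual mirroring mostly goes away.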

Meanwhile, CUDA/HIP support passing structs, classes, and more between CPU and GPU (subject to conditions, of course: you can't pass function pointers or objects with vtables across, for example, since those pointers are only valid on the side that created them, but CPU-only classes can have vtables).
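The "subject to conditions" part can be approximated with a host-side check in standard C++. This is only a conservative sketch, not CUDA's or HIP's actual rule set: `looks_device_copyable`, `Vec3`, and `Shape` are hypothetical names, and the assumption is that you want to restrict yourself to types that can be copied byte-for-byte and carry no vtable pointer.

```cpp
#include <type_traits>

// Plain data: safe to copy byte-for-byte to another address space.
struct Vec3 {
    float x, y, z;
};

// Polymorphic class: each object carries a vtable pointer that is only
// meaningful on the side (CPU or GPU) that constructed it.
struct Shape {
    virtual ~Shape() = default;
    virtual float area() const { return 0.0f; }
    float scale = 1.0f;
};

// Hypothetical helper: a conservative guess at "okay to hand to a kernel".
// It rejects anything with virtual functions or non-trivial copying.
template <typename T>
constexpr bool looks_device_copyable() {
    return std::is_trivially_copyable<T>::value &&
           !std::is_polymorphic<T>::value;
}

static_assert(looks_device_copyable<Vec3>(),
              "plain structs pass the check");
static_assert(!looks_device_copyable<Shape>(),
              "classes with vtables are rejected");
```

A check like this could guard a kernel-launch wrapper so that a class with virtual functions is caught at compile time instead of crashing on the GPU.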